CATEGORY:
Wandering Towards a Goal Essay Contest (2016-2017)
TOPIC:
Intention is Physical by Natesh Ganesh
Author Natesh Ganesh wrote on Feb. 21, 2017 @ 17:18 GMT
Essay Abstract: In this essay, I will present the fundamental relationship between energy dissipation and learning dynamics in physical systems. I will use this relationship to explain how intention is physical, and present recent results from non-equilibrium thermodynamics to unify individual learning with dissipation-driven adaptation. I will conclude the essay by establishing the connection between the ideas presented here and the critical brain hypothesis, and its implications for cognition.
Author Bio: Natesh Ganesh is a PhD student in the Electrical and Computer Engineering Department at UMass Amherst. His research interests include fundamental limits of energy efficiency in computing systems, neuromorphic computing architectures, machine learning algorithms, a physical basis for learning, coherent definitions for information and consciousness, and the philosophy of mind.
Member George F. R. Ellis wrote on Feb. 22, 2017 @ 07:10 GMT
Dear Natesh,
This is an impressive piece of work.
I have not had time to try to work through the details, but for me two things are important. First, you have set a context of predictive estimation. That is a key issue, and I agree wholeheartedly. So there has to be a structure that underlies the existence of this function, and the key issue is where this structure came from. That cannot be via non-equilibrium thermodynamics alone.
Second, you say agency is the capacity of a system/entity/agent/organism to act on its environment. Is the Moon an agent in that respect? (After all, it causes tides on the Earth.) "We will define sense of agency (SA) as the pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world." That is already assuming key elements of psychology that do not arise in any simple way from physics.
I will try to reflect more on what you have written in due course. Your principle may well be important at the physical level, when the rest of the context is given.
Best wishes
George
Author Natesh Ganesh replied on Feb. 22, 2017 @ 14:43 GMT
Professor Ellis,
Thank you for your comments. Since we seem to be dealing with two different conversation threads with different points raised in both, I will try and keep my answers separate to avoid confusion as much as possible.
"Second, you say Agency is the capacity of a system/entity/agent/ organism to act on it’s environment. Is is the Moon an agent n that respect? (after all it causes tides on the Earth)."
--> Correct me if I am wrong, but you seem to subscribe to a definition of agency rooted in psychology? The agency I am talking of (taking the definition from philosophy) is simply the capacity to act. Acting involuntarily, unconsciously, or consciously with a purpose all fall under it. The Moon's ability to act on the Earth's waters makes it an agent, but that does not make it a purposeful one. It is very possible to think of physical systems that have no agency: they can change their state under the influence of external systems but do not have the ability to 'act' on and affect their environment. It is also possible to have systems that have 'agency' as I define it but no purpose or intent behind that agency. I limit purpose and intent only to the small set of systems that fall under my hypothesis. And a sense of agency as defined in the essay need not be present in all agents; I have it there to show that a minimally dissipative system with a hierarchical structure can have something like that because of the physics of the structure alone.
"So there has to be a structure that underlies the existence of this function, and the kay issue is where this structure came from. That cannot be via non-equlibrium thermodynamics alone."
--> For the emergence of the structures themselves, I used England's dissipation-driven adaptation. Your statement that structures cannot come from non-equilibrium thermodynamics alone is an assumption, I would argue. While the biological explanation via selection mechanisms is more explanatory and necessary, it is possible that these selection mechanisms are themselves manifestations of deeper thermodynamic principles, which is what England argues for in his papers. While some aspects of his hypothesis seem to have been misunderstood and his results are specialized, I used it to address anyone who would immediately dismiss the essay on the grounds that minimal dissipation and dissipation-driven adaptation are obviously contradictory (something I struggled with for a while, and which some of my colleagues pointed out). Section 4 was dedicated to clarifying the assumptions in England's hypothesis and how it complements my derivations (two sides of the same coin).
Thanks for a delightful exchange. I am enjoying myself!!
Natesh
Lee Bloomquist wrote on Feb. 22, 2017 @ 20:28 GMT
Natesh Ganesh asks, "Can minimal dissipation alone be a sufficient condition for learning?"
Two biological systems might fit this description.
First, Chapter 11 of Charles Gallistel's book The Organization of Learning tells of foraging experiments. Consider fish (are neurons like fish?). Given probabilistic feeding stations, the school divides itself proportionally, which from the perspective of game theory looks like a Nash equilibrium: no single fish can increase its payoff by leaving one station to feed at another. In this situation, no food energy is "dissipated" because every scrap is eaten. So according to the hypothesis, learning must be involved. Indeed, if this were a probability-learning experiment with just one fish and food energy given per play of the game, we would see probability learning (also as in Chapter 11). In probability learning, food is "dissipated." Perhaps because of fear of leaving one group and joining another, the visiting behavior seen in probability learning is attenuated when the school forages as above; and then no food energy is dissipated.
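Here is a toy simulation of that proportional split (my own sketch with made-up feeding rates, not data from Gallistel): each step, one fish moves from the station with the lower per-capita intake to the one with the higher, until no fish can do better by moving.
```python
p = [0.7, 0.3]    # assumed relative feeding rates of two stations
n = [50.0, 0.0]   # all 50 fish start at station 0
for _ in range(1000):
    # per-capita intake at each station; an empty station looks ideal
    rate = [p[k] / n[k] if n[k] else float('inf') for k in range(2)]
    hi, lo = (0, 1) if rate[0] > rate[1] else (1, 0)
    if n[lo] >= 1 and rate[hi] > rate[lo]:
        n[lo] -= 1   # one fish leaves the poorer station...
        n[hi] += 1   # ...and joins the better one
print(n)  # settles at [35, 15], a 0.7 : 0.3 split; no fish gains by moving
```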
Second, Goranson and Cardier's A two-sorted logic for structurally modeling systems contains a review of apoptosis (programmed cell death, which according to the paper happens in the human body about a million times a second). In the olfactory system, apoptosis of neurons amounts to learning new smells, which involves consciousness.
In Barwise and Seligman's Information Flow: The Logic of Distributed Systems, the example of a flashlight is given. The system (a flashlight in this case) supports different "local logics" and therefore different languages, between which there may be perfect or imperfect translations. The language in this essay could be one such language. But there are others, as the example of the flashlight shows (including the specification language that specifies the purpose of the flashlight). Goranson describes software by which many different languages, local logics, or situations like this can be organized.
Author Natesh Ganesh replied on Feb. 23, 2017 @ 07:33 GMT
Dear Bloomquist,
Thanks for your comments and very interesting links. I will have to look through them in detail as soon as I can. I am hoping we can find many more examples of biological systems that satisfy the idea presented in this essay. Here are some of my thoughts on your comments:
I have not thought in detail about the behavior of a larger system comprising many small minimally dissipative parts, but if I have to venture a guess, I would think some sort of cooperative behavior would emerge. Also, with respect to the fish example: the fish is a system that can act on its environment, and thus its behavior under this idea is a tradeoff between 'exploration vs. exploitation', and would not just be the form of predictive learning that we would see in a system that cannot act on its environment.
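To illustrate the kind of tradeoff I mean, here is a toy epsilon-greedy sketch (my own illustration, not the model in the essay): a system that can act spends a small fraction of its steps exploring its options instead of exploiting its current best prediction.
```python
import random

random.seed(0)
p = [0.3, 0.7]                 # hidden reward probabilities of two actions
est, n = [0.0, 0.0], [0, 0]    # running reward estimates and action counts
eps = 0.1                      # fraction of steps spent exploring
for t in range(5000):
    if random.random() < eps:
        a = random.randrange(2)       # explore: try a random action
    else:
        a = est.index(max(est))       # exploit: act on the best prediction
    r = 1.0 if random.random() < p[a] else 0.0
    n[a] += 1
    est[a] += (r - est[a]) / n[a]     # incremental mean update
print([round(e, 2) for e in est])     # estimates approach [0.3, 0.7]
```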
While there may be many languages, some providing a more detailed and useful definition of purpose, the language in this paper would explain the emergence of what those other languages describe.
Thanks again for your comments.
Natesh
Lee Bloomquist replied on Feb. 24, 2017 @ 12:50 GMT
Natesh, thank you for your reply! You wrote:
"Also with respect to the fish example, the fish is a system that can act on its environment and thus its behavior is a tradeoff between 'exploration vs exploitation' under this idea and would not just be a form of predictive learning that we would see in a system that cannot act on its environment."
Is what you wrote, above, about probability-learning foraging fish implied by the definitions of terms in your hypothesis? That is, do the definitions of the terms in your hypothesis (like "open," "constraints," etc.) imply what you have written, above?
Your hypothesis: "Open physical systems with constraints on their finite complexity, that dissipate minimally when driven by external fields, will necessarily exhibit learning and inference dynamics."
Or, is more than this required to understand your hypothesis— more than just the above statement of your hypothesis together with definitions of the terms used in your hypothesis?
Author Natesh Ganesh replied on Feb. 28, 2017 @ 15:31 GMT
Hi Lee,
Apologies for the late reply. I have been away at a conference, talking about some of the ideas here and how to relate them to computing.
"Or, is more than this required to understand your hypothesis— more than just the above statement of your hypothesis together with definitions of the terms used in your hypothesis?"
--> All that is needed to understand my hypothesis is that statement. I have provided as many definitions as I could in the essay, but due to space limitations I have had to reference some of the other definitions in earlier papers.
"Is what you wrote, above, about probability-learning foraging fish implied by the definitions of terms in your hypothesis? That is, do the definitions of the terms in your hypothesis (like "open," "constraints," etc.) imply what you have written, above?"
--> Yes. It does.
Joseph Murphy Brisendine wrote on Feb. 22, 2017 @ 22:20 GMT
Hi Natesh,
I think this essay is fantastic and basically completely correct. I love that you took the time to make explicit connections between the Landauer limit (which many biochemical processes have been shown to asymptotically approach) and the importance of a predictor circuit and feedback between sensing and acting, and you even bring in the fluctuation theorems at the end in discussing the problem of assigning probabilities to brain states. I think it's wonderful and very informed with regard to current research in stat mech, neuroscience, and machine learning. You have the diversity of background required to address this question, which sits at the intersection of so many fields.
I hope you might take the time to peruse my submission, entitled "A sign without meaning." I took a very different approach and went with an equation-free text in the hope of being as accessible as possible, but I think you'll find that we agree on a great number of issues, and I'm glad that the question is being addressed from multiple perspectives but with the right foundation in statistical mechanics.
Best of luck to you in the competition; I think you wrote a hell of an essay!
--Joe Brisendine
Author Natesh Ganesh replied on Feb. 23, 2017 @ 08:06 GMT
Dear Joe,
Thank you for your very kind and encouraging comments; they inspire me to work harder. I am glad that I managed to communicate the ideas in the essay coherently to you. Yes, the topic of this essay sits at a very unique intersection of many different fields. I wish I weren't right at the word limit and had more room to discuss a number of other things. There is a much-needed discussion of semantics, consciousness, and the implications of the ideas presented here for the philosophy of mind that I would have loved to delve into.
The title of your essay is very intriguing. I am caught up at a conference for the next two days but I will definitely read your essay in detail over the weekend and get back to you with questions/comments. I look forward to reading your thoughts on this problem. Thanks a lot again for your encouragement.
Natesh
James Lee Hoover wrote on Feb. 22, 2017 @ 23:18 GMT
Quite interesting, Natesh. The emergence of intention, purpose, goals, and learning is automatically achieved with England's restructuring and replication thesis as dissipation occurs -- but humanly done with purpose and goals? Your emphasis on computer modeling seems to blur the distinction between the human and the machine, but that is probably my failure for viewing it in only one quick read.
Impressive study.
Jim
Author Natesh Ganesh replied on Feb. 23, 2017 @ 07:54 GMT
Dear Jim,
I am glad you find it interesting. Yes, while England's ideas have been a big step forward in the right direction, there are some caveats in his hypothesis; I illustrate those points and present a way to unify individual learning and evolutionary processes under the same fluctuation theorems.
"but humanly done with purpose and goals?" I am sorry but I fail to understand your question. Can you help me out here?
I used finite state automata/machines because they are popular in computer engineering and, being a computer engineer, I am very familiar with them. Their popularity in computer engineering does not reduce their general applicability.
Thanks for your comments. Let me know if there are other things I can clarify if you get a chance to view it in detail.
Natesh
Jeff Yee wrote on Feb. 23, 2017 @ 00:50 GMT
Natesh - You did a good job on your essay and I like how you've been able to incorporate math, which was suggested in the essay rules. Well done! I gave you a good community rating which I hope helps to give your essay the visibility/rating it deserves.
Author Natesh Ganesh replied on Feb. 23, 2017 @ 07:58 GMT
Dear Jeff,
Thank you for your encouraging comments and kind rating. It gives me greater confidence to carry on and to work harder. Yes, it was tough, but after several edits I think I managed to find a good balance of math vs. no-math. And the language of math is always beautiful and adds so much to the discussion, wouldn't you agree? I am hoping more people will read this essay.
Natesh
Lee Bloomquist wrote on Feb. 23, 2017 @ 02:16 GMT
Natesh, I can't find an equation that defines "dissipation" in the essay. Is it in the references? I can find "the lower bound on dissipation in this system as it undergoes a state transition..." But that seems specific to finite state machines, which are not equivalent in power to Turing machines. Is it just "delta E", where E is energy lost from something like a thermodynamic engine?
Author Natesh Ganesh replied on Feb. 23, 2017 @ 07:46 GMT
Dear Bloomquist,
The dissipation by the system S into the bath B is captured by the \Delta E expression for the bath: the change in the average energy of the bath. Since only S can exchange energy with B, the increase in the energy of the environment is due to the dissipation by S during a state transition. A much more detailed treatment of the methodology I use is available in the reference cited in the submission.
Addressing your comment about finite state automata (FSA) vs. Turing machines: you are right that Turing machines have greater power, but the Markov FSA as I have defined it in the essay is still very general and allows for a wide range of scenarios. Furthermore, I have heard arguments that biological organisms need not be Turing machines capable of computing all computable functions. Having said that, I recognize that the model prescribed here can be vastly improved. I am currently working on something a little more general than the FSA prescribed here (still not a Turing machine) that will still allow for some insightful takeaways.
I want to note that I substitute the dissipation bound for the actual dissipation, since for the biological processes we would be interested in, the bounds are good approximations of the actual dissipation. Furthermore, though the dissipation bound has a Landauer-like essence to it, it is more rigorously derived and overcomes some of the objections that critics have raised against the hand-wavy calculations in Landauer's original paper. You can view it as energy lost by a very specialized type of thermodynamic engine.
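As a rough numerical illustration (my own toy example, not the derivation in the cited references), the Landauer-like part of such a bound follows directly from the information irreversibly lost in a state transition:
```python
from math import log, log2

kB, T = 1.380649e-23, 300.0           # Boltzmann constant (J/K), bath temperature (K)

def H(p):                             # Shannon entropy in bits
    return -sum(x * log2(x) for x in p if x > 0)

p_before = [0.25, 0.25, 0.25, 0.25]   # four equiprobable machine states
p_after = [0.75, 0.25]                # a transition that merges three of them
bits_lost = H(p_before) - H(p_after)  # ~1.19 bits irreversibly lost
print(bits_lost * kB * T * log(2))    # lower bound on dissipation: ~3.4e-21 J
```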
Thank you again for the comments. Please let me know if I missed addressing anything else you brought up.
Natesh
Anonymous wrote on Feb. 23, 2017 @ 13:29 GMT
Thank you for your patience with me, Natesh. I see you are at a nanotechnology lab, so I imagine that "I squared R" heating dissipates a lot of heat that is of concern. How much of the dissipation in your hypothesis is from I^2 R?
From my work experience in software, implementing an FSM as an array with two indexes (one for current state, one for current input signal), storing at those indexes the next state and output signal, is much quicker and easier to debug than leaving the FSM in a lot of "if-then" statements; i.e., using data instead of code increases the speed of execution and reduces debugging significantly. Then, if the rest of the code is needed for the full Turing machine, most of the speed loss and complexity in my coding experience would come from the full Turing machine, not the FSM. That's why I ask. And, in your line of work, it seems to me that the von Neumann architecture would be more relevant for I^2 R dissipation than more abstract models of computing like coalgebras, streams, or Chu spaces; for example, as in Samson Abramsky's Big Toy Models.
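For example, a minimal table-driven FSM of the kind I mean (my own sketch, here a parity machine over bits):
```python
# Rows are indexed by current state, columns by input symbol;
# each cell holds (next_state, output). State 0 = even parity so far, 1 = odd.
TABLE = [
    [(0, 'even'), (1, 'odd')],   # transitions from state 0 on inputs 0, 1
    [(1, 'odd'), (0, 'even')],   # transitions from state 1 on inputs 0, 1
]

def run(bits, state=0):
    out = 'even'
    for b in bits:
        state, out = TABLE[state][b]   # one table lookup, no if-then chains
    return out

print(run([1, 0, 1, 1]))   # 'odd': three 1s seen
```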
In engineering terms, apoptosis is a quality control in animals (it never occurs in plants) that seems like a way to minimize energy loss, and it therefore may be relevant to your dissipation hypothesis. In apoptosis, defective cells that would be energy-inefficient are destroyed and their components re-used to build other cells, thus holding onto the potential energy in the sub-assemblies and again reducing energy loss. But I don't see how I^2 R heat loss plays a role in apoptosis.
If you had an equation for dissipation rather than a verbal explanation, it might give me something more general than I^2 R to think about, especially regarding apoptosis as an example of minimizing the dissipation of energy. What are your thoughts?
Regarding your references: I will try to find them online. Are they on arXiv? The closest big universities where I could photocopy these papers are hours away from me.
Lee Bloomquist replied on Feb. 23, 2017 @ 13:31 GMT
Sorry, the previous post was me. Don't know how that happened.
Lee Bloomquist
Author Natesh Ganesh replied on Feb. 23, 2017 @ 16:20 GMT
Hi Lee,
I am happy to answer all questions. I want to point out again that I use the dissipation bound as a (good) approximation of the actual dissipation in the processes we are interested in. The dissipation bound is expressed in terms of entropy (\Delta S) and mutual information (I) terms. The bound is fundamental, relates to the dissipation associated with irreversible information loss, and is implementation independent. When you talk about the I^2 R expression for dissipation, you are thinking about wires and charges moving through those wires, so that expression will not hold for spin-based architectures. However, irrespective of implementation, if there is information loss, then there will be fundamental dissipation associated with it, characterized by entropy and information terms. Hence their presence in the expression.
For a long time, these bounds from fundamental law (informally called Landauer bounds) were many orders of magnitude lower than the I^2 R and leakage dissipation, and were not significant. But decades later, thanks to Moore's law, we will hit such Landauer limits in a decade or so. At the most fundamental level, we define a broad physical implementation of a finite state machine that is architecture and technology (e.g., CMOS) independent, and the bound in the paper is of that nature, depending only on the definition being achieved physically. We can always add the dissipation of the architecture and circuit on top of this.
Unfortunately, the paper in the references that would make this much clearer is not on arXiv. If you can access them on a college campus somewhere, here are some other papers that will make what I am talking about a lot clearer, especially the second one:
Anderson, Neal G. "On the physical implementation of logical transformations: Generalized L-machines." Theoretical Computer Science 411.48 (2010): 4179-4199.
Anderson, Neal G., Ilke Ercan, and Natesh Ganesh. "Toward nanoprocessor thermodynamics." IEEE Transactions on Nanotechnology 12.6 (2013): 902-909.
Thanks.
Natesh
Lee Bloomquist wrote on Feb. 23, 2017 @ 22:14 GMT
Natesh. All I can get on the second paper is the abstract and a bit of the intro. The rest is behind a paywall.
You wrote, "When you talk about the I^2 R expression for dissipation, you are thinking about wires and charges moving through those wires, so that expression will not hold for spin-based architectures." It's Joule heating, as in this article:
http://poplab.stanford.edu/pdfs/Grosse-GrapheneContactSJEM-nnano11.pdf
And, you wrote about "entropy":
"I want to point out again that I use the dissipation bound as a (good) approximation of the actual dissipation in the processes that we are interested in. The dissipation bound expression as entropy \delta S and mutual information terms I."
But the abstract talks about "energy dissipation":
****
Abstract:
A hierarchical methodology for the determination of fundamental lower bounds on energy dissipation in nanoprocessors is described. The methodology aims to bridge computational description of nanoprocessors at the instruction-set-architecture level to their physical description at the level of dynamical laws and entropic inequalities. The ultimate objective is hierarchical sets of energy dissipation bounds for nanoprocessors that have the character and predictive force of thermodynamic laws and can be used to understand and evaluate the ultimate performance limits and resource requirements of future nanocomputing systems. The methodology is applied to a simple processor to demonstrate instruction- and architecture-level dissipation analyses.
…I. Introduction
Heat dissipation threatens to limit performance gains achievable from post-CMOS nanocomputing technologies, regardless of future success in nanofabrication. Simple analyses suggest that the component of dissipation resulting solely from logical irreversibility, inherent in most computing paradigms, may be sufficient to challenge heat removal capabilities at the circuit densities and computational throughputs that will be required to supersede ultimate CMOS. (For 10^10 devices/cm^2, each switching at 10^13 s^-1 and dissipating at the Landauer limit E_min ≈ k_B T, we have P_diss = 414 W/cm^2 at T = 300 K.) Comprehensive lower bounds on this fundamental component of heat dissipation, if obtainable for specified nanocomputer implementations in concrete nanocomputing paradigms, will thus be useful for determination of the ultimate performance capabilities of nanocomputing systems under various assumptions regarding circuit density and heat removal capabilities.
****
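As a quick sanity check of the quoted P_diss figure (my own arithmetic):
```python
kB, T = 1.380649e-23, 300.0    # Boltzmann constant (J/K), temperature (K)
density = 1e10                 # devices per cm^2, as quoted
rate = 1e13                    # switching events per device per second
E_min = kB * T                 # the quoted Landauer-limit estimate per event
print(density * rate * E_min)  # ~414 W/cm^2, matching the abstract
```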
Are you saying the heat dissipation in post-CMOS devices is just due to spin, and none of that heat is due to Joule heating?
Best Regards,
Lee
sridattadev kancharla wrote on Feb. 25, 2017 @ 13:10 GMT
Dear Ganesh,
I wish you all the best with your in-depth analysis of how intentions govern reality. I welcome you to read my essay, "there are no goals as such", in which I propose that consciousness is the fundamental basis of existence and that intent is the only true content of reality, and that we can quantify consciousness using the Riemann sphere and achieve artificial consciousness, as per the article "Representation of qdits on Riemann Sphere". I saw that you also arrive at the study of consciousness in physical systems in the conclusion of your essay. Also, please see all the diagrams I have attached in my essay.
Love,
I.
Author Natesh Ganesh replied on Feb. 28, 2017 @ 16:06 GMT
Hi,
Thanks for reading my essay and for your comments. I have read your work and will politely disagree with you on your premise and conclusion, though. I do think that starting from consciousness as a fundamental basis of existence is not the right approach; I would argue that consciousness is an emergent property of the input mappings that occur in certain systems due to thermodynamic constraints. I think we will just have to agree to disagree. Also, you should check out the integrated information theory of consciousness by Tononi and Koch; I think you will enjoy their work. Please rate my work if you enjoyed reading it. Thanks, and good luck in the contest.
Cheers
Natesh
Satyavarapu Naga Parameswara Gupta wrote on Feb. 26, 2017 @ 01:13 GMT
Dear Ganesh,
Thank you for the good essay on "Intention is Physical".
Your observations are excellent, like "…fundamental limits on energy efficiency in new computing paradigms using physical information theory as part of my dissertation".
I have some questions here: do you mean to say that for every external input, our brain predicts and takes a correction? It won't be acting directly by its own self….
Probably if we make an energy-efficient computer, it will become super-intelligent; probably we may require some software also….
Even though my essay (Distances, Locations, Ages and Reproduction of Galaxies in our Dynamic Universe) is not related to brain functions, it is on COSMOLOGY….
With axioms like… No Isotropy; No Homogeneity; No Space-time continuum; Non-uniform density of matter (Universe is lumpy); No singularities; No collisions between bodies; No Black holes; No worm holes; No Big Bang; No repulsion between distant Galaxies; Non-empty Universe; No imaginary or negative time axis; No imaginary X, Y, Z axes; No differential and integral equations mathematically; No General Relativity and the Model does not reduce to General Relativity under any condition; No creation of matter as in Big Bang or steady-state models; No many mini Big Bangs; No Missing Mass; No Dark Matter; No Dark Energy; No Big Bang-generated CMB detected; No Multiverses etc.
Dynamic Universe Model gave many results otherwise difficult to explain
Have a look at my essay on Dynamic Universe Model and its blog also…
http://vaksdynamicuniversemodel.blogspot.in/
Best wishes…………….
=snp. gupta
Author Natesh Ganesh replied on Feb. 28, 2017 @ 15:39 GMT
Hi,
Thanks for your comments. Here are my answers to some of the questions you have brought up.
"I have some questions here, you mean to say for every external input, our Brain predicts and takes a correction. It wont be acting on directly by its own self…."
--> If you think of the brain as a hierarchical predictive machine, everything you experience is a result of the continuous prediction-error-correction mechanism it is executing. Using terms like "self" is a little loaded and misleading: the sense of self is the result of a physical system like the one I have described in the essay simply evolving under physical law.
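A toy sketch of that prediction-error-correction loop (my own one-level illustration, not the hierarchical model in the essay):
```python
import random

random.seed(0)
mu = 0.0                             # the machine's current prediction of its input
for t in range(2000):
    x = 1.5 + random.gauss(0, 0.2)   # noisy external input
    error = x - mu                   # prediction error
    mu += 0.05 * error               # correct the prediction, not the world
print(round(mu, 2))                  # settles near the input's true mean, 1.5
```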
"Probably if we make energy efficient computer, it will become super intelligent, Probably we may require some software also…."
--> Yes, there is a very new idea in computing called 'thermodynamic computing'. There is very little work right now, but it is gaining momentum. The point is that there is no fixed algorithm or software: the hardware is set to evolve under larger thermodynamic constraints (as stated in the minimal dissipation hypothesis) across a large set of different environments, and it will learn.
I will take a look at your essay, but I am an engineer by training and my knowledge of cosmology is very limited, so please forgive me if I can't fully grasp the ideas you express in it. Good luck in the essay contest, and please rate the essay if you enjoyed it.
Cheers
Natesh
Willy K wrote on Feb. 28, 2017 @ 09:37 GMT
Your work is highly mathematical. Although my math skills are not good enough to understand the details, I think it is a tremendous work. I would love to see your work on consciousness, qualia and meaning, which you hinted at in the essay, primarily because I had considered those areas well-nigh impervious to mathematics. But I am pretty sure you will talk even of those areas using math. Rather impressive, I must say.
I had wanted to evaluate my work on whether it satisfied the mathematical theorems of Ashby and Conant. I am speaking here of the Law of Requisite Variety (Ashby) and the Good Regulator Theorem (Conant). But I could not because my math skills are not good enough for the job. My guess is you would want to evaluate your work as well on its alignment with the basics of those works. Ashby’s work is considered a classic in understanding the functioning of all systems, but on glancing through his works, it is clear that he did his work primarily with the human brain in mind.
Author Natesh Ganesh replied on Feb. 28, 2017 @ 15:48 GMT
Hi Willy,
Thank you for the kind comments. Yes, I did not have space to treat questions like qualia and meaning in detail, but I intend to in the near future. I am an engineer, and my immediate focus is to leverage the ideas presented here in actual computing systems. The math was a necessary evil, and I will work on making the explanations a lot clearer going forward. I find the ability to address such topics with math very exciting.
I like that you brought up both Ashby and Conant, two people whose work has been very influential on me. The minimal dissipation hypothesis is connected to the Good Regulator Theorem in a straightforward manner, and that is something I have thought about and worked on. I will have to think a little more about rigorously showing the connection to the Law of Requisite Variety, but my initial feeling is that it is not impossible.
Thanks again for your comments. Please rate if you enjoyed the essay. It looks like I could use the help.
Cheers
Natesh
Peter Jackson wrote on Mar. 1, 2017 @ 11:36 GMT
Natesh,
Great essay, well informed and organised analysis of a very interesting hypothesis.
I also like most of it and agree with much, certainly with the adage that 'our aims and goals are shaped by our history', and with the importance of efficiency, neuronal avalanches, branching parameters and critical regions, and that a 'hierarchical predictive coding' model is possible.
I'm not yet convinced that minimal dissipation itself is a precondition of learning and 'inference dynamics', and I didn't feel you proved that. Do you not think it might be more a ubiquitous characteristic than a driver, in a similar way to 'sum over paths'? If three lifeguards head off to save a drowning girl 100 m down the beach (one heads straight for her, one to the shore point opposite her position, and one at the right angle to allow for swimming slower than running), of course the third gets there with the least energy; but I feel that underlying that may be a still greater and more useful truth and meaning.
Also, might the error/feedback mechanism be better described as iterative value-judgement comparisons? Perhaps there is no 'right or wrong', just different outcomes, which we can't value until compared with previous runs.
Say the first 'run' of consequences is imaginative, drawn from input history. Say "Do I want a PhD, Y/N?" gives an 'AIM' (Y1). We run a scenario to imagine it and its implications. We then keep running it as more data comes in. Subsequent lower-level/consequential Y/N decisions are the same, taking Y1 into the loop, and so on hierarchically. If it turns out not to be as envisaged, or we win millions and want to be a playboy instead, we change the aim to N and form new ones.
Might it be that you're a little too enamored of the formulae, and suggest conclusions from those rather than from the deeper meaning they're abstracted from?
i.e. might Lifeguard 3 have said 'how do I get there fastest?' rather than '...by using least energy'? (Or just done that due to its successful outcome in past re-runs of the scenario.)
Lastly, on 'agency', which I see as a semantic 'shroud': if all subsequent Y/N decisions in a cascade are simply consequential on the first Y/N decision, and the branches lead to a mechanical motor-neuron or chemical response, all repeated in iterative loops, might the concept of 'agency' not fade away with most of the mystery?
All 'leading edge' questions, I think; thank you for a brilliant essay leading to them. Top mark coming.
Best
Peter
Author Natesh Ganesh replied on Mar. 2, 2017 @ 18:10 GMT
Dear Peter,
Thank you for your kind comments. Much appreciated.
"I also like most and agree with much, certainly with the addage that; 'our aims and goals are shaped by our history', and the importance of; efficiency, neuronal avalanches, branching parameters, critical regions and that a 'hierarchical predictive coding' model is possible."
--> Agreed.
"I'm not yet convinced that minimal dissipation itself is a precondition of learning and 'inference dynamics' and didn't feel you proved that. Do you not think it might be more a ubiquitous characteristic than a driver - in a similar way to 'sum over paths'?"
--> I started by calling it the minimal dissipation hypothesis because of the optimization principles I was using to minimize the dissipation terms. This is an evolving idea, and after the feedback I have received, I am wondering if I should instead have talked about learning as a manifestation of dissipation-complexity tradeoffs and not just of minimizing dissipation, since it is a little confusing when stated that way. I will have to think about this in more detail and will get back to you with an answer.
"Also might the error/feedback mechanism be better described as iterative value judgement comparisons. Perhaps no 'right or wrong' just different outcomes, which we can't value till compared with previous runs."
--> I had not thought about it that way but it sounds very interesting. Let me mull over it.
"Lastly on 'Agency', which I see as a semantic 'shroud'. If all subsequent Y/N decisions in a cascade are simply consequential on the first Y/N decision, and the branches lead to a mechanical motor neuron or chemical response, all repeated in iterative loops, might the concept 'agency' not fade away with most of the mystery?"
--> To a certain extent, I think it does, but there is also a significant role for noise in affecting decisions, as well as for symmetry breaking at critical bifurcation points. I do think that 'agency' in the very traditional sense is an illusion, something I would have talked about in more detail if I had the space. I would recommend reading Alan Kadin's submission "No ghost in the machine", in which he goes into detail about some of these ideas about agency being an 'illusion' that I did not have the chance to. It is a very interesting read.
Thanks again for your comments. I will get back to you on some of these once I have thought about a detailed response.
Natesh
Ines Samengo wrote on Mar. 5, 2017 @ 01:29 GMT
Hi, Ganesh, I respond here to a question that you asked on my page:
"I will have to ponder over the idea of ascribing goals to any entropy reduction in a system. I am wondering if that is too narrow a definition. After all, a (conscious) observer should be capable of ascribing a system as performing a computation (and hence the goal of performing that computation,) even with no entropy change(?)"
If the computation is invertible, then the output is equal to the input, except for a change of names. I believe that computations are interesting only when they are non-invertible. But perhaps I am missing something…
I saw your essay as soon as it came out and was impressed, but did not follow all the details. Today I gave it a second look, and I am still impressed, above all because this strikes me as an original contribution, which I have found only very rarely in this forum. Moreover, within neural network theory I have had enough of gradient-descent learning rules that come out of the blue; your proposal is so much more physical. I confess I must still give it more proper thought (or perhaps find the time to do the calculations myself) because I intend to take these ideas very seriously. I hope you publish this work as a paper soon; this essay contest does not seem to be the best environment. The work is probably a bit too technical given the contest rules, the length is too constrained, and the audience can be better targeted. I hope that you will consider presenting these ideas to the computational neuroscience audience. They may not have your same physical-computational background, but they will surely be interested in the conceptual result.
Congratulations!
inés.
Author Natesh Ganesh replied on Mar. 5, 2017 @ 20:22 GMT
Hi Ines,
Thanks for your kind comments and encouragement. Yes, I have had issues with a wide variety of gradient-descent-based learning rules, which is why I wanted something more physically grounded. I am working on a more formal paper as we speak, where I will have the space to discuss the details. This is a continuously evolving idea, and after receiving some great feedback, I have realized I need to make some things clearer and provide better explanations for others. I am an engineer by training and I intend to leverage the ideas here to build something, but I do intend to present some of these results to the (computational) neuroscience field, especially the ones connected to the critical brain hypothesis.
I will reply to your comments here, since I get notifications when you reply on my page.
I agree that bijective, identity-like mappings which lose no information are not interesting; computation is characterized by the loss of information from input to output. Let me clarify what I was thinking about. Consider a physical system containing 4 orthogonal, distinguishable states A, B, C and D. The system evolves to achieve the identity function, and we are left with 4 orthogonal states and no entropy change. A (conscious) observer is capable of associating the logical state 0 with the final state A, and the logical state 1 with the final states B, C and D, and of claiming that this physical evolution represents an AND gate if the initial 4 states correspond to the inputs 00, 01, 10 and 11. I would refer to this as an unfaithful implementation of the abstract logical AND gate, but nonetheless the observer will claim that this physical evolution with zero entropy change has achieved the goal of being an AND gate. Hence, while I agree that there is a relationship between goals and entropy-reducing processes, I wonder whether, with the right observer, goals can be ascribed to a non-entropy-reducing process. In fact, I have questioned whether this ability to imbue goals to zero-entropy-change or entropy-increasing processes is a defining characteristic of conscious (and certainly intelligent) observers. After all, while we are able to perform input-output computing at will (without it needing to be entropy-reducing), our computers' outputs have computational value only because we as conscious (intelligent) observers interpret them as such. Please let me know if you have any thoughts for/against this.
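To make the comparison concrete, here is a toy entropy computation (my own sketch, assuming the four initial states are equiprobable):
```python
from math import log2

def H(p):                                 # Shannon entropy in bits
    return -sum(x * log2(x) for x in p if x > 0)

faithful = [0.25, 0.75]                   # four states merged into two
unfaithful = [0.25, 0.25, 0.25, 0.25]     # identity evolution, four final states

print(H([0.25] * 4) - H(faithful))        # ~1.19 bits: entropy-reducing
print(H([0.25] * 4) - H(unfaithful))      # 0.0 bits: the logical grouping
                                          # lives in the observer, not the system
```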
This idea of the computational faithfulness of a physical implementation of logical mappings is discussed in detail here, if you would like to know more:
Anderson, Neal G. "On the physical implementation of logical transformations: Generalized L-machines." Theoretical Computer Science 411.48 (2010): 4179-4199.
Neal is my PhD adviser and has been very influential on my thinking. He has recently been working on addressing the importance of observers in determining information as a physical quantity. This paper discusses that in detail, and I think you might like it:
Anderson, Neal G. "Information as a Physical Quantity." (2016)
His paper does not state what a conscious observer would be, but using the ideas presented in the essay, I have some initial thoughts on how to address that and define such observers in a physically grounded manner.
I am looking forward to hearing your thoughts. Thanks.
Cheers
Natesh
Ines Samengo replied on Mar. 5, 2017 @ 21:36 GMT
Hi, Ganesh, I am afraid I do not understand. "The observer will claim that this physical evolution with zero entropy change has achieved the goal of being an AND gate." Why do you say that the evolution has no entropy change, if the observer has made the association A -> 0 and B, C, D -> 1? This association is entropy-reducing, isn't it? I will wait for your reply before elaborating more.
Great to know you are on the way to publishing! Your essay is new raw material, so the natural evolution is: get it published. As a neuroscientist, I was more surprised by the learning part of your essay than by the criticality one, but mind you, I am not truly mainstream, so just take it as one opinion out of many. To me, the learning part is thought-provoking; I have the impression that new paradigms and new understanding may come out of it. The criticality claim seems to be everywhere, but I do not gain much from it, apart from classifying the process as critical. Anyway, surely I am missing something...
best!
ines.
Author Natesh Ganesh replied on Mar. 5, 2017 @ 22:34 GMT
Hi Ines,
Consider the evolution of a system with 4 initial distinguishable states A, B, C and D to 2 orthogonal states 0 and 1, with A evolving to 0 and B, C, D evolving to 1. There is clearly a reduction in the physical entropy of this system, and an observer with access to observe this evolution might decide to associate the AND operation with it. We will call this a faithful physical realization of the AND operation in a system.
Now consider the evolution of a system with 4 initial distinguishable states A, B, C and D to 4 orthogonal end states 0, 1, 2 and 3, with a one-to-one evolution. There is no reduction in the physical entropy of this system, and another observer with access might decide to associate the physical state 0 with the logical state '0' and the physical states 1, 2 and 3 with the logical state '1'. Such an observer will associate the AND operation with this evolution (this is the principle of reversible computing, where there is no minimum dissipation) and will not be wrong. The difference is that this is what we refer to as an unfaithful physical realization of the AND operation.
I was trying to point out that it is possible to associate interesting computation with a system evolution in which there is no change in the physical entropy of the system. There might be a reduction in the entropy of the observer, though I am not sure there has to be. Perhaps I am missing something or have misunderstood? Are you saying that the reduction in the entropy of the observer (and not necessarily the system) is enough to imbue the system with a goal? I was contesting the idea that entropy reduction in the system alone is enough to achieve that.
Yes, the equations that I have obtained are themselves well established in the Information Bottleneck method (used in clustering and machine learning); my main contribution is tying it all together in a physical sense. I pointed out the criticality part since the idea, though popular, is still debated, as there seemed to be no clear theoretical foundation for why the brain needs to be a critical system. Most criticality arguments are made from observing neuronal avalanches in EEGs and other experimental data, which can arguably be explained away without critical behavior, and calculations of expected branching parameters give values much lower than what is seen in the critical brain. Being able to view different cognitive states as phase transitions in input-signal mapping can allow us to bypass these past hurdles, I think. But I have to give it much more thought.
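For intuition on the branching parameter, here is a toy branching-process sketch (my own illustration, not a cortical model): each active unit triggers a Poisson-distributed number of others, with mean equal to the branching parameter sigma.
```python
import numpy as np

rng = np.random.default_rng(1)

def avalanche_size(sigma, cap=10000):
    active, total = 1, 1
    while active and total < cap:                     # propagate one avalanche
        active = int(rng.poisson(sigma, size=active).sum())
        total += active
    return total

for sigma in (0.8, 1.0, 1.2):    # subcritical, critical, supercritical
    sizes = [avalanche_size(sigma) for _ in range(500)]
    # mean size stays small below sigma = 1, becomes heavy-tailed near 1,
    # and runs away (hits the cap) above 1
    print(sigma, round(float(np.mean(sizes)), 1))
```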
And please call me Natesh. Ganesh is my father. We have a whole different first name, last name system.
Cheers
Natesh
Ines Samengo wrote on Mar. 6, 2017 @ 00:46 GMT
Hi Natesh,
sorry about the names! I am a slave of rhymes, and tend to map together all that sounds similar. Actually, it's even worse: I also cluster faces together. By all means, I must learn to represent information more injectively...
Yes, sure, as I see it, it may well happen in the brain of the observer. There are many possible settings, which I discuss below. But in all settings, ascribing agency (as I understand it) requires an entropy-reducing mapping. If the mapping is not injective, it may still be interesting or useful for the observer, and he or she may still make a valuable acquisition in their life by learning the mapping. But I can hardly relate this operation to perceiving a system that seeks to achieve a "goal". For me, a goal is something that tends to be reached irrespective of factors that tend to interfere with its accomplishment. That is why I require the non-injectivity: the putative obstacles are a collection of input conditions that are supposed to be overcome by the goal-seeking agent.
Now for the variable settings.
One option is: I am the observer, and I receive information about the behavior of some external system (say, a replicating DNA molecule outside me), and by defining the system under study in some specific way, I conclude that there is some goal-oriented behavior in the process.
Another option is: I am given some information which I represent in my head, and I perform a certain computation with that information, for example the AND gate you mention. The computation happens inside my head. But I also have observers inside my head that monitor what other parts of my head are doing. For one such observer, what the other end of the brain is doing acts as "external" (in all computations except for self-awareness, which is the last part of my essay). One such observer can assign agency (if it wants to!) to the computation, and conclude that the AND gate (inside my head!) "tends" to reduce the numbers 1, 2, 3 to the digit 0 (or whichever implementation we choose). A weird goal for an agent, but why not.
All I am claiming is: goal-directed behavior does not exist without an observer that tends to see things in a very particular way: as non-injective mappings. I am not claiming that this is the only thing that can be learned by a plastic system (one-to-one mappings can also be learned). I am not claiming that the only thing that can be done with a non-injective mapping is to arrogate it with agency and goals. There are many more things happening in the world that may be good to learn, as well as in the brain of the observer. And there are many more computations to do, other than arrogating agency. My only point is: if we see a goal (inside or outside us), then we have trained ourselves to interpret the system in the right manner for the goal to emerge. The goal is not intrinsic to nature, it is a way of being seen.
Not much, just one tiny point. Or perhaps, just a definition. If you look through the essays, there are as many definitions of "goal", "agent" and "intention" as authors...
> my main contribution is tying it all together in a physical sense
Yes, and that is precisely why it is so great! And perhaps you are right: within your framework, what has so far been presented as a mere description of the critical brain can now be seen as the natural consequence when certain physical conditions are assumed. I do appreciate that.
Anyhow, time to rest! Good night!
inés.
Ines Samengo replied on Mar. 6, 2017 @ 00:51 GMT
Sorry, one more point: when an observer learns a one-to-one mapping, arrogating agency does not make much sense, because there is no entropy loss in the mapping. The process of learning the mapping, though, can be arrogated with agency: the observer tends to learn. But here there is a meta-observer observing the first observer, right? It is the learning observer that may be arrogated with agency, not the lower-level system under study. And now, yes, I go. Sorry for the long speeches!
Author Natesh Ganesh replied on Mar. 7, 2017 @ 04:27 GMT
Hi Ines,
Thanks for the long, detailed response. You have given me much to think about and have definitely introduced me to a new point of view.
"Yes, sure, as I see it, it may well happen in the brain of the observer. There are many possible settings, which I discuss below. But in all settings, ascribing agency (as I understand it) requires an entropy-reducing mapping. If the mapping is not injective, it may still be interesting or useful for the observer, and he or she may still make a valuable acquisition in their life by learning the mapping. But I can hardly relate this operation with perceiving a system that seeks to achieve a "goal". For me, a goal is something that tends to be reached irrespective of factors that tend do interfere with its accomplishment. That is why I require the non-injectivity: the putative obstacles are a collection of input conditions, that are supposed to be overcome by the goal-seeking agent."
-->Agreed.
"All I am claiming is: goal-directed behavior does not exist without an observer that tends to see things in a very particular way: as non-injective mappings."
-->I agree again on that. I think I have described such observer systems in this essay that tend to associate goals with their sensory inputs.
"The goal is not intrinsic to nature, it is a way of being seen."
--> Yes, you are completely right about that. Goals are purely subjective.
"when an observer learns a one-to-one mapping, arrogating agency has not much sense, because there is no entropy loss in the mapping. The process of learning the mapping, though, can be arrogated with agency: the observer tends to learn. But here there is a meta-observer observing the first observer, right? It is the learning observer that may be arrogated with agency, not the lower-level system under study."
--> I see your point but I need more time to think about this.
Thanks again for a delightful exchange. Please let me know if you have other questions/comments related to the idea in general. I would be happy to answer.
Cheers
Natesh
George Kirakosyan wrote on Mar. 22, 2017 @ 08:48 GMT
Hi Dear Ganesh,
I have read your article (as we usually say!) and I will simply tell you that I am somewhat skeptical about its possible success. I see that your approach is presented with a good logical flow, but I am skeptical since it is based on hypotheses. Maybe you are very right, but who can say this certainly and definitely today? Thus, your essay seems to me to be interesting ideas represented in nice form and with impressive narration. I hope you can understand my point (and maybe come to agree with me somewhat!) if you find time to check my work. Then we can continue the talk on my page, if you see that we have somewhat common views.
Best regards
Author Natesh Ganesh replied on Mar. 23, 2017 @ 22:40 GMT
Hi George,
Thank you for your comments. Skepticism is good and an important quality in a good scientist; I welcome it and your criticism, for they will help me grow as a researcher. As a PhD student with deadlines I am a little busy, but I will have a chance to read your work slowly and in detail over the weekend. I shall get back to you once I have understood what you have to say in your submission. Thanks.
Natesh
Shaikh Raisuddin wrote on Mar. 23, 2017 @ 16:19 GMT
Natesh Ganesh,
How does matter learn?
Can we say a cyclone is a goal-directed system?
There is a periodicity of want-and-hunt/intentions in every living being; how can that be designed?
Author Natesh Ganesh replied on Mar. 23, 2017 @ 22:51 GMT
Hi Shaikh,
Thank you for your comments.
"How matter learns?"
--> This is what I address in section 2 of my submission. I think that minimally dissipative systems necessarily learn in an unsupervised manner. There are hints in the derivations on how reinforcement and supervised learning can be covered as well.
"Can we say a cyclone is a goal-directed system?"
--> Good question! We have to differentiate between us ascribing a goal to a cyclone and the cyclone having a goal for itself, with a sense of agency. That is an important distinction. We can of course project a goal from us onto a cyclone, but unless a cyclone is a minimally dissipative system (having a hierarchical implementation), which it isn't, the cyclone does not have a sense of its own agency, goal-oriented or not. I would recommend reading Dan Bruiger's submission here, which makes this distinction between teleology and teleonomy very clearly.
"There is periodicity of want-and-hunt/intentions in every living being, how that can be designed?"
--> I do not know the answer to that yet, i.e. how to build systems like the ones I describe in my submission. I am not even sure if 'design' is the right way to go about it. All the systems that I refer to are self-organized. Perhaps we should look to creating the conditions/constraints for such systems to emerge and let physical law do its thing. This is how I imagine we would achieve the new way of computing, called 'thermodynamic computing', that is being theorized about.
Hope I have satisfactorily answered your questions.
Cheers
Natesh
Edwin Eugene Klingman wrote on Mar. 24, 2017 @ 05:06 GMT
Dear Natesh,
I very much enjoyed reading your most impressive essay. Since you have read mine and commented, I will look at possible correlations, based on the assumption that one of us is actually representative of reality. In fact, even if my essay is correct about the universal nature of awareness, your model may well 'simulate' awareness and may describe realistic governing constraints on the dynamics of learning. For example, you model a sense of agency as "the awareness of an action being performed as it is being performed." This is compatible with your definition of the sense of agency as "pre-reflective subjective awareness that one is initiating, executing, and controlling one's own volitional actions in the world." The key word is of course 'subjective', and that is the great question underlying whether or not the singularity is possible.
Let me first say that the qualms I have about quantum mechanics are based on common interpretations of physical reality. I have no problem at all with your usage of QM in your essay. It is interesting, however, that Crooks' fluctuation theorem of non-equilibrium thermodynamics is essentially a classical, not a quantum, analysis. Part of this, I believe, is that work is not an observable in quantum mechanics, and the relevant work averages are given by time-ordered correlation functions of the exponentiated Hamiltonian rather than by expectation values of an operator representing the work as a pretended observable [Talkner, Lutz, and Hanggi]. I'm not familiar enough with England's approach, but from what you present of it, it appears to be essentially classical.
Although I did not elaborate in my essay, I have noted in response to questions on my page that the field as I hypothesize it senses (and affects) momentum density, and this is very relevant. One could say to me: "You claim that the consciousness field interacts with ions in axons and vesicles flowing across synaptic gaps. Why then would not the same field interact with electrons flowing in circuitry, since the momentum density of an electron is greater than that of an ion or a vesicle?"
An excellent question. Part of the answer is that the charge-to-mass ratio of ions and vesicles makes them less susceptible to EM fields. But the key answer is that momentum density flow in the brain (and even the blood) is in 3-D and the consciousness field exists in 3-D, and our subjective awareness of 3-D is very strongly linked to these facts. Current circuitry (see my paper FPGA Programming: step-by-step) is 2-D, and even the 2-D arrangements of circuits are designed to optimize timing. There is no spatial aspect to computer circuitry of the sort we find in the brain.
If (and it's a big if) we ever reach the point where circuitry (say a nanotube network) could span a 3-D volume (with suitable I/O: see FPGA Design from the Outside In), then I would think it might be possible that a 'super brain' could be built, but this is contingent on the existence of the consciousness field as the seat of awareness! Doing without the field and without 3-D (as opposed to computations of 3-D) is one heck of a task.
In addition to the work I've done on pattern recognition and learning (hinted at in my endnotes), I also covered Steven Grossberg's mathematical model of neural circuits [The Automatic Theory of Physics (my ref. 5)]. I hope you are so close to finishing your PhD that you have no use for any of this information, but, given your familiarity with my microprocessor systems design, you would at least find the info readable, and perhaps even a source of ideas. I hope this discussion stimulates useful thoughts for you.
I would be very surprised if your essay does not win one of the prizes. It is an exceptional essay, and I wish you well in this field.
My very best regards,
Edwin Eugene Klingman
Author Natesh Ganesh replied on Mar. 26, 2017 @ 19:18 GMT
Dear Edwin,
Sorry for the delayed response. I had a conference paper deadline and finally have some time to myself. Thank you for your detailed response and encouraging comments. It fills me with greater confidence to keep working harder at a solution.
I really enjoyed your discussion on the effect of the consciousness field on electrons vs ions. It is an interesting point you make. Some colleagues of mine in the department are working on ion-based memristor devices, which might actually serve as a better substrate for interacting with a consciousness field than an electronic device. Furthermore, I completely agree with you on the concept of a 3D structure, rather than the traditional 2D architecture. I too am convinced that any system capable of a comparable consciousness should have some kind of 3D structure. Interestingly, I am in discussion with them about possibly constructing a 3D array of sorts with these ionic memristors, with the type of constraints that I talk about in my essay (if we figure out how to impose them), and just letting it run in an input environment to see what it does. Should be very interesting, I think.
I am about 6-7 months from finishing and in full writing mode, but I will definitely take a look at the resources you mentioned (especially the ones on pattern recognition). One can never learn enough and I am sure they will provide some new insight for me. Thanks.
Natesh
PS: I finally got around to rating your essay. I would appreciate it if you rate mine, if you haven't already. If you have, thank you very much!
Edwin Eugene Klingman replied on Mar. 26, 2017 @ 19:44 GMT
Dear Natesh,
I have now rated you (10). Past experience has indicated that there may be turbulence in the final hours, so I had planned to hold off to help you then, but perhaps increased visibility will help now. Some earlier essays that I pushed up for visibility were immediately given '1's by whatever trolls lurk in low places.
The final decisions are made by FQXi judges, and I think they will judge your work well.
I am very glad that you agree about the 3-D structure. What you say about ionic memristors is very interesting! I'm glad to hear this. I hope we stay in touch.
Best,
Edwin Eugene Klingman
Author Natesh Ganesh replied on Mar. 26, 2017 @ 20:59 GMT
Dear Edwin,
Thank you for your kind rating. Yes, I agree with you about the sad trolling that has been going on, which I fear is hurting the contest overall. I was hit with five consecutive 1's without any feedback, which sent my essay into freefall and left me disheartened earlier. Hopefully I will have the opportunity to have the work judged by the FQXi panel. Good luck in the contest; I would very much like to stay in touch. Thanks.
Natesh
Robert Groess wrote on Mar. 25, 2017 @ 09:10 GMT
Dear Natesh Ganesh,
Thank you for your beautifully written, and rigorously argued essay. I agree that your "minimal dissipation hypothesis" is a very good indicator of intent and that goal-directed agency emerges from, as you put it, systems that dissipate minimally.
Just as a quick question, have you followed some of Charlie Bennett's work on computational efficiency from thermodynamic considerations? I had the privilege of spending some time with him about a decade ago and found him to be a great source of talent and inspiration in that regard.
Good luck in the contest, I have rated your essay and thoroughly enjoyed reading it.
Regards,
Robert
Author Natesh Ganesh replied on Mar. 26, 2017 @ 19:37 GMT
Dear Robert,
Thank you for your encouraging reply. I am happy to hear that you liked my submission. Yes, I do think that "minimal dissipation" might provide a sufficient condition for the emergence of goal-oriented agency.
Yes, I have come across Bennett's work! I think he has been one of the most influential thinkers of our time! I study the fundamental thermodynamic limits to computing as part of my dissertation, and I often use the works of Landauer and Bennett. I also like his work on reversible computing, and am hoping the field will gain more momentum. My favorite paper of his is "Dissipation-error tradeoff in proofreading." (Apologies for the long-winded rant.)
Good luck in the contest. I will definitely take a look at your submission. Thanks.
Natesh
PS: My title is actually a play on words on Landauer's famous paper "Information is Physical".
Member Simon DeDeo wrote on Mar. 26, 2017 @ 19:50 GMT
Dear Natesh —
Let me ask a very basic question. Say I take a simple Newtonian system, two planets orbiting around each other.
I hit one with a rock, and thereby change the orbital parameters. There's a map from the parameters that describe the incoming rock to the resulting shift in the system. The system appears to have "learned" something about the environment with minimal (in fact, zero) dissipation.
If I let the rock bounce off elastically, then there is strictly no change in entropy. I could probably arrange the environment in such a way that the system would show decreasing amounts of change to rocks flying at random times in from a particular direction. In general, there will be nice correlations between the two systems.
Why is this open system not an inferential agent?
I suppose I'm trying to get a sense of where the magic enters for you. I think you're cashing out efficiency in terms of the KL distance between the "predictor" at time t and the world at time t+1, presumably with some mapping to determine which states correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computational complexity doesn't appear?
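To be concrete about the measure I have in mind, here is a minimal sketch (the distributions and names are an invented example of mine, not anything from your essay):

import numpy as np

def kl_bits(p, q, eps=1e-12):
    # D_KL(p || q) in bits, for two discrete distributions over the same states
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log2(p / q)))

# Hypothetical coarse-grained states: the agent's forecast at time t
# versus the world's empirical distribution at time t+1.
predictor_t = [0.7, 0.2, 0.1]
world_t1 = [0.6, 0.3, 0.1]
print(kl_bits(world_t1, predictor_t))  # ~0.04 bits: an efficient predictor

The smaller this number across the input distribution, the better the "efficiency" reading; my worry is about cases where the number is small for the wrong reasons.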
Thank you for a stimulating read. It's a pleasure to see the Friston work cited alongside (e.g.) Jeremy England.
Yours,
Simon
Author Natesh Ganesh replied on Mar. 26, 2017 @ 20:29 GMT
Dear Simon,
Thank you for your comments and questions. This is a nice coincidence: I just finished reading about the Borgesian library and am currently on section 2, "the physics of the gap". Great piece of writing, and I will reach out on your page once I am done reading, re-reading, and digesting it.
"Why is this open system not an inferential agent?"
--> Yes, it technically is, for that very particular environment providing those particular input signals. If those planets saw ONLY the type of conditions that allowed them to maintain a specific macrostate at minimal dissipation, then we might have to entertain the possibility that the system is an inferential agent in that environment. In section 2 of my submission, I introduced the link between minimal dissipation and learning. I added section 4 not only to show the link to England's work, but also to explain why we should focus on systems that are minimally dissipative over all the input signals from their environment that they might encounter while maintaining their macrostate. For example, if we thought about a system that was minimally dissipative for one input signal but not the rest, I would think that system is not an inferential agent, unless the probability of that particular signal goes to 1.
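To put a toy number on that last point (the values below are invented purely for illustration), the relevant quantity is the dissipation averaged over the whole input distribution, not over one favorable signal:

import numpy as np

# Hypothetical per-input dissipation (in units of kT) for two systems,
# facing the same three input signals with probabilities p.
p = np.array([0.1, 0.3, 0.6])
diss_A = np.array([0.05, 0.9, 1.2])  # minimal only on the rare first input
diss_B = np.array([0.30, 0.3, 0.3])  # uniformly close to its bound

print("E[diss] for A:", p @ diss_A)  # 0.995 kT: not minimally dissipative overall
print("E[diss] for B:", p @ diss_B)  # 0.300 kT: the better candidate for inference

System A only looks efficient if the probability of its favorite input goes to 1, which is exactly the caveat above.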
"This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail. Perhaps because the notion of computation complexity doesn't appear?"
--> Can you please give me a simple example? I seem not to be following here. If you mean that it is possible to construct simple cases of systems that are minimally dissipative in a particular environment and do not learn anything, my first guess is that such a system does not possess sufficient complexity to do so, hence not satisfying that constraint of the hypothesis. After all, there are long stretches of blissful, dreamless, unconscious sleep in which we don't learn or infer anything either, which would be explained by changes to our computational complexity while maintaining minimal dissipation.
On a side note, given our finite computational complexity, and if our brain is indeed a minimally dissipative system, I do wonder whether that might explain why there are some computational problems that our brain simply cannot solve by itself.
I agree that both Friston's and England's works are very influential, and they drove me to look for a link between the two. Hopefully I have satisfactorily answered your great questions; if I have not, please let me know and I will take another crack at it.
Cheers
Natesh
PS: I am continuing to work on an updated version of the essay to better clarify and explain myself without the constraints of a word limit. The questions you have asked are very useful, and I will include explanations in that version to better address them.
Member Simon DeDeo replied on Mar. 26, 2017 @ 21:10 GMT
Dear Natesh — thank you for your very thoughtful response.
You asked me about this remark:
"I think you're cashing out efficiency in terms of the KL distance between the "predictor" at time t and the world at time t+1, presumably with some mapping to determine which states correspond. This seems to work very well in a lot of situations. But you can also construct cases where it seems to fail."
saying, "Can you please give me a simple example?"
So an example would be running two deterministic systems with identical initial conditions, with one started a second after the first. The first machine would be a fantastic predictor and learner. There's correlation, but some kind of causal connection, once the initial conditions are fixed, is missing from the pair. Minimally dissipative.
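Here is that limit case as a toy computation (my own construction, with an arbitrary chaotic map standing in for the dynamics):

# Two copies of one deterministic map, identical initial conditions,
# the second copy simply started one tick later. No signal ever passes
# between them.
def step(x, r=3.7):
    return r * x * (1.0 - x)  # logistic map: deterministic

x = 0.2
traj1 = [x]
for _ in range(10):
    x = step(x)
    traj1.append(x)

x = 0.2  # the delayed twin
traj2 = [x]
for _ in range(9):
    x = step(x)
    traj2.append(x)

# System 1 is always one map-step ahead of system 2, so its current
# state "predicts" system 2's next state exactly:
print(all(traj1[t + 1] == step(traj2[t]) for t in range(9)))  # True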
Another example (more complicated, but it works for probabilistic/non-deterministic evolution) would be the Waterfall (or Wordstar) problems. With a lot of work, I can create a map between any two systems. It might require strange disjunctive unions of things ("System 1 state A corresponds to System 2 state B at time t, C at time t+1, W or X at time t+2...") and be very hard to compute, but it's there. I'm not sure how dissipative the two could be, but my guess is that it's hard to rule out the possibility that the coarse-grained state spaces the maps imply could have low dissipation.
(Scott Aaronson has a nice piece on computational complexity and Waterfall problems: http://www.scottaaronson.com/papers/philos.pdf)
You see a version of this in the ways in which deep learning algorithms are able to do amazing prediction/classification tasks. System 1 it turns out, with a lot of calculations and work, really does predict System 2. But if System 1 is the X-ray image of an aircraft part and System 2 is in-flight airplane performance, does it really make sense to say that System 1 has "learned", or is inferring, or doing anything agent-like? Really the effort is in the map-maker.
Yours,
Simon
Author Natesh Ganesh replied on Mar. 26, 2017 @ 22:15 GMT
Dear Simon,
I address your comments/questions below:
"So an example would be running two deterministic systems, with identical initial conditions, and with one started a second after the first. The first machine would be a fantastic predictor and learner. There there's correlation, but some kind of causal connection, once initial conditions are fixed, is missing from the pair. Minimally dissipative."
--> Please bear with me; I take my time in understanding all the tiny details completely. Correct me if I am wrongly characterizing what you are saying: if the two systems are run in the manner that you describe, are you saying that the joint system is minimally dissipative, or just the second system? If the joint system is minimally dissipative, then the correlation between the two would be plastic, as expected. I mention this at the start of section 5, where I discuss how subsystem relationships should be plastic if the joint system is minimally dissipative. The correlation will hence vary depending upon the input provided. Does that answer your point?
"Another example (more complicated, but works for proabilistic/non-deterministic evolution) would be the Waterfall (or Wordstar) problems."
--> Let me get back to you on this once I have a firmer grasp on what these problems are exactly. I remember reading about them on Aaronson's blog a while ago, and I need to revisit it. Thank you for that particular link. I am an avid fan of his blog and work, and the updated version of the essay has references to his blog post on Integrated Information Theory.
"You see a version of this in the ways in which deep learning algorithms are able to do amazing prediction/classification tasks. System 1 it turns out, with a lot of calculations and work, really does predict System 2. But if System 1 is the X-ray image of an aircraft part and System 2 is in-flight airplane performance, does it really make sense to say that System 1 has "learned", or is inferring, or doing anything agent-like? Really the effort is in the map-maker."
--> I agree that while deep learning networks learn in a manner similar to us, there are large differences between us and deep learning algorithms. Along the lines of John Searle's Chinese room argument, I would argue that such algorithms are only syntactical; there are no semantics there. Furthermore, running such algorithms on von Neumann architecture GPUs (as they traditionally are run) means these are not minimally dissipative systems. I think plastic subsystem connections are needed for any system to be minimally dissipative, and the von Neumann architecture does not have that. If we move to systems with a neuromorphic architecture, then it becomes a lot more interesting, I think.
I agree with you that the effort is really in the map-making, and this is why I am very interested in unsupervised learning with an array of devices called memristors (look for Prof. Yang's group at UMass Amherst; they are doing cool things like this). Short of starting with an artificial primordial soup and evolving/self-organizing an artificial brain on silicon in an accelerated manner, I think such an approach is the best way to test my ideas and build an agent remotely close to us. (Since we know some things about the final product, a.k.a. our brain, we can cheat and start with an array of memristors, since they can behave as neurons and synapses. How to impose other thermodynamic constraints on this array is something I am thinking about now.) We just set up the array of physical devices without any preprogramming or map-making, let it run, supply it with inputs, and allow it to make its own maps and provide outputs. If such a system is able to answer questions about flight performance based on an X-ray image of the airplane, I think (a) that would be amazing, and (b) we would have to seriously entertain the possibility that it is an agent like us (I am not touching the question of whether such an agent is conscious with a ten-foot pole, haha).
I hope I didn't miss anything and have answered your questions. Let me know if I need to clarify anything further.
Cheers
Natesh
PS: In all of this, I think I might have to seriously step back and see if there is some fundamental difference between self-organized systems and systems designed by other 'intelligent' systems, and whether that changes things.
Member Simon DeDeo replied on Mar. 27, 2017 @ 00:49 GMT
Dear Natesh —
It's fun to go back and forth on this.
If the time-delayed system is indeed learning according to your scheme, this seems to be a problem for your scheme. Two independently-evolving systems should not be described as one "learning" the other. Of course, it is a limit case, perhaps most useful for pointing out what might be missing in the story, rather than claiming there's something bad about the story.
The machine learning case provides a different challenge, I think. You seem to agree that the real difficulty is contained in the map-making. But then this makes the prediction/learning story hard to get going without an external goal for the map-maker. Remember, without some attention to the mapping problem, the example of X-ray images predicting in-flight behavior implies that the X-ray images themselves are predicting/learning/in a goal-directed relationship with the in-flight behavior; not the algorithm, which is just the discovery of a mapping. More colloquially, when my computer makes a prediction, I have to know how to read it off the screen (printout, graph, alarm-bell sequence). Without knowledge of the code (learned or discovered post hoc) the prediction is in theory only.
You write, "In all of this I think I might have to seriously step back and see if there is some fundamental difference between self-organized systems and those systems which are designed by another 'intelligent' systems, and if that changes things." I think that might be the main point of difference. I'm happy to use the stories you tell to determine whether an engineered system is doing something, and this seems like a really interesting criterion. Yet I'm just not sure how to use your prescriptions in the absence of (for example) a pre-specified agent who has desires and needs satisfied by the prediction.
Thank you again for a provocative and interesting essay.
Yours,
Simon
Anonymous replied on Mar. 27, 2017 @ 19:01 GMT
Dear Simon,
"It's fun to go back and forth on this."
-->Agreed.
"If the time-delayed system is indeed learning according to your scheme, this seems to be a problem for your scheme. Two independently-evolving systems should not be described as one "learning" the other."
--> I think I misunderstood the problem you had presented (a simple case of lost in translation, I guess). If the two systems are evolving independently and there are no inputs being presented to either one of them, then I am not sure what it is that they can learn in the first place. But then again, if this is a limiting case of no inputs at all, I must think about it further. Since my derivations start with the assumption that there are external inputs affecting the physical system in question, I would say that a system that just evolves without being affected by external inputs, while dissipating minimally, is not learning anything. This is further captured by the fact that the mutual-information complexity measure can serve as a measure of memory/history in the system.
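For concreteness, here is the kind of measure I mean by memory, as a toy plug-in estimate (the data and names are mine, invented for this comment):

import numpy as np
from collections import Counter

def mutual_information_bits(xs, ys):
    # Plug-in estimate of I(X;Y) in bits from paired samples
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), c in pxy.items():
        mi += (c / n) * np.log2(c * n / (px[x] * py[y]))
    return mi

# A state that mirrors its past input carries one bit of memory about it;
# a state that ignores the input carries none.
inputs = [0, 1, 1, 0, 1, 0, 0, 1] * 50
states_remember = inputs[:]
states_ignore = [0] * len(inputs)
print(mutual_information_bits(states_remember, inputs))  # 1.0 bit
print(mutual_information_bits(states_ignore, inputs))    # 0.0 bits

A system evolving with no inputs at all has nothing on the right-hand side of this measure, which is why I would not call it learning.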
"Remember, without some attention to the mapping problem, the example of X-ray images predicting in-flight behavior implies that the X-ray images themselves are predicting/learning/in goal directed relationship to the in-flight behavior; not the algorithm, which is just the discovery of a mapping."
--> I agree that the X-ray image itself cannot predict, but a minimally dissipative system which is presented with the X-ray image as an input affecting its state transitions might be capable of learning and predicting from the input image.
"Yet I'm just not sure how to use your prescriptions in the absence of (for example) a pre-specified agent who has desires and needs satisfied by the prediction."
--> I argue that my constraints specify which systems could be goal-oriented agents in the first place, and that goals and desires are created and evolve as such systems interact with their input environment.
"Thank you again for a provocative and interesting essay."
--> Thanks for a very stimulating discussion. I am pretty convinced that I should rename the minimal dissipation hypothesis to something like the "dissipation-complexity tradeoff principle" to reduce confusion.
Cheers
Natesh
Member Simon DeDeo replied on Mar. 28, 2017 @ 02:24 GMT
Dear Natesh —
I'm just finishing up an article on learning and thermodynamic efficiency (using the Still et al. framework of driving), so my head's full of a set of ideas that are competing and overlapping with your insights here. To be clear, I think this is a fantastic piece, and one of the most provocative in a (very good) bunch.
I hope we see more cross-over work at the interface of origin-of-life studies, thermodynamics, and machine learning, and I encourage you to publish a version of this in a journal (you might consider Phys. Rev., or perhaps the journal Entropy).
Yours,
Simon
Don Limuti wrote on Mar. 26, 2017 @ 23:20 GMT
Hi Natesh,
The posts in this blog are as interesting a conversation as any in this contest. In particular, your conversation with Ines Samengo is most interesting. More on that in a moment.
The wording of FQXi.org's contest is nebulous unless you realize it is about Tegmark's MUH. Tegmark's emphasis is on Mathematics. Landauer's emphasis is on Information. Your emphasis is on Intention. My emphasis is on how we choose. I would make a hierarchy as shown below:
"Mathematics is Physical"...........Tegmark
"Information is Physical"..............Landauer
"Intention is Physical"..................Ganesh
"Choice (intention from a personal viewpoint) is Physical (maybe), but we can never know it"......Limuti
I did read your essay, and honestly I had trouble following it (I did, however, spot the insulated-gate MOSFET structures :))
The image your essay triggered in me was Valentino Braitenberg's book "Vehicles: Experiments in Synthetic Psychology". It is easy to make the vehicles look as if they had "emergent" goals.
Non-equilibrium thermodynamics as treated by you and Ines was interesting. Ines brought out the memory clearing needed by Maxwell's demon to control the entropy (I think I got that right). Perhaps this memory clearing is why we can't know how we choose. For example, move your finger. How did you do that? Do not point to MRIs or brain function. I maintain that you have no direct experiential record (knowledge or memory) of how you moved your finger. I believe the answer is that you moved your finger, but you do not know directly how you did it. Was Maxwell's demon involved? I know this is a bit esoteric, but I would like to know what you think.
In my essay I hoped to get across how convoluted the language of determinism and free will is. Don and Lexi each took a side. However, each also unconsciously used the other's viewpoint during the conversation.
You forced me to think... a minor miracle. Therefore this is a super essay!
Thanks,
Don Limuti
Author Natesh Ganesh replied on Mar. 27, 2017 @ 18:42 GMT
Hi Don,
Thank you for your very kind comments. I am glad to see that you liked the essay. Ines's work was outstanding and it was very insightful to discuss ideas with her.
"The image your essay triggered in me was Valentino Braitenber's book "Vehicles, Experiments in Synthetic Psychology". It is easy to make the vehicles look as if they had "emergent" goals. "
--> I will check this book out.
"In my essay I hoped to get across how convoluted the language of determinism and freewill is. Don and Lexi each took a side. However, each also used Unconsciously the other viewpoint during the conversation."
--> Ha! Wonderful. I did not immediately get that, but it adds much more to your submission. Thanks.
Cheers
Natesh
Stefan Keppeler wrote on Mar. 28, 2017 @ 15:56 GMT
Dear Natesh,
thanks for your kind comments on my page, which led me to your interesting essay.
I'm afraid you lose me on page 2. What is [equation]? A Hilbert space? What are the [equation]? A basis for this Hilbert space? Similarly, what are [equation] and the [equation]? What does [equation] mean? Is that some kind of product? The transition mappings [equation]: are they unitary, stochastic, or...? You write that some time evolution is governed by a Schrödinger equation; what's the corresponding Hamiltonian? How is this Hamiltonian related to the [equation]?
Or maybe we can go back one step, away from the technical details: What does it mean that a system has "constraints on its finite complexity"? And can I think of dissipation as energy transfer from the system to the heat bath?
Sorry for so many questions, I just feel I can't get the message, when I don't even understand the terminology on the first few pages.
Cheers, Stefan
PS: Sorry for the rendering - I don't know how to do inline math here. Each equation-tag causes a linebreak. :-(
Author Natesh Ganesh replied on Mar. 28, 2017 @ 16:28 GMT
Hi Stefan,
No problem at all. I had the same problem and pretty much gave up on using LaTeX in this forum :D. Given the word limit, I could not explain all the terms you listed in detail, but here is a paper with all the details: Ganesh, Natesh, and Neal G. Anderson, "Irreversibility and dissipation in finite-state automata," Physics Letters A 377.45 (2013): 3266-3271. Let me know if you have trouble accessing it.
The paper was written for deterministic automata, but the extensions to stochastic mappings hold. The entire referent-system-bath universe evolves unitarily, but the system evolution can be (and probably is) non-unitary. The shortened version: S is the system in which the FSA is instantiated, with states \sigma. R=R0R1 is the joint system of past inputs R0 and present input R1, with x being a string from that distribution of inputs (in the classical case, all of these are essentially random variables). L is the transition mapping, for which the corresponding Hamiltonian of the global joint system can be constructed so as to achieve the necessary state transition.
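As a cartoon of the bookkeeping involved (a toy of my own for this thread, not the derivation in the paper, which also accounts for the input history and its correlations):

import numpy as np

def shannon_bits(probs):
    # Shannon entropy in bits of a discrete distribution
    p = np.asarray(probs, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Toy FSA with states {0,1} and transition L(sigma, r1) = sigma AND r1.
# The input r1 and the initial state are unbiased coin flips.
p_state_in = {0: 0.5, 1: 0.5}
p_input = {0: 0.5, 1: 0.5}
p_state_out = {0: 0.0, 1: 0.0}
for s, ps in p_state_in.items():
    for r, pr in p_input.items():
        p_state_out[s & r] += ps * pr  # L is many-to-one: logically irreversible

H_in = shannon_bits(list(p_state_in.values()))    # 1.0 bit
H_out = shannon_bits(list(p_state_out.values()))  # ~0.81 bit
print("crude Landauer-style floor:", H_in - H_out, "x kT ln2 per step")

The many-to-one (irreversible) transitions squeeze the state entropy, and that squeeze is what shows up as a lower bound on dissipation into the bath.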
"What does it mean that a system has constraints on its finite complexity"?
--> If the complexity of the system can be captured by a mutual information measure, then a finite-state automaton with a finite number of states can only have finite complexity. When we optimize a variable while keeping another condition constant, we call it constrained optimization, and the condition a constraint.
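In symbols, the kind of constrained problem I mean looks like this (schematic notation only; the precise dissipation and complexity terms are the ones in the essay):

\min_{L} \ \langle E_{\mathrm{diss}} \rangle
\quad \text{subject to} \quad I(\sigma ; R) = C,
\qquad
\mathcal{L} = \langle E_{\mathrm{diss}} \rangle - \beta \left( I(\sigma ; R) - C \right),

where \beta is the Lagrange multiplier attached to the mutual-information complexity constraint.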
"And can I think of dissipation as energy transfer from the system to the heat bath?"
---> Yes! That's exactly what it is. Details are in that paper again.
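For a sense of scale (the standard Landauer figure, not something specific to my paper): erasing one bit at T = 300 K must dump at least

E_diss >= k_B T ln 2 = (1.38 x 10^-23 J/K)(300 K)(0.693) ~ 2.9 x 10^-21 J

into the bath, and minimal dissipation means operating close to floors of this kind.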
Thanks for your questions. I wish I had more space to explain all the terms in detail. I am working on a more formal paper now, and hopefully I can be a lot more detailed in it, so as to avoid confusion. Let me know if there are any more points to be clarified and I shall be happy to do so.
Cheers
Natesh
Stefan Keppeler replied on Mar. 29, 2017 @ 19:50 GMT
Dear Natesh,
thanks, after reading your Phys. Lett. A article, I think I understand the definitions. I think I also understand roughly how you obtain the bound (3) in your article. There is a similar (but not identical?) bound on page 2 of your essay, which I think is neither derived in your article nor in your essay -- or did I overlook anything?
Cheers, Stefan
Author Natesh Ganesh replied on Mar. 29, 2017 @ 20:45 GMT
Hi Stefan,
"There is a similar (but not identical?) bound on page 2 of your essay, which I think is neither derived in your article nor in your essay -- or did I overlook anything?"
--> Yes, the bound in the essay is not derived there but is an extension of the bound in the Phys. Lett. paper. The bound in that paper was derived for independent inputs, i.e. R0 and R1 independent. The bound in the essay is derived for correlated R0 and R1, thus generalizing the bound from the Phys. Lett. paper (I am writing a new paper on this generalization, but it will hold if you follow the same set of steps from the earlier paper). The bound in the essay reduces to the one in the 2013 paper if you assume R0 and R1 have zero correlation, with the last term in equation (3) going to zero. Hope that explains everything. I am glad to see you are being extremely rigorous with the essay. Please keep the questions and comments coming.
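Schematically, and only as a mnemonic for the structure (my shorthand here, not the exact expression, which is equation (3) of the essay):

<E_diss> >= kT ln2 [ (independent-input terms, as in the 2013 bound) + (a last term carrying the R0-R1 correlation, e.g. of the form I(R0;R1)) ],

with that last term vanishing when R0 and R1 are independent, which recovers the 2013 result.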
Cheers
Natesh
Stefan Keppeler replied on Mar. 31, 2017 @ 21:42 GMT
Hi Natesh,
I'm slowly making progress. It's not so easy since it's essentially all new for me...
In your Sec. II you state and justify your minimal dissipation hypothesis. Towards the end of Sec. II you conclude that "learning dynamics are inevitable in a trade off between energy dissipation and statistical complexity". Your essay prompted me to have a (superficial) look at Still et al. 2012, who conclude that "making a predictive model of the environment and using available energy efficiently [are fundamentally related]". Is that essentially the same as your minimal dissipation hypothesis or is there a subtle difference which I'm missing?
Thanks for bearing with me, cheers, Stefan
Author Natesh Ganesh replied on Apr. 2, 2017 @ 19:25 GMT
Hi Stefan,
"Is that essentially the same as your minimal dissipation hypothesis or is there a subtle difference which I'm missing?"
--> The link between learning and energy dissipation itself is not new; energy efficiency as a possible unifying principle has been touted before. Still obtains the bounds in her paper (derived under different assumptions) and suggests the same idea of a link between the two. The main difference is that I go further and hypothesize that learning is simply a manifestation of energy-efficient dynamics, and (as explained in section 4 of the essay) that perhaps we need a framework in which evolution and learning are manifestations of larger thermodynamic principles. The evolution part has been suggested by England's work (discussed in the essay); I relate it back to the minimal dissipation hypothesis and show how, by framing the problem as I have in section 2, we can relate it back to known ideas in machine learning, neuroscience, etc. I saw a recent video in which Still was trying to do something similar, starting from her derivation of the bound and a different setup. I am reaching out to her to get her thoughts on this. Hope that answers your question.
Cheers
Natesh
Stefan Keppeler replied on Apr. 3, 2017 @ 20:01 GMT
Hi Natesh, thanks for guiding me through your innovative work. And good luck in the contest, Stefan
Cristinel Stoica wrote on Mar. 29, 2017 @ 00:26 GMT
Hi Natesh,
Very interesting and well-written essay! I liked the idea of the minimal dissipation hypothesis, and how you used it to connect learning dynamics, the emergence of goal-oriented agency, and the biological evolutionary process.
Best regards,
Cristi
Author Natesh Ganesh replied on Mar. 29, 2017 @ 20:58 GMT
Hi Cristi,
Thanks for your comments. It is definitely a very interesting idea, and I intend to keep working on it. I have read and rated your essay; it was a very good piece of work. Good luck in the competition. Thanks.
Cheers
Natesh
PS: Kindly rate my essay if you haven't already. If you have, thank you very much for doing so.
Vladimir Nikolaevich Fedorov wrote on Mar. 31, 2017 @ 13:11 GMT
Dear Natesh,
With great interest I read your essay, which of course is worthy of the highest rating.
I am glad that you have your own position:
«I will present the fundamental relationship between energy dissipation and learning dynamics in physical systems. I will use this relationship to explain how intention is physical, and present recent results from non-equilibrium thermodynamics to unify individual learning with dissipation driven adaptation.»
«I will refer to as the minimal dissipation hypothesis»
Your assumptions are very close to mine:
«the phase space characterization of self-organized systems which dissipate minimally, improved understanding of internal control mechanisms to maintain criticality, and detailed formulations of cognitive states as phase transitions in a (non-chaotic strange) attractor.»
You might also like reading my essay, where it is claimed that quantum phenomena occur in the macrocosm due to the dynamism of the phase state of the elements of the medium, in the form of de Broglie waves of electrons, where parametric resonance and solitons occur; this mechanism of operation is analogous to the principle of a heat pump. At the same time, «the minimal dissipation hypothesis» is realized.
I wish you success in the contest.
Kind regards,
Vladimir
Jochen Szangolies wrote on Apr. 3, 2017 @ 14:36 GMT
Dear Natesh,
thanks for an interesting, very densely-packed essay! Your minimal dissipation hypothesis carries some immediate intuitive heft: anything minimizing its dissipation must in some ways adapt to the environment. You then turn traditional reasoning on its head, subverting the expectation that because something learns, it may minimize its dissipation (a good thing for any living system with bounded resources), arguing rather that such minimization itself is simply what constitutes learning.
It's sort of like the thinking that got rid of élan vital: once we've explained the moving around, reproducing, seeking out of food etc. it became clear that we don't need additional magic fairy dust---those sorts of things are just what's meant by the term 'life', they're not the consequence of a life-giving force being present. So in a sense, I see you attempting to do something similar for 'learning': once we've realized minimal dissipation in the agent, we find there's nothing else left over.
I'm a bit puzzled regarding your occasional mentions of quantum systems---it seems to me that essentially the same analysis could be carried out classically; nothing seems to ride on any specifically quantum features, such as superposition, interference, or quantum correlations.
Hope you do well in the contest!
Cheers,
Jochen
Author Natesh Ganesh replied on Apr. 3, 2017 @ 19:18 GMT
Hi Jochen,
Thank you for your kind comments! The size limitation forced my hand with respect to the density. I could not have stated my ideas and thoughts better myself, and I might borrow your comments on 'getting rid of élan vital' to better explain this in the future. In retrospect, I should perhaps have limited the broad scope and focused in greater detail on certain topics.
Yes, you raise a good point. The derivations are done for quantum systems because that is the regime I usually work with for my dissertation in nanoelectronics: classical information stored in quantum systems. No specifically quantum features needed to be invoked in my submission, and as I update this submission and work on a formal paper, I will make that much clearer. Expressions similar to mine can conceivably be derived for classical systems (and I think they would equal my expressions, with classical Shannon entropy terms). But I am interested in obtaining the equivalent quantum operators/mappings in future work, and for that the current framework will serve well. Thanks for pointing it out.
Good luck to you in the contest as well!
Cheers
Natesh
Rajiv K Singh wrote on Apr. 3, 2017 @ 15:18 GMT
Dear Ganesh,
I suppose you like critical examination of your essay. I must confess that I really could not follow the mathematical derivation entirely, maybe due to my own limitations. But I will grant you the concluding remarks based on those mathematical expressions. I read this essay twice over a fortnight.
I take the following statement as your motivation. "Open physical systems with constraints on their finite complexity, that dissipate minimally when driven by external fields, will necessarily exhibit learning and inference dynamics."
In Fig. 1b, at the first stage we see the external input coming in, which is mixed with the prediction of the same coming from the higher level, and up goes the 'prediction error'. This is OK, but from the next stage onwards, we see that the predictive estimator (processor/comparator) receives only the prediction error from the lower level and the feedback prediction from the next higher level. A prediction from a higher level cannot be compared with the prediction error that took place at the lower level; it would make no sense. A predictive estimator must receive an appropriate value derived (or predicted) from the observation at the lower level in order to be able to compare and generate a prediction error. I suppose the direction of flow is incorrect. In fact, a predictive estimator should generate a prediction error internally from the predictions coming from both sides, and use the error to predict for the next higher level as well as for the lower level, in a form appropriate for each side. Natesh, in the case of processing systems, always take the limiting cases to test the hypothesis. For example, when the system makes its first observation, at the lowest level there is no prediction coming back from the higher level with which to compute the error. Similarly, at the highest level there is no prior action to correct with only the incoming prediction error. Furthermore, note that in any realistic system, a module may receive input from multiple modules and send its output to multiple modules.
"The joint system SA is a quantum system with two components." From this I also gather a classical system might not be able to achieve what a quantum system does, otherwise, there was no need to classify it as quantum. But then, later on I notice that you identify neo-cortex as S, and A as motor-cortex. I trust, you are equating neo-cortex and motor-cortex as quantum systems, a hard to gulp inference.
"Agency is the capacity of a system/entity/agent/organism to act on it’s environment." And if all physical entities satisfy this definition of agency, then I do not see the need of a separate definition taking the attention of some readers on the side of psychology. Being a part of environment, any reaction to the physical context is equivalent to altering the environment. But when you say, "(I am imbuing system A with agency, but not with a specific goal or purpose)", it is as if there could be a system without agency. As you defined earlier, all physical entities are natural agents. So, by stating this, you are priming a reader with certain preconceived notion of agency. Again when you say, "The optimal encoding of R0 in SA is a trade-off between exploiting known information and exploration", where does the exploration come from? I understood that A would simply react physically as per the input from S. But this reaction is aimless. The term 'exploration' also achieves the same goal of priming the reader with certain kind of agency, reinforcing the sense. "While the state of A depends upon balancing exploration with prediction", further enhances this sense.
Even in cases where the system SA is evolving to predict the incoming input correctly, it is just a prediction of the system R; where do purposeful goals, for self-sustenance or whatever, come into the picture? Therefore, I suppose, one has to design an extra element into the SA system, such that S tries to optimize some parameter and signals A to act in a particular manner. Otherwise, why would S set itself the task of throwing a ball in any manner, let alone trying to dunk? The purpose also has to be artificially coded.
"Due to these past inputs, let the state of the system A (motor cortex) that is most likely given the prediction-exploration trade off, corresponds to the action "throw the ball." How did the first input come, and what would be any reason to throw the ball at all?
"We will define sense of agency (SA) as the pre-reflective subjective awareness that one is initiating, executing, and controlling one’s own volitional actions in the world."
"Thus the joint state of SA=("see ball being thrown","throw ball") as the ball is thrown will explain the sense of agency, the awareness of an action being performed as it being performed." The association of an awareness of an action being performed to the system SA is in your/our mind. I do not see where and how exactly this sense of awareness is represented in SA. Then you rain statements like, "For example, in the case of visual perception of a face, the higher levels make predictions corresponding to say, 'seeing a face'." I can accept that the system may have representation of all the parts described, but I do not see how 'seeing a face' is represented.
The masterpiece of all statements is, "Similarly predictions made in the higher levels of the hierarchical model in SA, under the minimal dissipation hypothesis, would correspond to the higher level intention of the action-sense of agency (like say "win game" in our example)...."
As I said about your system, goals and purposes will not arise unless specifically coded into the system; the same applies to all systems. In a system like the brain, such coding is achieved by the process of natural evolution in the Darwinian sense. You may quote me on any statement here.
Then comes the attribution of ownership: "Crucial to this process though, is a sense of ownership that the system will learn over time about what is within the system's control and what is beyond that." Natesh, what you can see as a logical extension from your own perspective on the relation between an actor and its acts, you assign to the system.
"... arguments have been made for inherent intentionality in every perception event [12]. We can view the upper levels of the hierarchical model in the brain as the source of only intentions and make a strong case that intention is physical."
Imagine if we say, "intention is a specification of an information represented in a physical system", then it does not remain physical, yet it has origins in physical systems. But then, if we insist on the paradigm of 'intention is physical', then there must be a way to measure it. Though I trust what you may have meant is 'intention' arises from physical function of the universe, it does not require or depend on any non-physical phenomena.
As a concluding remark, I am going to consider a stone as a system S embedded in a surrounding heat bath, the air, in thermal equilibrium. A puff of wind blows, applying a certain force on the stone, but the stone remains undisplaced and no exchange of heat (energy) takes place, i.e., the stone dissipates minimally. In such a scenario, what learning has taken place in S such that it can predict the wind? So, any development from the minimal dissipation hypothesis must conform to this limiting case. I suppose you may require some other constraint in addition.
In an exchange with Ellis, you wrote, "Thanks for a delightful exchange. I am enjoying myself!!" I consider you a system like SA; so which component of S and A is referring to itself as the enjoyer, and which component is being enjoyed? And why would both be claimed to be oneself?
I feel favorably inclined to consider a reasonably good rating for the clever use of terms, so wisely chosen that the reader might end up with the notion that goals emerge from the minimal dissipation hypothesis. Mr Natesh, you are a magician too.
Rajiv
Author Natesh Ganesh replied on Apr. 3, 2017 @ 20:10 GMT
Hi Rajiv,
Lots to unpack here, but I will do my best to answer all your queries. To save space, I will paste a few lines from your comments while trying to address the entire paragraph.
"A prediction from higher level cannot be compared with the prediction error that took place at the lower level, it would make no sense. A predictive estimator must receive appropriate modular value derived (or predicted) from observation from lower level in order to be able to compare or generate prediction error...."
--> I did not include the details of what is inside the box of each predictive estimator, but left it to readers who want to know more to acquaint themselves with the concept. The direction of flow I employ is consistent with what is used in models of hierarchical predictive coding.
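As a minimal toy of that flow (a two-level, Rao-Ballard-flavored sketch; everything here is my own invention for illustration):

import numpy as np

# Each level sends a prediction down and receives a prediction error up.
rng = np.random.default_rng(0)
mu1, mu2 = 0.0, 0.0  # level-1 and level-2 estimates
lr = 0.1
for t in range(200):
    x = 1.5 + 0.05 * rng.standard_normal()  # sensory input, true mean 1.5
    e0 = x - mu1    # bottom-up error: input vs level-1 prediction
    e1 = mu1 - mu2  # error between level 1 and the level-2 prediction
    mu1 += lr * (e0 - e1)  # level 1 balances errors from below and above
    mu2 += lr * e1         # level 2 only ever sees the error from below
print(round(mu1, 2), round(mu2, 2))  # both settle near 1.5

Note that the higher level never touches the raw input; it only receives the error from the level below, which is the direction of flow in question.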
"Natesh, in cases of processing systems, always take the limiting cases to test the hypothesis. For example, when the system makes first observation, at the lowest level there is no prediction coming back from the higher level to compute the error."
--> There could be: not necessarily a good prediction, but a prediction could be generated. Loosely stated, we do not expect newborns to make expert predictions of the world from the moment they are born, but if the brain is functioning properly, we do expect it to get better with time.
"The joint system SA is a quantum system with two components." From this I also gather a classical system might not be able to achieve what a quantum system does, otherwise, there was no need to classify it as quantum."
--> Not everyone jumped to that conclusion, but I agree that I should have been clearer that I do not think the brain is a large quantum system or needs quantum-specific features. I derived the results for quantum systems given my familiarity with them, but the same can be done for classical systems.
"(I am imbuing system A with agency, but not with a specific goal or purpose)", it is as if there could be a system without agency. As you defined earlier, all physical entities are natural agents."
--> I make distinctions between different types of agency, along the lines of involuntary, unconscious, and voluntary goal-oriented. And I have to define agency the way I use it in my submission; I do not know how to avoid that.
"The optimal encoding of R0 in SA is a trade-off between exploiting known information and exploration", where does the exploration come from? I understood that A would simply react physically as per the input from S. But this reaction is aimless. The term 'exploration' also achieves the same goal of priming the reader with certain kind of agency, reinforcing the sense."
--> The exploration part is explained in the math, and corresponds to exploration as discussed, say, in problems like the multi-armed bandit. I do not see how using the word 'exploration' is priming the reader. Please elaborate.
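For reference, here is the trade-off in its textbook form (an epsilon-greedy toy of mine, not anything from the essay):

import numpy as np

# Epsilon-greedy agent on a 3-armed bandit: mostly exploit the best-known
# arm, occasionally explore, mirroring the encoding trade-off I describe.
rng = np.random.default_rng(1)
true_means = [0.2, 0.5, 0.8]  # hypothetical payout probabilities
counts = np.zeros(3)
values = np.zeros(3)
eps = 0.1
for t in range(2000):
    arm = rng.integers(3) if rng.random() < eps else int(np.argmax(values))
    reward = float(rng.random() < true_means[arm])
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running mean of payouts
print(values.round(2))  # the exploited arm's estimate converges; rare arms stay rougher

Exploration here is just the willingness to encode inputs beyond the currently best-predicting ones, which is the sense in which I use the word.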
""Due to these past inputs, let the state of the system A (motor cortex) that is most likely given the prediction-exploration trade off, corresponds to the action "throw the ball." How did the first input come, and what would be any reason to throw the ball at all?"
--> If you threw the ball because the states of your brain generated the appropriate signals to do so, then the goal, as interpreted by your brain while the action is being performed, would be to throw the ball. You can always introspect more deeply, but I clearly state that this is beyond the scope of this essay.
"For example, in the case of visual perception of a face, the higher levels make predictions corresponding to say, 'seeing a face'." I can accept that the system may have representation of all the parts described, but I do not see how 'seeing a face' is represented."
--> I would suggest looking up John Searle's explanation of intentionality in perception for a better understanding of what I am talking about. It is not easy to discuss such a complex topic in a blog post, and Searle does a much better job than I ever can. Furthermore, the deeper levels of the brain can be seen as performing a coarse-graining, which corresponds to generating higher-level features, as one would expect in a pattern recognition system.
"Though I trust what you may have meant is 'intention' arises from physical function of the universe, it does not require or depend on any non-physical phenomena."
--> Agreed.
"As a concluding remark, I am going to consider a stone as a system S embedded in surrounding heat bath, the air in thermal equilibrium. A puff of wind blows that applies certain force on the stone, but the stone remains undisplaced, and no exchange of heat (energy) takes place, i.e., the stone dissipated minimum energy. In such a scenario, what learning has taken place in S that it can predict about wind? So, any development from minimum dissipation hypothesis must conform to this limiting case."
--> It depends. You have not specified anything about the complexity of the stone, with complexity measured in the way I have prescribed in the essay. The Lagrangian parameter beta in the submission is a variable that can be manipulated by the system itself, and depending on how the complexity changes, perhaps in the right regions of complexity the rock will learn something about the wind. Similarly, in the non-optimal regions, we don't learn anything either.
"In an exchange with Ellis, you wrote, "Thanks for a delightful exchange. I am enjoying myself!!" I consider you a system like SA, so which component of S and A is referring to itself as an enjoyer, and which component is being enjoyed? And why would both be claimed to be as oneself?"
--> If I understand this correctly, I would say it is the product of the joint system SA.
"I feel favorable to consider reasonably well rating for clever usage of the terms so wisely that the reader might end up with the notion of emergence of goals from minimal dissipative hypothesis. Mr Natesh, you are a magician too."
--> I am not sure if this is a compliment, but if you are making a personal attack, suggesting that I am the equivalent of a scientific conman using terms and phrases as a distraction, I am not going to dignify that with a response.
Natesh
Rajiv K Singh replied on Apr. 4, 2017 @ 08:05 GMT
Dear Ganesh,
It would not be wise to attempt to repeat my arguments to emphasize the differences in understanding. But I should still make two points --
1. Consider that you may have missed the central theme in each of the points that I made; if so, you may be able to discover paths to strengthen your ideas and make them robust. For example, "As I said about your system that goals and purposes would not arise unless especially coded in the system, the same applies to all systems. In a system like the brain, such a coding is achieved by the process of natural evolution in the Darwinian sense." If it so happens that you realize the truth of this statement sometime in the future, you may suddenly see a different meaning in each of my statements.
2. When I called you a magician, it was not meant to call you a 'scientific conman'. Many scientists believe that using the right descriptive words is needed to bring about an understanding of mental processes in physical terms; often they believe there isn't anything more to it than a reframing of terms. And I meant that you did a good job of that. Moreover, since I would know the meaning of your name, I thought you would understand the pun, and therefore the fun, in calling you by a word of similar meaning that goes well with your success. It looks like the humor was lost!
By the way, it is not easy to be a 'scientific conman' and succeed. Except for a tiny few, most cannot succeed at it. So, from that perspective also, even if not intended, it is a compliment.
Rajiv
P.S. At least you will have noticed the amount of time that I must have spent on your essay!
Rajiv K Singh replied on Apr. 4, 2017 @ 08:46 GMT
Dear Ganesh,
I reread my own earlier comment, and then I realized that several of the statements I wrote could open a young PhD student's mind. For example,
1. The association of an awareness of an action being performed to the system SA is in your/our mind. I do not see where and how exactly this sense of awareness is represented in SA.
So that you can easily see that we have a tendency to project the logical interpretations of our own minds onto processing systems all too easily -- a lesson in what to be wary of in our thinking.
2. I am going to consider a stone as a system S embedded in a surrounding heat bath, the air, in thermal equilibrium.
So that next time, you will peruse your own statements with greater scrutiny, and observe that they can be interpreted from a very general or a very narrow perspective.
3. "Thanks for a delightful exchange. I am enjoying myself!!" I consider you a system like SA, so which component of S and A is referring to itself as an enjoyer, and which component is being enjoyed? And why would both be claimed to be as oneself?
Placing you in a logical dilemma, so that you can ponder and grasp the deeper meaning of your objectives. It is also to demonstrate that there are always deeper implications that one should be careful about, and to open the mind to the idea that our understanding of nature must apply to our routine reflections; otherwise, you will not see the universality of application of scientific thought.
I am sure you will see that such confrontations with rationality can open the mind of a budding scientist.
Rajiv
Author Natesh Ganesh replied on Apr. 7, 2017 @ 16:04 GMT
Dear Rajiv,
I appreciate the time you took to provide detailed questions and criticisms of my work, and I thank you for your compliment. Where I come from in India, my name is not associated with the meaning 'magician', hence I could not immediately see the humor in your comment. You are a senior scientist with greater experience than I have, and I respect that; thank you for your time and your important comments, which have given me a lot to think about as I continue my work. Cheers
Natesh
Torsten Asselmeyer-Maluga wrote on Apr. 3, 2017 @ 20:59 GMT
Dear Natesh,
what a wonderful essay, full of new ideas. In particular I like your approach using finite automata (most argumentation using statistical physics needs infinite systems). Very interesting for me was the criticality hypothesis. If the collective dynamics of the brain is close to a phase transition, then this dynamics must be close to chaotic dynamics (like the period-doubling cascade of the logistic map). But at this point the underlying dynamics has a fractal state space.
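As a quick illustration of that cascade, here is a minimal Python sketch of the logistic map's period-doubling route to chaos. The helper logistic_attractor and the parameter values are mine, chosen for illustration only.

    # Period-doubling cascade of the logistic map x -> r*x*(1-x): counting the
    # distinct values on the attractor shows period 1 -> 2 -> 4 -> chaos as r grows.
    def logistic_attractor(r, n_transient=1000, n_sample=64, x0=0.5):
        """Iterate past transients, then return the distinct sampled attractor values."""
        x = x0
        for _ in range(n_transient):
            x = r * x * (1.0 - x)
        samples = set()
        for _ in range(n_sample):
            x = r * x * (1.0 - x)
            samples.add(round(x, 6))
        return sorted(samples)

    for r in (2.8, 3.2, 3.5, 3.9):  # fixed point, period 2, period 4, chaotic
        print(f"r = {r}: {len(logistic_attractor(r))} distinct attractor value(s)")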
You use a more statistical-physics point of view, but in my opinion we got similar results. Maybe you would also be interested in reading my essay? I considered a model of the brain network with a phase transition to a tree (so having a goal). Here the transition happens in the topology of the network.
All the best to you and for the contest (with a strong upvote from my side).
Torsten
Author Natesh Ganesh replied on Apr. 4, 2017 @ 15:23 GMT
Dear Torsten,
Thank you for the encouraging comments and the rating. I agree that Markov finite automata could be a very powerful and useful tool in studying the brain. Yes, I had read about the criticality hypothesis in neuroscience a few years ago and was not very convinced of it back then. It was surprising to me that the minimal dissipation hypothesis could be framed so as to predict this idea, and the results look promising.
I will definitely have a close look at your work and reply on your page in detail later today.
"But at this point the underlying dynamics has a fractal state space."
--> Agreed. I am beginning to study attractors, chaotic dynamics, etc. in more detail now. My knowledge at this moment is very limited, but the idea I am most interested in within that area is strange non-chaotic attractors.
"I considered a model for the brain network with a phase transition to a tree (so having a goal). Here the transition happens at the topology of the network."
--> This sounds very interesting and promising. I will reach out to you on your page.
Thanks and good luck on the contest.
Cheers
Natesh
Dizhechko Boris Semyonovich wrote on Apr. 5, 2017 @ 10:03 GMT
Meet the New Cartesian Physics, based on the identity of space and matter. You need it, because it shows that the formula of mass-energy equivalence comes from the pressure of the Universe, the flow of force on a corpuscle being equal to the product of Planck's constant and the speed of light.
New Cartesian Physics has enormous potential for understanding the world. To show this potential I ventured to give "materialistic explanations of the paranormal and supernatural", which is the title of my essay.
Visit my essay; you will find the New Cartesian Physics there. After you post in my topic, I will do the same in yours.
sincerely,
Dizhechko Boris
Author Natesh Ganesh replied on Apr. 7, 2017 @ 15:01 GMT
Hi Boris,
I do not have the knowledge or expertise to understand or critically judge your essay, so, to be fair, I am not going to comment on it or rate it. Thanks and good luck.
Natesh
Gary D. Simpson wrote on Apr. 6, 2017 @ 23:58 GMT
Natesh,
I'm glad to see another engineer in the contest. I won't pretend to understand fully what you have presented; my comprehension of your work is at most 50%. Having said that, it is clear to me that you have presented a breakthrough concept that connects the physical to the mental. You have quantified the ability to learn, and done so in a manner that is not human-centric or even life-centric.
I can offer one observation from Chemical Engineering that might be useful to you. We have an area of study called Process Control. This is dedicated to controlling flow rates and temperatures and pressures and all such variables associated with chemical operations. In this field, we use concepts such as critically-damped, under-damped, and over-damped. These concepts are very similar to what you present near the end of your essay. There are also a host of methods available for what type of control to select and how to tune the control loops. If your university has a Department of Chemical Engineering, it might be worth spending a few hours discussing these items with some of the ChE faculty.
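To make the analogy concrete, here is a small Python sketch of the three damping regimes via the step response of a standard second-order system. The helper step_response, the choice omega = 1, and the particular zeta values are illustrative assumptions on my part.

    # Step response of x'' + 2*zeta*w*x' + w^2*x = w^2*u with u = 1, by simple
    # semi-implicit Euler integration; zeta < 1, = 1, > 1 give the three regimes.
    def step_response(zeta, w=1.0, t_end=40.0, dt=1e-3):
        """Integrate the ODE and return the final (settled) output value."""
        x, v = 0.0, 0.0
        for _ in range(int(t_end / dt)):
            a = w * w * (1.0 - x) - 2.0 * zeta * w * v  # acceleration
            v += a * dt
            x += v * dt
        return x  # approaches 1.0 once settled

    for zeta, label in [(0.2, "under-damped: overshoots and oscillates"),
                        (1.0, "critically damped: fastest with no overshoot"),
                        (3.0, "over-damped: slow approach, no overshoot")]:
        print(f"zeta = {zeta}: settles near {step_response(zeta):.3f} ({label})")

Tuning a control loop amounts to choosing gains that place the closed-loop response in the regime you want, which is why the selection methods I mentioned exist.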
Best Regards and Good Luck,
Gary Simpson
Author Natesh Ganesh replied on Apr. 7, 2017 @ 14:51 GMT
Dear Gary,
Thank you for your kind comments. Yes, I have not found many other engineering submissions here either, so it is good to meet another engineer. I do think the ideas presented here are potentially very useful, and perhaps I need to do a better job of explaining them so that everyone understands the work completely.
"You have quantified the ability to learn and done so in a manner that is not human-centric or even life-centric."
--> Could not have phrased it better myself. That was the goal (pardon the pun, couldn't help myself).
And your comments on over-, critical, and under-damping are very interesting. I would be especially interested in the control mechanisms, and I will follow your advice and reach out to the Chemical Engineering department to better acquaint myself with their ideas. The level at which I have presented things here is very high, and I will need the detailed mechanisms at the chemical level to make further progress -- which, as a computer engineer, means figuring out how to build one of these systems :)
Cheers
Natesh