Really? I don’t get it at all. When I read cognitive neuroscience I’m looking for a coalescing between (1) Some story about the brain doing biologically-useful computations, in a way that matches (2) The things we know neurons and evolution are capable of, and (3) The things we know about what the human brain does (e.g. from introspection and scientific studies). I’m not seeing this in the CSHW material—particularly the connection to biologically-useful computations. The power distribution among harmonics carries very little information—dozens of bits, not the billions or trillions of bits that are needed for human-level understanding of the world. So what’s the relation between CSHW and biologically-useful computations? Again, I don’t get it. (Note: This is an initial impression based on cursory reading.)
This makes sense to me because I work full-time on the bleeding edge of applied AI, meditate, and have a degree in physics, where I taught the acoustical and energy-based models this theory is built upon. Without a solid foundation in all three of these fields, this theory might seem less self-evident.
Hopefully this explanation can help you understand the theory behind the theory. First I’ll address points (1), (2) and (3). Then I’ll explain the bandwidth issue in more detail.
(1) While it’s true that these harmonic frequencies have less information bandwidth than synapses, that doesn’t mean they don’t perform biologically-useful computations. High-bandwidth pattern matching is trivially easy to do with neural networks. The hardest part about neural networks is time-series data. (I know this because I am a specialist in the application of machine learning algorithms to time-series inputs.) To simplify the current situation: we [1] don’t know how to get neural networks to handle time-series data well, and [2] don’t know how to get different machine learning systems of any kind to work together, especially with regard to time-series data. If CSHW can make any progress in this direction then that would be useful.
(1.1) You are correct that we need a traditional mass of neurons tuned via gradient descent in order to handle high-bandwidth information like our many nerves and to handle complex actions like muscle control. CSHW does not get in the way of these things. Rather, CSHW is a simple, elegant way to coordinate many different sub-networks into a human brain. It’s not about “how” you throw a baseball; it’s about “when” you throw a baseball. When different networks are out of phase with each other, the inputs of one turn into static for the other, which is literally equivalent to tuning out a radio. In short, the purpose of CSHW is not to replace the massive information processing solved by neural networks. Instead, its purpose is to combine and separate neural networks, as applicable, in response to time-series inputs. It does this fractally, which is the only way to simplify a design to handle massive complexity in a biological system.
(1.2) All CSHW needs to do is tell which networks should receive information from which other networks. High-frequency waves both propagate shorter distances and oscillate faster (have higher bandwidth) than low-frequency waves, so the information density and response speed get higher where they need to be higher (on smaller scales). Remember: the oscillations don’t have to transfer information. That’s performed by the traditionally-understood neuronal connections. The oscillations can bring different systems in and out of sync in a coordinated manner. This happens at a lower frequency than individual neuron firings and involves larger masses than individual neurons, so the necessary bandwidth is much lower. Frequency space might be just a dozen bits long, but there are three spatial dimensions based on actual physical space too.
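To make the “tuning out a radio” picture concrete, here is a minimal toy sketch (my own illustration, not anything from the CSHW papers; all numbers and function names are hypothetical). A sender fires brief bursts near the peak of its cycle, and a receiver only accepts input during half of its own cycle. Shifting the relative phase smoothly takes the effective coupling from full to zero without changing either network’s internal wiring:

```python
import numpy as np

# Toy model of phase-gated communication between two oscillating sub-networks.
# The sender fires brief bursts near its phase peak; the receiver is only
# excitable during half of its cycle. The fraction of bursts that land inside
# the receiver's window plays the role of "effective coupling".

def delivered_fraction(phase_offset, freq=40.0, duration=1.0, dt=1e-4):
    t = np.arange(0, duration, dt)
    sender_phase = 2 * np.pi * freq * t
    receiver_phase = sender_phase + phase_offset
    bursts = np.sin(sender_phase) > 0.95   # sender fires near its phase peak
    window = np.sin(receiver_phase) > 0.0  # receiver listens half of each cycle
    return (bursts & window).sum() / max(bursts.sum(), 1)

print(delivered_fraction(0.0))       # in phase: ~1.0, everything gets through
print(delivered_fraction(np.pi))     # anti-phase: ~0.0, the input becomes static
print(delivered_fraction(np.pi / 2)) # intermediate offsets give partial coupling
```

Note that the gating itself carries almost no information; it only decides which high-bandwidth channels are open, which is exactly the division of labor described above.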
(1.3) The low bandwidth of the harmonic frequencies may explain an important puzzle about consciousness. You know how you can only keep 3-9 concepts in working memory at once? This could be a reflection of the low bandwidth of the low-frequency waves.
(2) We have known for ages that neurons and evolution are capable of producing waves like this (especially the low-frequency ones). The question neuroscience has been struggling with isn’t “can” neurons produce waves like this. It’s “why”.
(2.1) This theory describes observed behavior especially well once you compare its predictions to the brainwaves observed in advanced meditators. The brain scans of Tibetan yogis and the traditional subjective descriptions written by Zen masters both match the low-frequency brain resonance predicted by this theory. So does a modern Vipassana manual, though it focuses on the high-frequency end of the spectrum. That’s 3 out of 3 major Buddhist lineages.
(3) As Michael Edward Johnson (OP) mentioned in another comment, recent advances in fMRI have let us observe some of the phenomena described in CSHW.
Thank you for this explanation.

While reading the OP and trying to match the ideas with my previous models/introspection, I was somewhat confused: on the one hand, the ideas seemed to usefully describe processes that feel familiar, using a gears-level model; on the other hand, I was unable to fit them with my previous models. (I finally settled on something along the lines of ‘this seems like an intriguing model of top/high-level coordination (=~conscious processes?) in the mind/brain, although it does not seem to address the structure that minds have?’)
> [...] the purpose of CSHW is not to replace the massive information processing solved by neural networks.
Your comment really helped me put this into perspective.
Are your previous models single-agent or multi-agent? These ideas match multi-agent models of the mind. If you start by assuming the mind to be a single agent, then CSHW will not fit in with your previous models of the mind’s structure.
Now, reading the post for the second time, I again find it fascinating, and I think I can pinpoint my confusion more clearly now:
One aspect that sparks confusion when matched against my (mostly introspection- and lesswrong-reading-generated) model is the directedness of annealing: on the one hand, I do not see how the mechanism of free energy creates such a strong directedness as the OP describes with ‘aesthetics’; on the other hand, if in my mind I replace the term “high-energy-state” with “currently-active-goal-function(s)”, this becomes a shockingly strong model describing my introspective experiences (matching large parts of what I would usually think of roughly as ‘System 1 thinking’). Also, the aspects of ‘dissonance’ and ‘consonance’ directly being unpleasant and pleasant feel more natural to me if I treat them as (possibly contradicting) goal functions that also synchronize the perception, memorization, modelling, and execution parts of the mind. A highly consonant goal function will allow for vibrant and detailed states of mind.
Is there some mechanism that would allow evolution to somewhat define the ‘landscape’ of harmonics? Is reframing the harmonics as goals compatible with the model? Something like this seems to be pointed at in the quote:
> Panksepp’s seven core drives (play, panic/grief, fear, rage, seeking, lust, care) might be a decent first-pass approximation for the attractors in this system.
---
Another aspect where my current model differs is that I do not identify consciousness (at least the part that creates the feeling of pleasure/suffering and the explicit feeling of ‘self’) with this goal-setting mechanism. In my model, the part of the mind that generates the feeling of pleasure or suffering is more of a local system (plus complications*) that takes the global state as model and goal input and tries to derive strategies from it. In my model, this part of the mind is what usually identifies as ‘self’, and it is this part that is most relevant for depression or schizophrenia. But since what I describe as ‘model and goal input’ really defines the world and goals that the ‘self’ sees and pursues at each moment (sudden changes can be very disconcerting experiences), the implications of annealing for mental health would stay similar.
---
After writing all of this I can finally address the question of the parent comment:
> Are your previous models single-agent or multi-agent?
I very much like the multi-agent model sequence, although I am not sure how well my “Another aspect [...]” description matches: on the one hand, my model does have a privileged ‘self’-system that is much less fragmented than the goal-function landscape. On the other hand, the goal-function landscape seems best described by “shards of desire” (a formulation used in the sequences, if I remember correctly), and these shards can direct and override the self easily. This part fits well with the multi-agent model.
---
*) A complication is that the ‘self’ can also endorse/reject goals and redirect ‘active goal-energy’ onto the goal-setting parts themselves in order to shape them. (It feels like a kind of delegable voting power that the self, as strategy expert, can use if it has gained the trust, and thus the voting power, of the goal-setting parts.)
This will be a terribly late and very incomplete reply, but regarding your question:
> Is there some mechanism that would allow evolution to somewhat define the ‘landscape’ of harmonics? Is reframing the harmonics as goals compatible with the model? Something like this seems to be pointed at in the quote:
>> Panksepp’s seven core drives (play, panic/grief, fear, rage, seeking, lust, care) might be a decent first-pass approximation for the attractors in this system.
A metaphor that I like to use here is to see any given brain as a terribly complicated lock. Various stimuli can be thought of as keys. The right key will create harmony in the brain’s harmonics. E.g., if you’re hungry, a nice high-calorie food will create a blast of consonance which ripples through many different brain systems, updating your tacit drive away from food-seeking. If you aren’t hungry, it won’t create this blast of consonance. It’s the wrong key to unlock harmony in your brain.
Under this model, the shape of the connectome is the thing that evolution has built to define the landscape of harmonics and drive adaptive behavior. The success condition is harmony. I.e., the lock is very complex, the ‘key’ that fits a given lock can be either simple or complex, and the success condition (harmony in the brain) is relatively simple.
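A deliberately crude sketch of this lock-and-key picture (my own toy illustration, under assumptions the comment doesn’t spell out: drives as scalar activations, consonance as how well a stimulus fits a currently active drive, and consonance relaxing the drive that produced it):

```python
# Hypothetical toy dynamics for the lock-and-key metaphor. A "key" (stimulus)
# produces consonance only in proportion to the drive it addresses, and a
# blast of consonance updates that drive downward, so the same key stops
# fitting once the drive is sated.

drives = {"hunger": 0.9, "curiosity": 0.2}  # made-up drive activations in [0, 1]

def present(stimulus, drives, relief=0.7):
    consonance = drives.get(stimulus, 0.0)            # wrong key -> no harmony
    if stimulus in drives:
        drives[stimulus] *= 1 - relief * consonance   # consonance relaxes the drive
    return consonance

print(present("hunger", drives))  # hungry + food: big blast of consonance (0.9)
print(present("hunger", drives))  # mostly sated: the same key barely fits (~0.33)
print(present("music", drives))   # a key with no matching lock: no consonance (0.0)
```

In this sketch, the shape of the `drives` table and the update rule stand in for the connectome: evolution builds the lock, and harmony is the relatively simple success condition.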
The histogram of CSHW amplitudes seems to have very little information content, while the entire matrix of just-noticeable-differences of our experience seems to have a whole lot of information. If CSHWs are so important to determine a “brain state”, where is all the missing information?
Two points here. First, according to the theory, as Mike points out, the overall “mood” of the state is largely encoded in the low-frequency harmonics, while the higher-frequency ones are more important for semantic information. In a sense, you can think of the lower-frequency harmonics as creating a set of buckets in which to put, juggle, and recombine the information provided by the higher-frequency harmonics. Hence, while the specific information content of the experience might require a very fine level of resolution, the valence and the broad information-processing steps might not. And second, there is more to the CSHWs than just the histogram of amplitudes. There is also a matrix of phase-locking relations between them, which increases the overall information content by a large amount.
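A back-of-envelope illustration of that second point (all numbers are my own placeholders, not from the theory): with N harmonics, the amplitude histogram scales linearly in N, while the pairwise phase-locking matrix scales quadratically, so it quickly dominates the information budget.

```python
import math

# Rough information budget per "snapshot" of brain state, using made-up
# resolution figures purely to show the scaling.

N = 20       # hypothetical number of tracked harmonics
levels = 8   # hypothetical resolution per measurement (3 bits)

amplitude_bits = N * math.log2(levels)        # 20 * 3  = 60 bits
phase_pairs = N * (N - 1) // 2                # 190 pairwise relations
phase_bits = phase_pairs * math.log2(levels)  # 190 * 3 = 570 bits

print(amplitude_bits, phase_bits)
# And this is per snapshot: the dynamics over time multiply it further.
```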
CSHWs offer pretty compelling Schelling points for the brain to self-organize around, and there’s a *lot* of information in a dynamic power distribution among harmonic modes. We might distinguish CSHW-the-framework from CSHW-the-fMRI-method.