jacob_cannell comments on Is RL involved in sensory processing?

jacob_cannell 30 Oct 2022 6:06 UTC
4 points
The rules are different in episodic RL versus online / within-lifetime RL, right?

I may not understand what you mean by ‘episodic’, because that usually seems to imply a setup where you have a single reward at the end of an episode, but that isn’t how most game RL works: rewards come whenever you score points, which depends on the game.

If the Atari agent has latent activations that correspond to delusional wishful-thinking beliefs about its current situation, the Atari agent is going to fall in the snake pit and die…

But yes, I agree that the Atari agent in particular is technically doing both inter- and intra-lifetime learning with SGD, unlike humans/animals. Extrinsic-reward RL is inefficient for several reasons: the reward sparsity issue, but also the low-dimensional reward signal and the feedback indirection problem you mentioned earlier.

I think fully describing the objective of a human would probably take like thousands of lines of pseudocode

That sounds reasonable but is extremely low complexity compared to evolved modularity.

Back to aesthetics-of-habitat. If an animal is in the wrong microenvironment—e.g. if an animal with camouflage is not standing in front of the correct background—it can easily be fatal, including without any warning.

Sure, but that clearly can’t be an explanation for general human information value (aesthetics), which should also explain:
- the pleasure of music
- the pleasure of art, etc etc
- the pleasure of beautiful scenery etc etc
- exactly which types of audio stimuli infants will attend to
- why we want to know the answers to trivia questions that we almost know more than the answers to ones that are completely unfamiliar or completely familiar
- why we’ll risk electric shocks for magic tricks and trivial knowledge
Sensory information has intrinsic value in terms of what we can learn from it: the intrinsic Bayesian value of information. Curiosity and aesthetics are both manifestations of this same unifying principle, which explains all the various manifestations listed above and more. There is extensive DL literature on using info-value intrinsic motivation: we know it works and exactly how and why. There is also now a growing neuroscience literature establishing that the brain uses variations of the same general Bayes-optimal mechanism to estimate value of information. This seems to involve, perhaps, the anterior cingulate cortex, anterior insula, and striatal reward circuits^[1], and the standard dopaminergic pathways more generally^[2] ^[3].
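One common way the intrinsic-motivation literature formalizes "value of information" is Bayesian surprise: the KL divergence between the posterior and the prior after an observation. The following is a minimal sketch under that assumption, over a discrete hypothesis set; the function names are illustrative and not taken from any of the cited papers.

```python
import math

def posterior(prior, likelihoods):
    """Bayes' rule over a discrete set of hypotheses."""
    unnorm = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def bayesian_surprise(prior, likelihoods):
    """KL(posterior || prior) in bits: how far the observation moved our beliefs."""
    post = posterior(prior, likelihoods)
    return sum(q * math.log2(q / p) for q, p in zip(post, prior) if q > 0)

prior = [0.5, 0.5]
# An observation that both hypotheses predict equally well teaches nothing.
uninformative = bayesian_surprise(prior, [0.3, 0.3])
# An observation that one hypothesis predicts far better shifts beliefs a lot.
informative = bayesian_surprise(prior, [0.9, 0.1])
```

Under this measure an observation is valuable only relative to what the model already believes, which is the property the argument above relies on.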

So there really is little remaining room/need for a specific ‘landscape aesthetic’, unless it’s just the region specific version of the same theme.

Animals (and humans to some degree) do have known innate triggered terrain aversions (fear of heights, or fear of open spaces, etc) and thus terrain affections are possible, but I’m not aware of specific evidence for them in humans.
What links here?
- My take on Jacob Cannell’s take on AGI safety by Steven Byrnes (28 Nov 2022 14:01 UTC; 71 points)
- Steven Byrnes 30 Oct 2022 19:46 UTC
  4 points
  Parent
  That sounds reasonable but is extremely low complexity compared to evolved modularity.
  Yes! You don’t have to convince me to be opposed to evolved modularity in the cortex! (i.e. the school of thought that the genome specifies the logic behind intuitive botany, and separately the logic behind intuitive physics, and separately the logic behind grammar, etc., as advocated e.g. by Steven Pinker & Gary Marcus). I myself argue against that school of thought all the time!!
  that clearly can’t be an explanation for general human information value (aesthetics)
  It doesn’t explain everything. But it does explain some important things.
  For example, most people prefer to see a river view out their windows than seeing a brick wall 5 meters away out their windows. Is this a value-of-information thing? No. If both scenes have been the same for years (e.g. no animals or boats on the river), neither view carries any useful information, but people will still tend to prefer the river view. If the brick wall has more information, maybe because there are bugs crawling on it, or a window into your neighbor’s apartment with the blinds always down (so they can’t see you but you can see silhouettes etc.), people still tend to prefer the river view!
  (I myself don’t mind the brick-wall-view, but I still slightly prefer the river view and am clearly in the minority on this anyway—just look at apartment rental prices etc.)
  I’m not sure how you would explain that fact. In my opinion, it’s a direct consequence of superior colliculus heuristic calculations that are ultimately derived from evolutionary pressure to be in the right micro-habitat.
  I think the superior colliculus calculates various heuristic / statistical properties of the visual inputs, including ones related to micro-habitat selection, probably ones related to mate selection, almost definitely heuristics for recognizing snakes and spiders (I think by the way they move [slither / scuttle]), almost definitely heuristics for whether you’re looking at a human face (and perhaps eye contact in particular), and maybe some other things I’m not thinking of.
  I think these various heuristics are constantly outputting scores on a scale of 0 to 100%, and those scores serve as inputs to a reward function.
  But that same reward function also includes lots of other things!! For example, being in pain is bad. If every time I look at orange things I get electrocuted, throughout my whole life, I’m going to wind up with an “aesthetic” distaste for the color orange. But the superior colliculus heuristics have nothing to do with that preference! Instead of superior colliculus → brainstem → negative valence, the main pathway would instead probably be visual cortex → amygdala → brainstem → negative valence, IMO.
  Some kind of “curiosity drive” is part of the reward function too, I firmly believe.
  exactly which types of audio stimuli infants will attend to
  Funny that you brought that up. I was going to bring it up! I think there’s a lot of information in bird songs, and there’s a lot of information in human speech sounds, and I think it’s not inherently obvious in advance which of those is going to be more useful to a human child, and yet I’d bet anything that human infants are more attentive to human speech sounds than bird songs (given equal exposure), and that the brainstem auditory processing (inferior colliculus) is responsible for that difference. (I haven’t specifically done a deep-dive lit review. If it turns out to be crux-y, maybe I will.)
  I think I’m generally much more skeptical than you that an infant animal can figure out value of information sufficiently quickly and reliably for this to be a complete explanation of what they attend to. From our worldly adult perspective, it’s obvious that human speech is carrying much more useful information for a human infant than bird songs are, and that for a baby bird it’s the other way around. But before the baby human/bird has made headway in learning / parsing those auditory patterns, or understanding the world more generally, I don’t think it yet has any way to figure out which of those is better to invest in.
  By the same token, I think I’m much more skeptical than you that an infant animal can figure out what is “empowering” and what isn’t, in a way that completely explains all the complex species-specific infant behavior. From our perspective, it’s obvious that a camouflaged animal is “empowered” by standing in front of the appropriate background when there’s a predator nearby, because it’s less likely to get eaten, and getting eaten is very disempowering. But how is the animal supposed to figure that out, except through probably-fatal trial-and-error?
  Animals (and humans to some degree) do have known innate triggered terrain aversions (fear of heights, or fear of open spaces, etc) and thus terrain affections are possible, but I’m not aware of specific evidence for them in humans.
  So you think that fear-of-heights is innate in non-human animals, but not innate in humans? That strikes me as weird. Looking over a precipice invokes a weirdly specific kind of tingling (in my experience) that doesn’t happen in other equally-frightening contexts. Fear of heights is widespread in the human population way out of proportion to its danger, in comparison with other things like fear-of-knives and fear-of-driving. Fear-of-heights seems obviously evolutionarily useful for humans. If it could evolve in some animals, why not humans?
  Hmm, I feel like we’re at least partly having a stupid argument where you’re trying to convince me that the reward function is less than 100% superior colliculus hardcoded heuristics, and meanwhile I’m trying to convince you that the reward function is more than 0% superior colliculus hardcoded heuristics. Maybe we can just agree that it’s more than 0% and less than 100%! :)
  general human information value (aesthetics)
  I think it’s important that we reserve the word “aesthetics” for something like “finding some things more pleasant to look at than other things”. Then you can propose a theory that a complete explanation for aesthetics is information value. And we can discuss that theory.
  Such a discussion is hard if we define “aesthetics” to be information value by definition.
  - jacob_cannell 30 Oct 2022 21:39 UTC
    4 points
    Parent
    
    I think it’s important that we reserve the word “aesthetics” for something like “finding some things more pleasant to look at than other things”.
    
    Sure—let’s call it information preferences.
    
    Let’s start—as we always should—from core first principles: what is the optimal information preference—ie that which would maximize total future discounted inclusive fitness?
    
    Some information has obvious instrumental planning value—like the natural language instructions for how to start a fire for example. Most information has less obvious immediate instrumental value compared to that extreme, but all information has some potential baseline future utility in how it improves the predictive capacity of the world model.
    
    As you discussed earlier the brain has a ‘world model’ loosely consisting of much of sensory cortex, and there is overwhelming evidence that said world model components are trained via UL/SSL style prediction. There’s also strong foundational reasons why this is in fact bayes optimal (solomonoff induction, bayesian inference, etc) - so it’s hardly surprising we see that in the brain.
    
    But that isn’t nearly enough.
    
    The brain’s model is trained only from a highly localized egocentric observation stream, and the brain thus must actively decide its own training dataset. Most information is near useless; the high utility information is extremely sparse and rare. So it’s crucially important to estimate the value of information, the core of which is compression progress, or improvement in the world model’s predictive capacity. It’s also not all that difficult to estimate: you can estimate it from learning updates (when using optimal variance-adjusted learning rates). A large update then indicates compression progress (because there was a prediction error followed by an update to reduce prediction error, and the system computing these updates is smart enough to compute the minimal correctly sized updates), and is always highly specific to the knowledge already encoded in the model.
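To make "estimate it from learning updates" concrete, here is a toy sketch. It assumes a one-parameter predictor (a running estimate `mu`) and uses the drop in squared error on the same observation, before versus after the update, as a crude proxy for compression progress. The name `compression_progress` and the fixed learning rate are illustrative assumptions, not a claim about what the brain computes.

```python
def compression_progress(mu, x, lr):
    """One online update of a scalar predictor.

    Returns the updated estimate and the reduction in squared prediction
    error on this observation, used here as a proxy for info value.
    """
    err_before = (x - mu) ** 2
    new_mu = mu + lr * (x - mu)       # move the prediction toward the observation
    err_after = (x - new_mu) ** 2
    return new_mu, err_before - err_after

mu = 0.0
progress = []
# Twenty repeats of a learnable constant, then one novel observation.
for x in [1.0] * 20 + [5.0]:
    mu, p = compression_progress(mu, x, lr=0.3)
    progress.append(p)
```

In this sketch the signal is large at first, decays toward zero as the stream becomes predictable, and spikes again on the novel observation. With a fixed learning rate, pure noise would also produce "progress", which is why the variance-adjusted learning rates mentioned above matter.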
    
    This is also ‘curiosity’ but it is far more general and powerful than the typical meaning of that term, and it is fully general enough and powerful enough to explain most all of our information preferences—I’ve already given many examples.
    
    For example, most people prefer to see a river view out their windows than seeing a brick wall 5 meters away out their windows. Is this a value-of-information thing?
    
    Yes, the view of the river is vastly higher value-of-information—it’s constantly changing with time of day lighting, weather, etc. However it’s also an empowerment thing, as the river view suggests you can actually go out and explore the landscape. So it’s confusing multiple intrinsic motivation signals.
    
    A better comparison is a simple static painting of a river landscape vs static painting of a brick wall. The river landscape view still has higher info value—it has higher total compressible entropy for typical humans who have experience with vaguely similar visual patterns.
    
    The brick wall with insects could be higher info value—and indeed some children find a wall with moving insects highly fascinating (as I did) - it’s called an ant farm and it was definitely more interesting than landscape paintings (or most any paintings really).
    
    I think I’m generally much more skeptical than you that an infant animal can figure out value of information sufficiently quickly and reliably for this to be a complete explanation of what they attend to.
    
    They can and I already linked some articles demonstrating exactly this and some of the theory of how it works—although obviously it’s not the only factor for attention, but it is primary and by far the single most important factor.
    
    Optimal Bayesian learning rates are simple consequences of uncertainty/variance: namely, the tradeoff between the uncertainty/precision of the current network weights and that of some new update. Variance-adjusted learning rates are also critical for modern DL methods, which mostly estimate them using slow rolling gradient statistics (i.e. Adam etc.). That isn’t as plausible for the brain, but the brain does have a system that seems to encode various forms of global uncertainty/variance across different timescales, factored into meta-level predictable uncertainty and unpredicted uncertainty. That factoring is exactly what you want, since you need to ignore predicted uncertainty, a.k.a. unlearnable noise. This signal then seems to be distributed by the serotonergic raphe nuclei globally to essentially all of the learning regions, and is a core learning-rate-type input.
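The simplest textbook instance of a variance-derived learning rate is the conjugate Gaussian update (equivalently, a scalar Kalman filter), where the gain is the ratio of current uncertainty to total uncertainty. The sketch below assumes a single Gaussian belief and a known observation variance; it illustrates the principle only and is not a model of Adam or of any neural circuit.

```python
def gaussian_update(mean, var, x, obs_var):
    """Bayes-optimal update of a Gaussian belief given a noisy observation.

    The learning rate (gain) is var / (var + obs_var): high when the
    current belief is uncertain, low when it is already precise.
    """
    gain = var / (var + obs_var)
    new_mean = mean + gain * (x - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var, gain

mean, var = 0.0, 10.0   # start with a very uncertain belief
gains = []
for x in [2.0, 2.1, 1.9, 2.0, 2.05]:
    mean, var, g = gaussian_update(mean, var, x, obs_var=1.0)
    gains.append(g)
```

Here the gain falls with every observation as the belief sharpens, so later observations of the same kind produce smaller updates.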
    
    The local hedonic reward value-of-information signals are then (I’m guessing) not too difficult to compute from local downstream serotonergic statistics. These somehow ultimately funnel into the dopaminergic reward centers (which are more typically known for handling reward prediction errors, but really it’s more like prediction errors in general). And it has to be localized because each brain module is the ideal place to compute value-of-information relative to its current knowledge. This explains why serotonin is indirectly rewarding (and serotonin analogs such as LSD/psilocybin generally increase both plasticity, causing ‘neural drift’, and eureka hedonic info-value reward), and much else.
    
    From our worldly adult perspective, it’s obvious that human speech is carrying much more useful information for a human infant than bird songs are, and that for a baby bird it’s the other way around.
    
    Well once again the compression progress or bayesian surprise value of information fully explains exactly why human babies prefer human language and bird babies prefer bird songs—the value of information depends on predictability which depends on the current world model knowledge—it needs to balance novelty with familiarity in the right way.
    
    Adult mono-linguistic humans don’t find other languages super interesting, because the relevant brain regions have transitioned to a low variance/uncertainty and thus low learning rate state, which scales down the value-of-info reward proportionality. It’s all relative to what you currently know which depends on age and total experience history.
    
    A true feral child wouldn’t find human language all that more interesting than bird song—but probably still somewhat more interesting because human language just has more compressible complexity than bird song—aliens would find it more interesting.
    
    Hmm, I feel like we’re at least partly having a stupid argument where you’re trying to convince me that the reward function is less than 100% superior colliculus hardcoded heuristics,
    
    In terms of range of phenomena explained or bits of entropy it’s pretty obviously close to 99% Bayesian value of information, and pretty much needs to be. The superior colliculus hardcoded heuristics simply don’t have the capacity to cover the intrinsic value of language, math, science fiction stories, trippy fractal videos, etc. It’s essentially the same argument as for universal learning from scratch, and indeed it’s just a higher-order effect or consequence of learning from scratch.
    
    For the remaining 1% you still need some low complexity priors that steer or bias the value-of-information heuristic. For example the complexity of food taste, and our preference for just the right amount of novelty, is mostly explained by the same standard info-value, but it clearly also has a low complexity prior bias for +sweetness/carbs, +fats, +sodium, -bitter, etc. (Combined with some amount of additional supervised learning feedback from digestion to learn what’s toxic or not)
    
    But as far as the reward centers are concerned, there is no difference between food taste and other forms of information tastes—it’s all just info taste (with weak low complexity priors/bias) all the way down.
    - Steven Byrnes 31 Oct 2022 14:56 UTC
      4 points
      Parent
      Thanks for all that!
      Hmm, I think maybe we’re talking past each other a bunch here.
      Suppose that a young camouflaged prey animal “wants” to stand in front of the appropriate background (that matches its camouflage), even without trial-and-error experience of getting attacked or not attacked by predators in front of various different backgrounds, and suppose that the proximal cause of this behavior is that the brainstem superior colliculus is calculating heuristics on visual input and then contributing to the overall reward function.
      (I’m still not sure how much you’re inclined to believe that this is a thing that happens in real animals, but let’s assume it for the sake of argument. You did say that “terrain affections are possible”, at least.)
      I feel like you would describe this fact as a victory for empowerment theory, because getting eaten by a predator is disempowering. Therefore we can describe this superior colliculus heuristic calculation thing as “part of an approximation of empowerment”.
      Whereas I would describe that same fact as a failure of empowerment theory, because the thing I care about (for various reasons) is what the brain is actually doing, not why it evolved that way.
      Do you agree so far?
      If so, you can ask, why is that the thing I care about? I.e., if the heuristic evolved to advance empowerment, why don’t I think that matters?
      First, if we’re trying to help humans, and the human brainstem heuristics are coming apart from empowerment (as all approximations do, cf. Goodhart’s law), I think we should pay much more attention to the heuristics than the empowerment. For example, I happen to care about animal welfare, and that caring tendency presumably evolved because it was “empowering” on net in the societies of my ancestors. But maybe in the modern world, and especially if I had an omnipotent AGI assistant, maybe I would be increasing my own personal empowerment if I didn’t care about animal welfare. But I wouldn’t want that. And that’s OK. We shouldn’t be using a normative theory where I’m somehow wrong / confused to choose to care about animal welfare at the expense of my own empowerment.
      Second, we’re going to make an AGI, and presumably in that AGI we (like evolution) will use heuristic approximations of empowerment instead of empowerment itself. And those heuristics will come apart from empowerment. And then the AGI will actually follow the heuristics, not empowerment. At some point, it will be too late to turn off the AGI, and the heuristics will be in charge of the future of terrestrial life. So we should really care directly about what behaviors the heuristics are incentivizing, and we should only care indirectly about what behaviors empowerment would be incentivizing.
      A better comparison is a simple static painting of a river landscape vs static painting of a brick wall.
      Sure. But I claim that lots of people like paintings of river landscapes, even when the painting has been hanging for months or even years and thus offers zero novelty. When I moved from one house to another, my wife hung up many of the same paintings that were on prominent display in the old house, despite us having a pile of paintings that we’ve never hung up at all and thus which would have had dramatically more information value.
      I claim that this fact is a victory for superior colliculus micro-habitat-related heuristics.
      …But I still think that the strongest case is the theoretical one:
      By default, being in the wrong micro-habitat gives a negative reward which can be both very sparse and often irreversibly fatal (e.g. a higher chance of getting eaten by a predator, starving to death, freezing to death, burning to death, drowning, getting stuck in the mud, etc. etc., depending on the species)
      Therefore, it’s very difficult to learn which micro-habitat to occupy purely by trial-and-error without the help of any micro-habitat-specific reward-shaping.
      Such reward-shaping is straightforward to implement by doing heuristic calculations on sensory inputs.
      Animal brains (specifically brainstem & hypothalamus in the case of mammals) seem to be perfectly set up with the corresponding machinery to do this—visual heuristics within the superior colliculus, auditory heuristics within the inferior colliculus, taste heuristics within the medulla, smell heuristics within the hypothalamus, etc.
      So even without any behavioral evidence whatsoever, I would still be shocked if this wasn’t a thing in every animal from flies to humans.
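The reward-shaping idea in the steps above can be sketched as follows. This is an illustration under stated assumptions: the heuristic names, the weights, and the additive form are all hypothetical, chosen only to show how innate detector scores in the range 0 to 1 could supply dense reward while the true fitness-relevant reward stays sparse.

```python
def shaped_reward(extrinsic, heuristic_scores, weights):
    """Sparse extrinsic reward plus a weighted sum of innate heuristic scores.

    Each score is assumed to lie in [0, 1], mirroring the
    "0 to 100%" detector outputs described earlier in the thread.
    """
    for name, score in heuristic_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1]")
    shaping = sum(weights[name] * score for name, score in heuristic_scores.items())
    return extrinsic + shaping

# Hypothetical innate weights: matching background is good, snake-like motion is bad.
weights = {"habitat_match": 1.0, "snake_like_motion": -2.0}

# No predator has attacked in either case, so the extrinsic reward is zero.
good_spot = shaped_reward(0.0, {"habitat_match": 0.9, "snake_like_motion": 0.0}, weights)
bad_spot = shaped_reward(0.0, {"habitat_match": 0.1, "snake_like_motion": 0.8}, weights)
```

The point of the sketch is that the learner gets a usable gradient between the two locations without ever having to experience the fatal outcome.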
      - jacob_cannell 31 Oct 2022 17:55 UTC
        2 points
        Parent
        
        I feel like you would describe this fact as a victory for empowerment theory, because getting eaten by a predator is disempowering.
        
        I feel like empowerment is mostly a digression here. There is the behavioral empowerment hypothesis which just says something like “in the absence of specific goals, animals act as if they are pursuing empowerment”, which I think is largely true—if we use a broad-empowerment definition like optionality. But there is a difference between “acting as if seeking empowerment”, and specifically seeking empowerment due to empowerment reward as computed by some more direct approximation.
        
        So I agree with you that assuming the superior colliculus is directly computing an innate terrain camo affection, that is at best just “acting as if” behavioral empowerment, and doesn’t really explain much more than “acting as if maximizing genetic fitness”.
        
        In this thread I have mostly been discussing value-of-information, which seems to be one of the primary reward signals for the human brain. Assuming that theory—the serotonin based info-value indirect reward pathway stuff—is all true, then I think we both agree that modeling that directly is naturally important for accurately modelling human values. Info value is a sort of empowerment related signal, but it’s not optionality empowerment. I do think it’s likely the brain also has estimated optionality reward somewhere, but I haven’t researched that as much and not sure where it is. But I think we also would agree that if there is direct optionality reward then modelling that is also important for modelling human values.
        
        There is another use of empowerment which is as a general bound on an unknown utility function. If you break utility down into short and long term components, the long term component converges to empowerment (for many/most agentic utility functions). That really is independent of how the brain may or may not be using empowerment signals. The hope is AGI optimizing for our long term empowerment would seek to acquire power and then hand it over to us, which is pretty much what we want. The main risk is that the short term component matters, but for that we can use some learned approximation of human values. That would eventually diverge in the long term, but that’s ok because the long term component converges to empowerment. Another issue is identifying the agents/agency to empower, which—as you point out—needs to consider altruism. Empowerment of a single human isn’t even the correct bound if we define the agent as our physical brain, as due to altruism our utility function is diffusely wider than a pure selfish rational assumption. The easiest way to handle that is by empowering humanity or agency more broadly. The harder more correct way is using some more complex multi-agent theory of mind, so that empowering a single human is empowering all the simulacra sub-agents they care about (of which the self is just one for an altruist).
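For readers who want a concrete handle on "optionality" empowerment: in a deterministic environment, n-step empowerment reduces to the log of the number of distinct states reachable in n steps. The toy gridworld below is an assumption made purely for illustration (the grid, the move set, and the function names are not from this thread).

```python
import math
from itertools import product

# Five available actions on a square grid; moves into a wall leave the agent in place.
MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0), "stay": (0, 0)}

def step(pos, move, size):
    """Deterministic transition on a size x size grid."""
    x, y = pos
    dx, dy = MOVES[move]
    nx, ny = x + dx, y + dy
    if 0 <= nx < size and 0 <= ny < size:
        return (nx, ny)
    return pos

def empowerment(pos, n, size):
    """n-step empowerment in bits: log2 of the number of distinct reachable end states."""
    end_states = set()
    for seq in product(MOVES, repeat=n):
        p = pos
        for move in seq:
            p = step(p, move, size)
        end_states.add(p)
    return math.log2(len(end_states))

corner = empowerment((0, 0), 2, 5)
center = empowerment((2, 2), 2, 5)
```

An agent in the center of the grid has more reachable futures than one boxed into a corner, so it scores higher on this measure regardless of what its eventual goal turns out to be.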
        
        But I claim that lots of people like paintings of river landscapes, even when the painting has been hanging for months or even years and thus offers zero novelty. When I moved from one house to another, my wife hung up many of the same paintings that were on prominent display in the old house,
        
        Paintings have high novelty only on the first viewings, after that any novelty only comes from forgetting—but they are still more interesting than a blank wall. Regardless much of the reason people hang art is for the benefit of others.
        
        I claim that this fact is a victory for superior colliculus micro-habitat-related heuristics.
        
        More people like abstract art, surreal art, portraits, etc.; landscapes are but a small fraction. So is that a defeat for superior colliculus micro-habitat-related heuristics?
        
        My quick read of the human/primate superior colliculus indicates it mostly computes subconscious saccade/gaze targets, mostly from V1 feedback but also from multi-modal signals from numerous regions, and is then inhibited/overridden when the cortex directs conscious eye movements from the frontal eye fields.
        Steven Byrnes 1 Nov 2022 19:41 UTC
        2 points
        Parent
        Thanks for all that—very helpful! I’ll just respond to the part at the end.
        My quick read of the human/primate superior colliculus indicates it mostly computes subconscious saccade/gaze targets, mostly from V1 feedback but also from multi-modal signals from numerous regions, and is then inhibited/overridden when the cortex directs conscious eye movements from the frontal eye fields.
        I’m with you except for the “mostly”. Saccades (and “orienting reactions” more generally) are one of the things that the superior colliculus does, and it happens to be a well-studied one for various reasons (namely, it’s technically easy and medically relevant, I think). But I also think that the superior colliculus does other things too, and the literature on those things is underdeveloped (to put it politely). So randomly-selected SC papers “mostly” talk about saccades but we shouldn’t conclude that SC itself “mostly” exists for the purpose of saccades.
        For example, I think there’s abundant evidence that pretty much any animal with eyes has a set of a number (dozens?) of innate reactions to certain types of visual stimuli, and I think there’s strong reason to locate the visual-detection part of those reactions in the superior colliculus (a.k.a. optic tectum in non-mammals):
        Detecting “an unexpected visual stimulus that’s worthy of an orienting reaction” is a special case of this category, and SC definitely does that.
        There’s pretty good literature on face-detection in infant humans, and the author I like (Mark Johnson) suggests that it involves a low-resolution innate face detector (basically three dark blobs forming an inverted triangle) in the superior colliculus.
        My impression is that Mark Johnson was inspired by the filial-imprinting-in-chicks literature, where it’s known that the optic tectum is detecting salient visual things that might be worthy of imprinting on. (I haven’t really dived into this yet, could be wrong.)
        This paper argues that the mouse SC can detect expanding dark blobs in the upper field of view—which serve as a heuristic approximation that maybe there’s an incoming bird-of-prey—and trigger a scamper-away brainstem reaction. Actually this paper is even better on that topic.
        I think there’s good behavioral evidence (see e.g. refs here) that monkeys are scared^[1] of slithering snakes for direct innate reasons, independent of any within-lifetime evidence that slithering snakes are unusually important. If so, we need to explain that, presumably via some innate heuristics somewhere in the brain for what a slithering snake looks like. Which part of the brain? As usual, papers on this topic are a mess, but all the evidence I’ve seen is consistent with SC, and no other possibility makes any sense to me.
        (More generally, the process-of-elimination argument carries a lot of weight for me. The idea that visual cortex could detect snakes / birds / whatever, all on its own, without any other source of ground truth, would fly in the face of everything else I think I know about the cortex. And if it’s not the cortex, the SC is the only other possibility.)
        This paper has some issues but it does present I think decent evidence that the structure and activity of SC is compatible with the types of calculations-on-visual-input that I’m talking about.
        More people like abstract art, surreal art, portraits, etc.; landscapes are but a small fraction. So is that a defeat for superior colliculus micro-habitat-related heuristics?
        Again, eating ice cream is not actually health-promoting. But it effectively triggers various taste-related heuristics that originally evolved to make us have health-promoting eating behavior.
        By the same token, houses full of “pretty” paintings are not in fact better places for hunter-gatherers to live. But they (probably) do a slightly better job of triggering various vision-related heuristics that originally evolved to reward-shape early humans to find good and safe places to hang out in the African Savannah.
        This can be true even if the “pretty” paintings are abstract and have no superficial resemblance to the African savannah. Ice cream likewise has very little superficial resemblance to the food that our African ancestors ate. And “Three dark blobs forming an inverted triangle” has very little superficial resemblance to a human face, yet that’s supposedly what the human SC’s innate face detector is actually looking for.
        I think that there are many contributions to the decision of what to look at (including info-value) and what paintings to hang in your house (including impressing your friends). I only claim that SC habitat-derived heuristics are ONE of the contributions.
        Likewise, I wasn’t sure before, but my current impression is that you do NOT claim that info-value is a grand unified theory that completely 100% explains what we are looking at at any given moment.
        If so, it might be difficult to make progress by chatting about what paintings people hang or whatever—neither of us has a theory that makes sharp predictions about everything. ¯\_(ツ)_/¯
        ^
        Technically, it seems more like monkeys are innately “aroused” (in the psych jargon sense, not the sexual sense) by the sight of slithering snakes, but this arousal has a strong tendency to transmute into fear during within-lifetime learning. See here.
        jacob_cannell 1 Nov 2022 21:14 UTC
        4 points
        The evidence seems pretty clear that the SC controls unconscious saccades/gazes. Given that background it makes perfect sense the SC is also a good location for simple crude innate detectors which bias saccades towards important targets: especially for infants, because human infants are barely functionally conscious at birth and so in the beginning the SC may have complete control. But gradually the higher ‘conscious loops’ involving BG, PFC and various other modules begin to take more control through the FEF (although not always of course).
        
        That all seems compatible with your evidence—I also remember reading that there are central pattern generators which actually start training the visual cortex in the womb on simple face like patterns. But I believe in a general rule of three for the brain: whenever you find evidence that the brain is doing something two different ways that both seem functionally correct, it’s probably using both of those methods and some third one you haven’t thought of yet. And SC having innate circuits to bias saccades seems likely.
        
        But that same evidence doesn’t show the SC is much involved in higher level conscious decisions about whether to stare at a painting or listen to a song for a few minutes vs eating ice cream. That is all reward-shaped high level planning involving the various known dopaminergic decision pathways combined with serotonergic (and other) pathways that feed into general info-value.
        
        Likewise, I wasn’t sure before, but my current impression is that you do NOT claim that info-value is a grand unified theory that completely explains what we are looking at at any given moment.
        
        I do claim it is the likely grand unified theory that most completely explains conscious decisions about info consumption choices in adults—and the evidence from which I sampled earlier is fairly extensive IMHO; whereas innate SC circuits explain infant gazes (in humans; the SC probably has a larger role in older, smaller-brained vertebrates). Moreover, if we generalize from SC to other similar subcortical structures, I do agree those together mostly control the infant for the first few years—as the higher-level loops which conscious thinking depends on all require significant training.
        
        Also—as I mentioned earlier I agree that fear of heights is probably innate, simple food taste biases are innate, obviously sexual attraction has some innate bootstrapping, etc so I’m open to the idea there is some landscape biasing in theory, but clearly the SC is unlikely to be involved in food taste shaping, and I don’t think you have shown much convincing evidence it is involved in visual taste shaping. But clearly there must be some innate visual shaping for at least sexual attraction—so evidence that SC drives that would also be good evidence it drives some landscape biasing, for example. But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1. So evidence that the SC’s inputs shift from V1 in infants up to higher visual cortex in adulthood would also be convincing, as that seems somewhat necessary for it to be involved in reward prediction of higher level learned visual patterns.
        
        I’m also curious about this part:
        
        And if it’s not the cortex, the SC is the only other possibility.
        
        And generally if we replace the reward shaping/bias of “savanna landscape” with “sexually attractive humanoid” then I’m more on board with the concept of something that is highly likely an innate circuit somewhere. (I don’t even buy the evolutionary argument for a savanna landscape bias—humans spread out to many ecological niches including coastal zones which are nothing like the savanna)
        Steven Byrnes 2 Nov 2022 18:20 UTC
        2 points
        Here are some of my complaints about info-value as a grand unified theory by itself (i.e., in the absence of innate biases towards certain types of information over other types):
        There are endless fractal-like depths of complexity in rocks, and there are endless fractal-like depths of complexity in ants, and there are endless fractal-like depths of complexity in birdsong, and there are endless fractal-like depths of complexity in the shape of trees, etc. So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained. You cite video-game ML papers, but in this respect, video-games (especially 1980s video-games) are not analogous to the real world. You can easily saturate the novelty in Pac-Man, and then the only way to get more novelty is to “progress” in the game in roughly the way that the game-designers intended. Pac-Man does not have 10,000 elaborate branching fascinating “side-quests” that don’t advance your score, right? If it did, I claim those ML papers would not have worked. But the real world does have “side quests” like that. You can get a lifetime supply of novelty by just closing your eyes and thinking about patterns in prime numbers, etc. Yet children reliably learn relevant things like language and culture much more than irrelevant things like higher-order patterns in pebble shape and coloration. Therefore, I am skeptical of any curiosity / novelty drive completely divorced from “hardcoded” drives that induce disproportionate curiosity / interest in certain specific types of things (e.g. human speech sounds) over other things (e.g. the detailed coloration of pebbles).
        (I think these hardcoded drives, like all hardcoded drives, are based on relatively simple heuristics, as opposed to being exquisitely aimed at specific complex concepts like “hunting”. I think “some simple auditory calculation that disproportionately triggers on human speech sounds” is a very plausible example.)
        If your response is “There is an objective content-neutral metric in which human speech sounds are more interesting than the detailed coloration of pebbles”, then I’m skeptical, especially if the metric looks like a kind of “greedy algorithm” that does not rely on the benefit of hindsight. In other words, once you’ve invested years into learning to decode human speech sounds, then it’s clear that they are surprisingly information-rich. But before making that investment, I think that human speech sounds wouldn’t stand out compared to the coloration of pebbles or the shapes of trees or the behavior of ants or whatever. Or at least they wouldn’t stand out so much that it would explain human children’s attention to them.
        We need to explain the fact that (I claim) different people seem very interested in different things, and these interests are heritable, e.g. interest-in-people versus interest-in-machines. Hardcoded drives in “what is interesting” would explain that, and I’m not sure what else would.
        This is unlikely to convince you, but there is a thing called “specific language impairment” where (according to my understanding) certain otherwise-intelligent kids are unusually inattentive to language, and wind up learning language much slower than their peers (although they often catch up eventually). I’m familiar with this because I think one of my kids has a mild case. If he’s playing, and someone talks to him, he rarely orients to it, just as I rarely orient to bird sounds if I’m in the middle of an activity. Speech just doesn’t draw his attention much! And both his tendency to converse and ability to articulate clearly are way below age level. I claim that’s not a coincidence—learning follows attention. Anyway, a nice theory of this centers around an innate human-speech-sound detector being less active than usual. (Conversely, long story, but I think one aspect of autism is kinda the opposite of that—overwhelming hypersensitivity to certain stimuli often including speech sounds and eye contact, which then leads to avoidance behavior.)
        There’s some evidence that a certain gene variant (ASPM) helps people learn tonal languages like Chinese, and the obvious-to-me mechanism is tweaking the innate human-speech-sound heuristics. That’s probably unlikely to convince you, because there are other possible mechanisms too, and ASPM is expressed all over the brain and you can’t ethically do experiments to figure out what part of the brain is mediating this.
        whenever you find evidence that the brain is doing something two different ways that both seem functionally correct, it’s probably using both of those methods and some third one you haven’t thought of yet
        I strongly disagree with the idea that SC and cortex are doing similar things. See discussion here. I think the cortex + striatum is fundamentally incapable of having an innate snake detector, because the cortex + striatum is fundamentally implementing a learning algorithm. Given a ground-truth loss function for the presence / absence of snakes, the cortex + striatum can do an excellent job learning to detect snakes in particular. But without such a loss function, they can’t. (Well, they can detect snakes without a special loss function, but only as “just another learned latent variable”. This latent variable couldn’t get tied to any special innate reaction, in the absence of trial-and-error experience.)
        Anyway, I claim that SC is playing the role of implementing the snake-heuristic calculations that underlie that loss function. (Among other things.)
        But that same evidence doesn’t show the SC is much involved in higher level conscious decisions about whether to stare at a painting or listen to a song for a few minutes vs eating ice cream. That is all reward-shaped high level planning involving the various known dopaminergic decision pathways combined with serotonergic (and other) pathways that feed into general info-value.
        SC projects to VTA/SNc, which is related to whether we find things positive/negative valence, pleasant/unpleasant etc. It’s not the only contribution, but I claim it’s one contribution.
        clearly the SC is unlikely to be involved in food taste shaping, and I don’t think you have shown much convincing evidence it is involved in visual taste shaping.
        I think the relevant unit is “brainstem and hypothalamus”, of which the SC is one part, the part that seems like it has the right inputs and multi-layer architecture to do things like calculate heuristics on the visual FOV. Food taste shaping is a different part of the brainstem, namely the gustatory nucleus of the medulla.
        But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1.
        I’m surprised that you wrote this. I thought you were on board with the idea that we should think of visual cortex as loosely (or even tightly) analogous to deep learning? Let’s train a 12-layer randomly-initialized ConvNet, and look at the vector of activations from layer 10, and decide on that basis whether you’re looking at a person, in the absence of any ground truth. It’s impossible, right? The ConvNet was randomly initialized, you can’t get any object-level information from the fact that neuron X in layer 10 has positive or negative activation, because it’s not a priori determined what role neuron X is going to wind up playing in the trained model.
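A minimal sketch of the "neuron X has no a priori role" point, assuming a toy two-layer ReLU network in place of a ConvNet (the sizes `D` and `H` and all names are hypothetical). Relabelling the hidden units gives exactly the same input-output function, so nothing about the trained function pins down which index ends up carrying which feature:

```python
import math
import random

rng = random.Random(0)
D, H = 5, 4  # hypothetical input and hidden sizes

# Randomly initialized weights: hidden layer W1 (H x D), readout W2 (H)
W1 = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(H)]
W2 = [rng.gauss(0, 1) for _ in range(H)]

def hidden(W1, x):
    """ReLU activations of the hidden layer."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]

def output(W1, W2, x):
    """Scalar network output."""
    return sum(w * h for w, h in zip(W2, hidden(W1, x)))

# Relabel the hidden units with a random permutation
perm = list(range(H))
rng.shuffle(perm)
W1p = [W1[j] for j in perm]
W2p = [W2[j] for j in perm]

x = [rng.gauss(0, 1) for _ in range(D)]

# Same function overall...
same_output = math.isclose(output(W1, W2, x), output(W1p, W2p, x), abs_tol=1e-12)
# ...but "unit i" now carries what "unit perm[i]" carried before
relabelled = hidden(W1p, x) == [hidden(W1, x)[j] for j in perm]

print(same_output, relabelled)
```

Both checks hold for any permutation, which is why a hardwired readout of "unit X in layer 10" cannot know in advance what it is reading.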
        We need ground truth somehow, and my claim is that SC provides it. So my mainline expectation is that SC gets visual information in a way that bypasses the cortex altogether. This is at least partly true (retina→LGN→SC pathway). SC does get inputs from visual cortex, as it turns out, which had me confused for a while but I’m OK with it now. That’s a long story, but I still think the cortical input is unrelated to how SC detects human faces and snakes and whatnot.
        jacob_cannell 2 Nov 2022 23:21 UTC
        2 points
        
        There are endless fractal-like depths of complexity in rocks, and there are endless fractal-like depths of complexity in ants, and there are endless fractal-like depths of complexity in birdsong, and there are endless fractal-like depths of complexity in the shape of trees, etc. So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained.
        
        There are no ‘endless fractal-like depths of complexity’ in the retinal images of rocks or ants or trees, which is what is actually relevant here. For any model, a flat uniform color wall has near zero compressible complexity, as does a picture of noise (max entropy but it’s not learnable). Real world images have learnable complexity which crucially varies based on both the image and the model’s current knowledge. But it’s never “endless”: generally it’s going to be on the order of, or less than, the image entropy the cortex gets from the retina, which is comparable to compression with modern codecs.
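One way to make "learnable complexity" concrete is learning progress: how many fewer bits per symbol a simple online predictor pays late in a stream than early on. The sketch below is illustrative only. The order-1 predictor with Laplace smoothing, the alphabet size `K`, the stream length `N`, and the three toy streams are all assumptions standing in for "model" and "retinal input":

```python
import math
import random

K = 64    # alphabet size (hypothetical)
N = 4000  # stream length (hypothetical)

def code_lengths(stream):
    """Bits an online order-1 predictor pays per symbol (Laplace smoothing)."""
    counts = {}  # (previous symbol, symbol) -> count
    totals = {}  # previous symbol -> count
    prev, bits = 0, []
    for s in stream:
        c = counts.get((prev, s), 0)
        n = totals.get(prev, 0)
        bits.append(-math.log2((c + 1) / (n + K)))
        counts[(prev, s)] = c + 1
        totals[prev] = n + 1
        prev = s
    return bits

def learning_progress(stream):
    """Mean bits/symbol in the first quarter minus the last quarter."""
    bits = code_lengths(stream)
    q = len(bits) // 4
    return sum(bits[:q]) / q - sum(bits[-q:]) / q

rng = random.Random(0)
flat = [0] * N                                # uniform wall
noise = [rng.randrange(K) for _ in range(N)]  # max entropy, nothing to learn

structured, x = [], 0                         # deterministic, learnable pattern
for _ in range(N):
    structured.append(x)
    x = (5 * x + 1) % K                       # full-period cycle over K symbols

progress = {name: learning_progress(s)
            for name, s in [("flat", flat), ("noise", noise), ("structured", structured)]}
print(progress)
```

Under these assumptions the structured stream should show by far the largest progress. The flat stream is learned almost immediately and the noise stream never becomes more predictable, so both contribute little.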
        
        You cite video-game ML papers,
        
        Actually in this thread I cited neurosci papers: first that curiosity/info-value is a reward processed like hunger^[1], and a review article from 2020^[2] which is an update from a similar 2015 paper^[3].
        
        So “follow the gradient where you’re learning new things” by itself seems wildly under-constrained.
        
        Sure—but curiosity/info-gain obviously isn’t all of reward, so the various other components can also steer behavior paths towards fitness relevant directions, which can then indirectly bias the trajectory of the curiosity-driven learning as it’s always relevant to the model’s current knowledge and thus the experience trajectory.
        
        Therefore, I am skeptical of any curiosity / novelty drive completely divorced from “hardcoded” drives that induce disproportionate curiosity / interest in certain specific types of things (e.g. human speech sounds) over other things (e.g. the detailed coloration of pebbles).
        
        Recall that the infant is mostly driven by subcortical structures and innate patterns for the early years, and during all this time it is absolutely bombarded with human speech as the primary consistent complex audio signal. There may be some attentional bias towards human speech, but it may not be necessary, as there aren’t many other audio streams that could come close to competing. Birdsong is both less interesting and far less pervasive/frequent for most children’s audio experience. Also ‘visually interesting pebbles’ don’t seem that different than other early children’s toys: seems children would find them interesting (although their shape is typically boring).
        
        I strongly disagree with the idea that SC and cortex are doing similar things.
        
        I didn’t say they did—I said I’m aware of two proposals for an innate learning bias for faces: CPGs pretraining the viscortex in the womb, and innate attentional bias circuits in the SC. These are effectively doing a similar thing.
        
        We need ground truth somehow, and my claim is that SC provides it. So my mainline expectation is that SC gets visual information in a way that bypasses the cortex altogether.
        
        For attentional bias/shaping the SC likely can only support very simple pattern biases close to a linear readout. So a simple bias to attend to faces seems possible, but I was actually talking about sexual attraction when I said:
        
        But clearly there must be some innate visual shaping for at least sexual attraction—so evidence that SC drives that would also be good evidence it drives some landscape biasing, for example. But it seems like those reward shapers would need to be looking primarily at higher visual cortex rather than V1.
        
        For sexual attraction the patterns are just too complex, so they are represented in IT or similar higher visual cortex. Any innate circuit that references human body shape images and computes their sexual attraction must get that input from higher viscortex—which then leads to the whole symbol grounding problem—as you point out, and I naturally agree.
        
        But regardless of the specific solution to the symbol grounding problem, a consequence of that solution is that the putative brain region computing attraction value of images of humanoid shapes would need to compute that primarily from higher viscortex/IT input.
        
        I think your model may be something like the SC computes an attentional bias which encodes all the innate sexiness geometry and thus guides us to spend more time saccading at sexy images rather than others, and possibly also outputs some reward info for this.
        
        But that could not work as stated, simply because the sexiness concept is very complex and requires a deepnet to compute (symmetry, fat content, various feature ratios, etc).
        
        Also this must be true, because otherwise we wouldn’t see the failures of sexual imprinting in birds that we do in fact observe.
        
        So how can the genome best specify a complex concept innately using the least number of bits? Just indexing neurons in the learned cortex directly would be bit-minimal, but as you point out that isn’t robust.
        
        However the topographic organization of cortex can help, as it naturally clusters neurons semantically.
        
        Another way to ‘locate’ specific learned neurons more robustly is through proxy matching, where you have dumb simple humanoid shape detectors and symmetry detectors etc encoding a simple sexiness visual concept—which could potentially be in the SC. But then during some critical window the firing patterns of those proxy circuits are used to locate the matching visual concept in visual cortex and connect to that. In other words, you can use the simple innate proxy circuit to indirectly locate a cluster of neurons in cortex, simply based on firing pattern correlation.
        
        This allows the genome to link to a high-complexity concept by specifying a low-complexity proxy match for that concept in its earlier low-complexity larval stage.
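A toy sketch of proxy matching by firing-pattern correlation, under loudly stated assumptions: binary "firing" per experience, a proxy that is right 80% of the time, one learned cortical unit that tracks the concept at an index nobody knows in advance, and otherwise unrelated units. All names and numbers are hypothetical:

```python
import math
import random

rng = random.Random(0)
T, M = 2000, 50                 # hypothetical: experiences, cortical units
target_unit = rng.randrange(M)  # where learning happened to put the concept

def flip(bit, p):
    """Return bit, flipped with probability p."""
    return bit ^ (rng.random() < p)

# Whether the concept is really present in each experience
present = [rng.random() < 0.5 for _ in range(T)]

# Crude innate proxy: agrees with the truth 80% of the time
proxy = [float(flip(z, 0.2)) for z in present]

# Cortex: M units with unrelated activity, except the learned concept unit
cortex = [[float(rng.random() < 0.5) for _ in range(T)] for _ in range(M)]
cortex[target_unit] = [float(flip(z, 0.05)) for z in present]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# "Critical window": connect to whichever unit best tracks the proxy
matched = max(range(M), key=lambda j: pearson(proxy, cortex[j]))
print(matched == target_unit)
```

The genome-side cost here is just the proxy plus the matching rule. The index of the matched unit is discovered, not specified.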
        
        Proxy matching implies that after critical period training whatever neurons represent innate sexiness must then shift to get their input from higher viscortex: IT rather than just V1, and certainly not LGN.
        
        Another related possibility is that the SC is just used to create the initial bootstrapping signal, and then some other brain region actually establishes the connection to innate downstream dependencies of sexiness and learned sexiness—so separating out the sexiness proxy from the used sexiness concept.
        
        Anyway my point was more that innate sexual attraction must be encoded somewhere, and any evidence that the SC is crucially involved with that is evidence it is crucially involved with other innate visual bias/shaping.
        
        ^[1] Shared striatal activity in decisions to satisfy curiosity and hunger at the risk of electric shocks
        ^[2] Systems neuroscience of curiosity
        ^[3] The psychology and neuroscience of curiosity
        
        Steven Byrnes 3 Nov 2022 18:13 UTC
        2 points
        Thanks again for taking the time to chat, I am finding this super helpful in understanding where you’re coming from.
        Your description of proxy matching is a close match to what I’m thinking. (Sorry if I’ve been describing it poorly!)
        I think I got confused because I’m mapping it to neuroanatomy differently than you. I think SC is the “proxy” part, but the “matching” part is somewhere else, not SC. For example, it might look something like this:
        1. SC calculates a proxy, based on LGN→SC inputs. The output of this calculation is a signal, which I’ll call “Fits proxy?”
        2. The “Fits proxy?” signal then gets sent (indirectly) to a certain part of the amygdala, where it’s used as a “ground truth” for supervised learning.
        3. This part of the amygdala builds a trained model. The input to the trained model is (mostly-high-level) visual information, especially from IT. The output of the trained model is some signal, which I’ll call “Fits model?”
        4. The SC’s “Fits proxy?” signal and the amygdala’s “Fits model?” signal both go down to the hypothalamus and brainstem, possibly hitting the very same neurons that trigger specific innate reactions.
        5. Optionally, the trained model in the amygdala could stop updating itself after a critical period early in life.
        6. Also optionally, as the animal gets older, the “Fits proxy?” signal could have less and less influence on those innate reaction neurons in the hypothalamus & brainstem, while the “Fits model?” signal would have more and more influence.
        (This is one example; there are a bunch of other variations on this theme, including ones where you replace “part of the amygdala” with other parts of the forebrain like nucleus accumbens shell or lateral septum, and also where the proxy is coming from other places besides SC.)
        (This is a generalization of “calculating correlations”. If the amygdala trained model is only one “layer” in the deep learning sense, then it would be just calculating linear correlations between IT signals and the proxy, I think. My best guess is that the amygdala is learning a two-layer feedforward model (more or less), so a bit more complicated than linear correlations, although low confidence on that.)
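The scheme above can be sketched as a toy simulation. Everything here is an assumption made for illustration: binary features standing in for IT, a proxy that is right 80% of the time with errors independent of the features, a one-layer logistic model standing in for the amygdala, and a linear age ramp for the last optional step.

```python
import math
import random

rng = random.Random(0)
D = 10                          # hypothetical number of "IT" input features
concept_idx = rng.randrange(D)  # where the learned concept unit ended up

def sample():
    """One visual experience: (IT features, SC proxy signal, ground truth)."""
    truth = rng.random() < 0.5
    feats = [float(rng.random() < 0.5) for _ in range(D)]
    # The concept feature tracks the truth, with 5% errors
    feats[concept_idx] = float(truth != (rng.random() < 0.05))
    # Crude innate proxy: right 80% of the time
    fits_proxy = float(truth != (rng.random() < 0.2))
    return feats, fits_proxy, truth

def fits_model(w, b, feats):
    """One-layer logistic model output, the "Fits model?" signal."""
    a = b + sum(wi * xi for wi, xi in zip(w, feats))
    return 1.0 / (1.0 + math.exp(-a))

# Critical period: train on the proxy signal only, never on ground truth
w, b, lr = [0.0] * D, 0.0, 0.05
for _ in range(20000):
    feats, fits_proxy, _ = sample()
    err = fits_proxy - fits_model(w, b, feats)
    b += lr * err
    w = [wi + lr * err * xi for wi, xi in zip(w, feats)]

# Afterwards: compare both signals against ground truth on new samples
test_set = [sample() for _ in range(2000)]
proxy_acc = sum((p > 0.5) == t for _, p, t in test_set) / len(test_set)
model_acc = sum((fits_model(w, b, f) > 0.5) == t for f, _, t in test_set) / len(test_set)

def innate_drive(fits_proxy, fits_model_out, age, critical_age=1.0):
    """Shift influence from proxy to model with age (hypothetical linear ramp)."""
    mix = min(1.0, age / critical_age)
    return (1.0 - mix) * fits_proxy + mix * fits_model_out

print(proxy_acc, model_acc)
```

In this toy setup the trained model should end up more accurate than the proxy that supervised it. That relies on the proxy's errors being unrelated to the IT features, which need not hold in a real brain.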
        Again, since the trained model is in the amygdala, not SC, there’s no need to “shift” the SC’s inputs to IT. That’s why I was confused by what you wrote. :)
        jacob_cannell 3 Nov 2022 23:52 UTC
        4 points
        Hey thanks for explaining this—makes sense to me and I think we are mostly in agreement. Using the proxy signal as a supervised learning target to recognize the learned target pattern in IT is a straightforward way to implement the matching, but probably not quite complete in practice. I suspect you also need to combine that with some strong priors to correctly carve out the target concept.
        
        Consider the equivalent example of trying to train a highly accurate cat image detector given a dataset containing, say, 20% cats, combined with a crappy low-complexity proxy cat detector to provide the labels. Can you really bootstrap better discriminative models in that way with non-trivial proxy label noise? I suspect that the key to making this work is using the powerful generative model of the cortex as a regularizer, so you train it to recognize images that the proxy detector labels as cats and that are also close to the generative model’s data manifold. If you then reoptimize (in evolutionary time) the proxy detector to leverage that, I think it makes the problem much more tractable. The generative model allows you to make the learned model far more selective around the actual data manifold to increase robustness. In very simple, vague terms, the model would then be learning the combination of high proxy probability and low distance to the data manifold of examples from the critical training set.
        
        Later, if you then test OoD on vague non-cats (dogs, stuffed animals) that were not encountered in training and that would confuse the simple proxy, the learned model can reject them, even though it never saw them during critical training, simply because they are far from the generative manifold and the learned model is ‘shrunk’ to fit that manifold.
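A very rough sketch of "high proxy probability combined with low distance to the data manifold". Note the swap: nearest-neighbor distance to stored proxy-positive training examples stands in for a real generative model, and the 2-D feature space, the threshold proxy, and the `radius` are all hypothetical.

```python
import math
import random

rng = random.Random(0)

def proxy_says_cat(x):
    """Crude innate proxy (hypothetical): only looks at the first feature."""
    return x[0] < 3.0

# Critical-period experience in a hypothetical 2-D feature space:
# cats cluster near (0, 0); everything else seen in training sits near (6, 0).
cats = [(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]
others = [(rng.gauss(6, 0.5), rng.gauss(0, 0.5)) for _ in range(200)]

# Keep only experiences the proxy labelled as cat: a stand-in "manifold"
manifold = [x for x in cats + others if proxy_says_cat(x)]

def distance_to_manifold(x):
    """Distance to the nearest stored proxy-positive training example."""
    return min(math.dist(x, m) for m in manifold)

def learned_says_cat(x, radius=1.0):
    """High proxy score AND close to what was seen during the critical period."""
    return proxy_says_cat(x) and distance_to_manifold(x) < radius

new_cat = (0.2, -0.1)
dog = (0.0, 5.0)  # never seen in training; fools the proxy

print(proxy_says_cat(dog), learned_says_cat(dog), learned_says_cat(new_cat))
```

The dog passes the proxy but sits far from anything seen during training, so the combined rule rejects it without ever having been shown a dog.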
        
        I do agree the amygdala does seem like a good fit for the location of the learned symbol circuit, although at that point it raises the question of why not also just have the proxy in the amygdala? If the amygdala has the required inputs from LGN and/or V1 it would be my guess that it could also just colocate the innate proxy circuit. (I haven’t looked in the lit to see if those connections exist)
        
        Also 6 seems required for the system to work as well in adulthood as it typically does, and yet also explain the out of distribution failures for imprinting etc. (Once the IT representation is learned you want to use that exclusively, as it should be strictly superior to the proxy circuit. This seems a little weird at first, but the)
        
        The hope is that this same mechanism which seems well suited for handling imprinting also works for grounding sexual attraction (as an elaboration of imprinting) and then more complex concepts like representations of others’ emotions from facial expression, vocal tone, etc proxies, and then combining that with empathic simulation to ground a model of others’ values/utility for social game theory, altruism, etc.
        Steven Byrnes 7 Nov 2022 18:53 UTC
        2 points
        The hope is that this same mechanism which seems well suited for handling imprinting also works for grounding sexual attraction (as an elaboration of imprinting) and then more complex concepts like representations of others’ emotions from facial expression, vocal tone, etc proxies, and then combining that with empathic simulation to ground a model of others’ values/utility for social game theory, altruism, etc.
        Yes, that is my hope too! And the main thing I’m working on most days is trying to flesh out the details.
        I do agree the amygdala does seem like a good fit for the location of the learned symbol circuit, although at that point it raises the question of why not also just have the proxy in the amygdala? If the amygdala has the required inputs from LGN and/or V1 it would be my guess that it could also just colocate the innate proxy circuit. (I haven’t looked in the lit to see if those connections exist)
        For example, I claim that all the vision-related inputs to the amygdala have at some point passed through at least one locally-random filter stage (cf. “pattern separation” in neuro literature or “compressed sensing” in DSP literature). That’s perfectly fine if the amygdala is just going to use those inputs as feedstock for an SL model. SL models don’t need to know a priori which input neuron is representing which object-level pattern, because they’re going to learn the connections, so if there’s some randomness involved, it’s fine. But the randomness would be a very big problem if the amygdala needs to use those input signals to calculate a ground-truth proxy.
        As another example, a ground-truth proxy requires zero adjustable parameters (because how would you adjust them?), whereas a learning algorithm does well with as many adjustable parameters as possible, more or less.
        So I see these as very different algorithmic tasks—so different that I would expect them to wind up in different parts of the brain, just on general principles.
        The amygdala is a hodgepodge grouping of nuclei, some of which are “really” (embryologically & evolutionarily) part of the cortex, and the rest of which are “really” part of the striatum (ref). So if we’re going to say that the cortex and striatum are dedicated to running within-lifetime learning algorithms (which I do say), then we should expect the amygdala to be in that same category too.
        By contrast, SC is in the brainstem, and if you go far enough back, SC is supposedly a cousin of the part of the pre-vertebrate (e.g. amphioxus) nervous system that implements a simple “escape circuit” by triggering swimming when it detects a shadow—in other words, a part of the brain that triggers an innate reaction based on a “hardcoded” type of pattern in visual input. So it would make sense to say that the SC is still more-or-less doing those same types of calculations.