Thank you for providing your feedback on what we’ve written so far. I’m a bit surprised that you interpret shard theory as supporting the blank slate position. In hindsight, we should have been more clear about this. I haven’t studied behavior genetics very much, but my own rough prior is that genetics explain about 50% of a given trait’s variance (values included).
Shard theory is mostly making a statement about the mechanisms by which genetics can influence values (that they must overcome / evade information inaccessibility issues). I don’t think shard theory strongly predicts any specific level of heritability, though it probably rules out extreme levels of genetic determinism.
This Shard Theory argument seems to reflect a fundamental misunderstanding of how evolution shapes genomes to produce phenotypic traits and complex adaptations. The genome never needs to ‘scan’ an adaptation and figure out how to reverse-engineer it back into genes. The genetic variants simply build a slightly new phenotypic variant of an adaptation, and if it works better than existing variants, then the genes that built it will tend to propagate through the population. The flow of design information is always from genes to phenotypes, even if the flow of selection pressures is back from phenotypes to genes. This one-way flow of information from DNA to RNA to proteins to adaptations has been called the ‘Central Dogma of molecular biology’, and it still holds largely true (the recent hype about epigenetics notwithstanding).
Shard Theory implies that biology has no mechanism to ‘scan’ the design of fully-mature, complex adaptations back into the genome, and therefore there’s no way for the genome to code for fully-mature, complex adaptations. If we take that argument at face value, then there’s no mechanism for the genome to ‘scan’ the design of a human spine, heart, hormone, antibody, cochlea, or retina, and there would be no way for evolution or genes to influence the design of the human body, physiology, or sensory organs. Evolution would grind to a halt – not just at the level of human values, but at the level of all complex adaptations in all species that have ever evolved.
What we mean is that there are certain constraints on how a hardcoded circuit can interact with a learned world model which make it very difficult for the hardcoded circuit to exactly locate / interact with concepts within that world model. This imposes certain constraints on the types of values-shaping mechanisms that are available to the genome. It’s conceptually no different (though much less rigorous) than using facts about chemistry to impose constraints on how a cell’s internal molecular processes can work. Clearly, they do work. The question is, given what we know of constraints in the domain in question, how do they work? And how can we adapt similar mechanisms for our own purposes?
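To make that constraint concrete, here is a minimal toy sketch (the channel indices, array sizes, and the "ice cream" example are all hypothetical illustrations, not part of shard theory itself): a genome-specified circuit can key off fixed sensory channels whose meaning is stable across individuals, but it has no stable way to index into a learned world model, because which learned feature ends up encoding which concept is only settled during learning.

```python
# Toy sketch of "information inaccessibility" (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(0)

# Raw sensory input: suppose channel 0 is sweetness. A hardcoded circuit can
# reference it directly, because its index is stable across individuals.
def hardcoded_reward(sensory):
    return 2.0 * sensory[0]              # "if sweet, fire reward": index known a priori

# Learned world model: a random projection standing in for learned features.
# Which latent dimension ends up encoding "ice cream" differs across training
# runs / individuals, so no fixed index into `latent` can be written into the genome.
W = rng.normal(size=(16, 4))
def world_model(sensory):
    return np.tanh(W @ sensory)

sensory = np.array([0.9, 0.1, 0.0, 0.3])  # hypothetical observation
latent = world_model(sensory)

print(hardcoded_reward(sensory))          # well-defined: keys off a fixed channel
# latent[7] == "the ice-cream concept"?   # undefined: which dimension means what
                                          # is only determined at learning time
```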
Shard Theory adopts a relatively ‘Blank Slate’ view of human values, positing that we inherit only a few simple, crude values related to midbrain reward circuitry, which are presumably universal across humans, and all other values are scaffolded and constructed on top of those.
I’d note that simple reward circuitry can influence the formation of very complex values.
Why would reward circuitry be constant across humans? Why would the values it induces be constant either?
I think results such as the domestication of foxes imply a fair degree of variation in the genetically specified reward circuitry between individuals; otherwise, the breeders would have had nothing to select on. I expect similar variation holds among humans.
Human values are heritable …
This is a neat summary of values-related heritability results. I’ll look into these in more detail in future, so thank you for compiling these. However, the provided summaries are roughly in line with what I expect from these sorts of studies.
Shard Theory implies that genes shape human brains mostly before birth, setting up the basic limbic reinforcement system, and then Nurture takes over, such that heritability should decrease from birth to adulthood.
I completely disagree. The reward system applies continuous pressure across your lifetime, so that’s one straightforward mechanism by which the genome can influence values development after birth. There are other, more sophisticated mechanisms as well.
E.g., Steve Byrnes describes short- and long-term predictors of low-level sensory experiences. Though the genome specifies which sensory experiences the predictor predicts, how the predictor does so is learned over a lifetime. This allows the genome to have “pointers” to certain parts of the learned world model, which can let genetically specified algorithms steer behavior even well after birth, as Steve outlines here.
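To illustrate the flavor of this mechanism, here is a minimal toy sketch (a linear delta-rule predictor over a random "world model" projection; the specific setup is my own illustration, not Steve’s actual proposal): the genome fixes which signal gets predicted, while the weights that do the predicting are fit from experience, giving hardcoded circuitry an anticipatory handle on a learned representation.

```python
# Toy sketch of a learned predictor of a genetically specified signal
# (illustrative assumptions; not Steve Byrnes' actual model).
import numpy as np

rng = np.random.default_rng(1)

M = rng.normal(size=(8, 8))            # stand-in for a learned world model's features
def world_model(sensory):
    return M @ sensory

def innate_signal(sensory):
    return sensory[2]                  # the genome fixes *what* is predicted ("pain")

w = np.zeros(8)                        # *how* to predict it is learned over a lifetime
lr = 0.02

for _ in range(5000):                  # a lifetime of experience
    sensory = rng.normal(size=8)
    features = world_model(sensory)
    error = innate_signal(sensory) - w @ features
    w += lr * error * features         # delta-rule update

# Hardcoded circuitry can now read `w @ features` as an anticipatory "pointer"
# to the innate signal inside the learned representation, and steer behavior with it.
test = rng.normal(size=8)
print(w @ world_model(test), innate_signal(test))
```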
Also, you see a similar pattern in current RL agents. A freshly initialized agent acts completely randomly, with no influence from its hard-coded reward function. As training progresses, behavior becomes much more strongly determined by its reward function.
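As a concrete (and deliberately tiny) illustration of that pattern, here is a sketch of a two-armed bandit trained with REINFORCE; the bandit, learning rate, and episode count are arbitrary stand-ins, not a model of the brain. At initialization the policy ignores the reward function entirely; after training, the reward function almost fully determines behavior.

```python
# Toy sketch: reward shapes behavior over training (two-armed bandit, REINFORCE).
import numpy as np

rng = np.random.default_rng(2)
logits = np.zeros(2)                   # freshly initialized policy
lr = 0.1

def policy(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

print("at init:", policy(logits))      # ~[0.5, 0.5]: behavior ignores the reward

for _ in range(500):
    p = policy(logits)
    a = rng.choice(2, p=p)
    r = 1.0 if a == 1 else 0.0         # hard-coded reward function
    grad = -p
    grad[a] += 1.0                     # d log pi(a) / d logits
    logits += lr * r * grad            # REINFORCE update

print("after training:", policy(logits))   # heavily skewed toward the rewarded arm
```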
Human Connectome Project studies show that genetic influences on brain structure are not restricted to ‘subcortical hardwiring’ …
Very interesting biological results, but consistent with what we were saying about the brain being mostly randomly initialized.
Our position is that the information content of the brain is mostly learned. “Random initialization” can include a high degree of genetic influence on local neurological structures. In ML, both Gaussian and Xavier initialization count as “random initialization”, even though they lead to different local structures. Similarly, I expect that the details of the brain’s stochastic local connectivity pattern at birth vary with genome and brain region. However, the information content from the genome is bounded above by the information content of the genome itself, which is only about 3 billion base pairs. So, most of the brain must be learned from scratch.
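For what I mean by different “random initializations” carrying different local structure, here is a small numerical sketch (the layer sizes and scales are arbitrary): Gaussian init with a fixed scale and Xavier/Glorot init are both “random”, yet they produce layers with different statistics, analogous to a genome tuning region-specific connectivity statistics without specifying individual synapses.

```python
# Toy sketch: two "random initializations" with different local statistics.
import numpy as np

rng = np.random.default_rng(3)
fan_in, fan_out = 512, 256

gaussian = rng.normal(0.0, 0.1, size=(fan_out, fan_in))        # fixed std, shape-independent
limit = np.sqrt(6.0 / (fan_in + fan_out))                      # Xavier/Glorot uniform limit
xavier = rng.uniform(-limit, limit, size=(fan_out, fan_in))    # fan-dependent scale

x = rng.normal(size=fan_in)
print("gaussian layer output std:", (gaussian @ x).std())
print("xavier   layer output std:", (xavier @ x).std())
# Same "random init" label, different per-layer statistics.
```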
Quintin (and also Alex) - first, let me say, thank you for the friendly, collegial, and constructive comments and replies you’ve offered. Many folks get reactive and defensive when they’re hit with a 6,000-word critique of their theory, but you’ve remained constructive and intellectually engaged. So, thanks for that.
On the general point about Shard Theory being a relatively ‘Blank Slate’ account, it might help to think about two different meanings of ‘Blank Slate’—mechanistic versus functional.
A mechanistic Blank Slate approach (which I take Shard Theory to be, somewhat, but not entirely, since it does talk about some reinforcement systems being ‘innate’) emphasizes the details of how we get from genome to brain development to adult psychology and behavior. Lots of discussion about Shard Theory has centered around whether the genome can ‘encode’ or ‘hardwire’ or ‘hard-code’ certain bits of human psychology.
A functional Blank Slate approach (which I think Shard Theory pursues even more strongly, to be honest) doesn’t make any positive, theoretically informative use of evolutionary-functional analysis to characterize animal or human adaptations. Rather, functional Blank Slate approaches tend to emphasize social learning, cross-cultural differences, shared family environments, etc. as sources of psychology.
To highlight the distinction: evolutionary psychology doesn’t start by asking ‘what can the genome hard-wire?’ Rather, it starts with the same key questions that animal behavior researchers ask about any behavior in any species: ‘What selection pressures shaped this behavior? What adaptive problems does this behavior solve? How do the design details of this adaptation solve the functional problem that it evolved to cope with?’
In terms of Tinbergen’s Four Questions, a lot of the discussion around Shard Theory seems to focus on proximate ontogeny, whereas my field of evolutionary psychology focuses more on ultimate/evolutionary functions and phylogeny.
I’m aware that many folks on LessWrong take the view that the success of deep learning in neural networks, and neuro-theoretical arguments about random initialization of neocortex (which are basically arguments about proximate ontogeny), mean that it’s useless to do any evolutionary-functional or phylogenetic analysis of human behavior when thinking about AI alignment (basically, on the grounds that things like kin detection systems, cheater detection systems, mate preferences, or death-avoidance systems couldn’t possibly evolve to fulfil those functions in any meaningful sense).
However, I think there’s substantial evidence, in the 163 years since Darwin’s seminal work, that evolutionary-functional analysis of animal adaptations, preferences, and values has been extremely informative about animal behavior—just as it has about human behavior. So, it’s hard to accept any theoretical argument that the genome couldn’t possibly encode any of the behaviors that animal behavior researchers and evolutionary psychologists have been studying for so many decades. It wouldn’t just mean throwing out human evolutionary psychology. It would mean throwing out virtually all scientifically informed research on behavior in all other species, including classic ethology, neuroethology, behavioral ecology, primatology, and evolutionary anthropology.
However, the information content from the genome is bounded above by the information content of the genome itself, which is only about 3 billion base pairs. So, most of the brain must be learned from scratch.
Can you say more on this? This inference seems invalid to me. It seems analogous to: you have a program whose specification is a few GB, but that doesn’t mean the program can’t specify more than 3 GB of meaningful data in the final brain—just that it can’t specify data with K-complexity greater than, e.g., 4 GB.
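Here is a small sketch of that analogy (the generator and sizes are arbitrary; the figure of roughly 2 bits per base pair, i.e. on the order of 750 MB for ~3 billion base pairs, is just the standard back-of-envelope bound): a program much smaller than its output can still exactly specify that output, so a small genome specifying a large, structured brain isn’t ruled out by the size comparison alone.

```python
# Toy sketch: a tiny program exactly specifying far more data than its own length.
import numpy as np

def generate(n_bytes, seed=0):
    # A few lines of "specification" that deterministically pin down every byte.
    rng = np.random.default_rng(seed)
    return rng.integers(0, 256, size=n_bytes, dtype=np.uint8)

data = generate(10**8)        # 100 MB of output from a ~200-byte description
print(len(data), data[:5])

# The output's Kolmogorov complexity is bounded by the program's length, not by
# the output's size. (For scale: ~3e9 base pairs at ~2 bits each is roughly
# 750 MB of genomic "specification".)
```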
Quintin & Alex—this is a very tricky issue that’s been discussed in evolutionary psychology since the late 1980s.
Way back then, Leda Cosmides & John Tooby pointed out that the human genome will ‘offload’ any information it can that’s needed for brain development onto any environmental regularities that can be expected to be available externally, out in the world. For example, the genome doesn’t need to specify everything about time, space, and causality that might be relevant in reliably building a brain that can do intuitive physics—as long as kids can expect that they’ll encounter objects and events that obey basic principles of time, space, and causality. In other words, the ‘information content’ of the mature brain represents the genome taking maximum advantage of statistical regularities in the physical and social worlds, in order to build reliably functioning adult adaptations. See, for example, their writings here and here.
Now, should we call that kind of environmentally-driven calibration and scaffolding of evolved adaptations a form of ‘learning’? It is in some ways, but in other ways, the term ‘learning’ would distract attention away from the fact that we’re talking about a rich suite of evolved adaptations that are adapting to cross-generational regularities in the world (e.g. gravity, time, space, causality, the structure of optic flow in visual input, and many game-theoretic regularities of social and sexual interaction) -- rather than to novel variants or to cultural traditions.
Also, if we take such co-determination of brain structure by genome and environmental regularities as just another form of ‘learning’, we’re tempted to ignore the last several decades of evolutionary functional analysis of the psychological adaptations that do reliably develop in mature adults across thousands of species. In practice, labeling something ‘learned’ tends to foreclose any evolutionary-functional analysis of why it works the way it works. (For example, the still-common assumption that jealousy is a ‘learned behavior’ obscured the functional differences and sex differences between sexual jealousy and resource/emotional jealousy).
As an analogy, the genome specifies some details about how the lungs grow—but lung growth depends on environmental regularities such as the existence of oxygen and nitrogen at certain concentrations and pressures in the atmosphere; without those gases, lungs don’t grow right. Does that mean the lungs ‘learn’ their structure from atmospheric gases rather than just from the information in the genome? I think that would be a peculiar way to look at it.
The key issue is that there’s a fundamental asymmetry between the information in the genome and the information in the environment: the genome adapts to promote the reliable development of complex functional adaptations that take advantage of environmental regularities, but the environmental regularities don’t adapt in that way to help animals survive and reproduce (e.g. time, gravity, causality, and optic flow don’t change to make organismic development easier or more reliable).
Thus, if we’re serious about understanding the functional design of human brains, minds, and values, I think it’s often more fruitful to focus on the genomic side of development, rather than the environmental side (or the ‘learning’ side, as usually construed). (Of course, with the development of cumulative cultural traditions in our species in the last hundred thousand years or so, a lot more adaptively useful information is stored out in the environment—but most of the fundamental human values that we’d want our AIs to align with are shared across most mammalian species, and are not unique to humans with culture.)