Tentatively, something like: “the selection process selects for cognitive components that historically correlated with better performance according to the metric in the relevant contexts.”
Many observed values in humans and other mammals (e.g. fear, play/boredom, friendship/altruism, love, etc.) seem to be values that were instrumental for promoting inclusive genetic fitness (promoting survival, exploration, cooperation and sexual reproduction/survival of progeny respectively). Yet, humans and mammals seem to value these terminally and not because of their instrumental value for inclusive genetic fitness.
That the instrumentally convergent goals of evolution’s fitness criterion manifested as “terminal” values in mammals is in my opinion strong empirical evidence against the goals ontology and significant evidence in support of shard theory’s basic account of value formation in response to selection pressure[6].
Learning agents in the real world form values in a primarily instrumental manner, in response to the selection pressures they faced.
I don’t have a firm mechanistic grasp of how selection shapes the cognitive/computational circuits that form values, and I’m not sure if the credit-assignment-based mechanism posited by shard theory is readily applicable outside an RL context.
That the instrumentally convergent goals of evolution’s fitness criterion manifested as “terminal” values in mammals is in my opinion strong empirical evidence against the goals ontology and significant evidence in support of shard theory’s basic account of value formation in response to selection pressure
I consider evolution to be unrelated to the cases that I think shard theory covers. So I don’t count this as evidence in favor of shard theory, because I think shard theory does not make predictions about the evolutionary regime, except insofar as the evolved creatures have RL/SSL-like learning processes which mostly learn from scratch. But then that’s not making reference to evolution’s fitness criterion.
(FWIW, I think the “selection” lens is often used inappropriately and often proves too much, too easily. Early versions of shard theory were about selection pressure over neural circuits, and I now think that focus was misguided. But I admit that your tentative definition holds some intuitive appeal, my objections aside.)
Strongly upvoted that comment. I think your point about needing to understand the mechanistic details of the selection process is true/correct.
That said, I do have some contrary thoughts:
The objection about the underdetermined consequences of selection does not apply to my hypothesis, because my hypothesis did not predict a priori which values would be selected for to promote inclusive genetic fitness in the environment of evolutionary adaptedness (EEA)
Rather, it (purports to) explain why the (particular) values that emerged were selected for?
Alternatively, if you take it as a given that “survival, exploration, cooperation and sexual reproduction/survival of progeny” were instrumental for promoting IGF in the EEA, then it retrodicts that terminal values would emerge which were directly instrumental for those features (and perhaps that said terminal values would be somewhat widespread)
Nailing down the particular values that emerged would require conditioning on more information/more knowledge of the inductive biases of evolutionary processes than I possess
I guess you could say that this version of the selection lens proves too little, as it says little a priori about what values will be selected for
Without significant predictive power, perhaps selection isn’t pulling its epistemic weight as an explanation?
Potential reasons why selection may nonetheless be a valuable lens:
If we condition on more information we might be able to make non-trivial predictions about what properties will be selected for
The properties so selected for might show convergence?
Perhaps in the limit of selection for a particular metric in a given environment, the artifacts under selection pressure converge towards a particular archetype
Such an archetype (if it exists) might be an idealisation of the products of said selection pressures
Empirically, we do see some convergent feature development in e.g. evolution
Intelligent systems are in fact produced by selection processes, so there is probably in fact some mechanistic story of how selection influences learned values
except insofar as the evolved creatures have RL/SSL-like learning processes which mostly learn from scratch. But then that’s not making reference to evolution’s fitness criterion.
Something like: genes that promote/facilitate values that promoted inclusive genetic fitness in the ancestral environment (conditional on the rest of the gene pool) would become more pervasive in the population (and vice versa). I think this basic account can still be true even if humans learn from scratch via RL/SSL-like learning processes.
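To make the gene-frequency account a bit more concrete, here is a minimal toy sketch. It is not a claim about how values actually form: it only uses the standard deterministic haploid selection recursion, p' = p * w_with / mean_fitness, and every name and number in it (`spread_of_value_allele`, the 5% fitness advantage, the starting frequencies) is a hypothetical chosen for illustration. The point it illustrates is that the allele's frequency responds to how well the value it promotes correlated with fitness in that environment, while nothing in the carrier "refers to" fitness.

```python
from typing import List


def spread_of_value_allele(
    p0: float,
    fitness_with_value: float,
    fitness_without_value: float,
    generations: int,
) -> List[float]:
    """Track the frequency of a hypothetical allele that promotes some value.

    Uses the deterministic haploid selection recursion
        p' = p * w_with / (p * w_with + (1 - p) * w_without).
    The fitness numbers are assumptions standing in for "how well the
    promoted value correlated with reproductive success in the ancestral
    environment"; carriers never represent fitness themselves.
    """
    frequencies = [p0]
    p = p0
    for _ in range(generations):
        mean_fitness = p * fitness_with_value + (1 - p) * fitness_without_value
        p = p * fitness_with_value / mean_fitness
        frequencies.append(p)
    return frequencies


if __name__ == "__main__":
    # Hypothetical case 1: the value helped in the ancestral environment
    # (5% relative advantage), starting from a rare allele.
    helpful = spread_of_value_allele(0.01, 1.05, 1.00, 500)
    # Hypothetical case 2 (the "vice versa"): the value hurt
    # (5% relative disadvantage), starting from a common allele.
    harmful = spread_of_value_allele(0.99, 0.95, 1.00, 500)
    print(f"helpful value allele: {helpful[0]:.2f} -> {helpful[-1]:.4f}")
    print(f"harmful value allele: {harmful[0]:.2f} -> {harmful[-1]:.4f}")
```

Note that the sketch is silent about which values get promoted in the first place, which is the "proves too little" worry above: the fitness numbers are inputs, not predictions.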
To: @Quintin Pope, @TurnTrout
I think “Reward is not the Optimisation Target” generalises straightforwardly to any selection metric.
Tentatively, something like: “the selection process selects for cognitive components that historically correlated with better performance according to the metric in the relevant contexts.”
From “Contra ‘Strong Coherence’”:
Learning agents in the real world form values in a primarily instrumental manner, in response to the selection pressures they faced.
I don’t have a firm mechanistic grasp of how selection shapes the cognitive/computational circuits that form values, and I’m not sure if the credit-assignment-based mechanism posited by shard theory is readily applicable outside an RL context.