evolution did in fact find some weird way to create humans who rather obviously consciously optimize for IGF! [...]
If Evolution had a lot more time to align humans to relative-gene-replication-count, before humans put an end to biological life, then sure, seems plausible that Evolution might be able to align humans very robustly. But Evolution does not have infinite time or “retries”—humanity is in the process of executing something like a “sharp left turn”, and seems likely to succeed long before the human gene pool is taken over by sperm bank donors and such.
humans put an end to biological life … humanity is in the process of executing something like a “sharp left turn”,
Humans have not put an end to biological life.
Your doom predictions aren’t evidence, and can’t be used in any way in this analogy. To do so is just circular reasoning. “Sure brains haven’t demonstrated misalignment yet, but they are about to because doom is coming! Therefore evolution fails at alignment and thus doom is likely!”
For rational minds, evidence is strictly historical[1]. The evidence we have to date is that humans are enormously successful, despite any slight misalignment or supposed “sharp left turn”.
There are many other scenarios where DNA flourishes even after a posthuman transition.
Look closely at how Solomonoff induction works, for example. Its world model is updated strictly from historical evidence, not its own future predictions.
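The footnote's point about updating only on observed data can be illustrated with a toy sketch. This is not Solomonoff induction itself (which is uncomputable and ranges over all programs); it is a minimal Bayesian mixture over three made-up hypotheses, with a prior weight of 2^-length for each. The hypothesis names, description lengths, and probabilities below are all illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical toy hypotheses: (name, assumed description length in bits,
# function mapping the observed history to P(next bit = 1)).
HYPOTHESES = [
    ("mostly_ones", 2, lambda history: Fraction(9, 10)),
    ("mostly_zeros", 2, lambda history: Fraction(1, 10)),
    ("fair_coin", 3, lambda history: Fraction(1, 2)),
]


def prior():
    """Simplicity prior: weight 2^-length, normalized over the toy set."""
    weights = {name: Fraction(1, 2 ** length) for name, length, _ in HYPOTHESES}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def update(posterior, history, observed_bit):
    """Bayes update on one bit that has actually been observed."""
    new = {}
    for name, _, predict in HYPOTHESES:
        p_one = predict(history)
        likelihood = p_one if observed_bit == 1 else 1 - p_one
        new[name] = posterior[name] * likelihood
    total = sum(new.values())
    return {name: w / total for name, w in new.items()}


def posterior_after(observations):
    """Fold the update over the historical record, one bit at a time."""
    post = prior()
    history = []
    for bit in observations:
        post = update(post, history, bit)
        history.append(bit)
    return post


def predict_next(posterior, history):
    """Mixture prediction for the next bit. Reads the posterior; never changes it."""
    return sum(posterior[name] * predict(history) for name, _, predict in HYPOTHESES)
```

In this sketch the posterior changes only inside `update`, which takes an observed bit; `predict_next` produces a forecast but feeds nothing back into the weights.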
C’mon, man, that’s obviously a misrepresentation of what I was saying. Or maybe my earlier comment failed badly at communication? In case that’s so, here’s an attempted clarification (bolded parts added):
If Evolution had a lot more time (than I expect it to have) to align humans to relative-gene-replication-count, before humans put an end to biological life, as they seem to me to be on track to do, based on things I have observed in the past, then [...]
But Evolution (almost surely) does not have infinite time [...]
Point being: Sure, Evolution managed to cough up some individuals who explicitly optimize for IGF. But they’re exceptions, not the rule; and humanity seems (based on past observations!) to be on track to (mostly) end DNA-based life. So it seems premature to say that Evolution succeeded at aligning humanity.
In case you’re wondering what past observations lead me to think that humans are unaligned[2] w.r.t. IGF and on track to end (or transcend) biological life, here are some off the top of my head:
Of the people whose opinions on the subject I’m aware of (including myself), nearly all would like to transcend (or end) biological life.[3]
Birth rates in most developed nations have been low or below replacement for a long time.[4] There seems to be a negative correlation between wealth/education and number of offspring produced. That matches my impression that as people gain wealth, education, and empowerment in general, most choose to spend it mostly on something other than producing offspring.
Diligent sperm bank donors are noteworthy exceptions. Most people are not picking obvious low-hanging fruit for increasing their IGF. Rich people waste money on yachts and stuff, instead of using it to churn out as many high-fitness offspring as possible; etc.
AFAIK, most of the many humans racing to build ASI are not doing so with the goal of increasing their IGF. And absent successful attempts to align ASI specifically to producing lots of DNA-based replicators, I don’t see strong reason to expect the future to be optimized for quantity of DNA-based replicators.
Perhaps you disagree with the last point above?
There are many other scenarios where DNA flourishes even after a posthuman transition.
Interesting. Could you list a few of those scenarios?
Note: I wasn’t even talking (only) about doom; I was talking about humanity seemingly being on track to end biological life. I think the “good” outcomes probably also involve transcending biology/DNA-based replicators.
So it seems premature to say that Evolution succeeded at aligning humanity.
Evolution has succeeded at aligning homo sapiens brains to date[1] - that is the historical evidence we have.
I don’t think most transhumanists explicitly want to end biological life, and most would find that abhorrent. Transcending to a postbiological state probably doesn’t end biology any more/less than biology ended geology.
There are many other scenarios where DNA flourishes even after a posthuman transition.
Interesting. Could you list a few of those scenarios?
The future is complex and unknown. Is the ‘DNA’ we are discussing the information content or the physical medium? Seems rather obvious it’s the information that matters, not the medium. Transcendence to a posthuman state probably involves vast computation some of which is applied to ancestral simulations (which we may already be in) which enormously preserves and multiplies the info content of the DNA.
Not perfectly, of course, but evolution doesn’t do anything perfectly. It aligned brains well enough that the extent of any misalignment was insignificant compared to the enormous utility our brains provided.
Evolution has succeeded at aligning homo sapiens brains to date
I’m guessing we agree on the following:
Evolution shaped humans to have various context-dependent drives (call them Shards) and the ability to mentally represent and pursue complex goals. Those Shards were good proxies for IGF in the EEA[1].
Those Shards were also good[2] enough to produce billions of humans in the modern environment. However, it is also the case that most modern humans spend at least part of their optimization power on things orthogonal to IGF.
I think our disagreement here maybe boils down to approximately the following question:
With what probability are we in each of the following worlds?
(World A) The Shards only work[2] conditional on the environment being sufficiently similar to the EEA, and humans not having too much optimization power. If the environment changes too far OOD, or if humans were to gain a lot of power[3], then the Shards would cease to be good[2] proxies.
In this world, we should expect the future to contain only a small fraction[4] of the “value” it would have, if humanity were fully “aligned”[2]. I.e. Evolution failed to “(robustly) align humanity”.
(World B) The Shards (in combination with other structures in human DNA/brains) are in fact sufficiently robust that they will keep humanity aligned[2] even in the face of distributional shift and humans gaining vast optimization power.
In this world, we should expect the future to contain a large fraction of the “value” it would have, if humanity were fully “aligned”[2]. I.e. Evolution succeeded in “(robustly) aligning humanity”.
(World C) Something else?
I think we’re probably in (A), and IIUC, you think we’re most likely in (B).
Do you consider this an adequate characterization?
If yes, the obvious next question would be:
What tests could we run, what observations could we make,[5] that would help us discern whether we’re in (A) or (B) (or (C))?
(For example: I think the kinds of observations I listed in my previous comment are moderate-to-strong evidence for (A); and the existence of some explicit-IGF-maximizing humans is weak evidence for (B).)
I agree with your summary of what we agree on—that evolution succeeded at aligning brains to IGF so far. That was the key point of the OP.
Before getting into World A vs World B, I need to clarify again that my standard for “success at alignment” is a much weaker criterion than you may be assuming. You seem to consider success to require getting near the maximum possible (i.e. a large fraction of) utility, which I believe is uselessly unrealistic. By success I simply mean not a failure, as in not the doom scenario of extinction or near-zero utility.
So World A is still a partial success if there is some reasonable population of humans (say even just on the order of millions) in bio bodies or in detailed sims.
(World A) The Shards only work[2] conditional on the environment being sufficiently similar to the EEA, and humans not having too much optimization power
I don’t agree with this characterization—the EEA ended ~10k years ago and human fitness has exploded since then rather than collapsed to zero. It is a simple fact that according to any useful genetic fitness metric, human fitness has exploded with our exploding optimization power so far.
I believe this is the dominant evidence, and it indicates:
If tech evolution is similar enough to bio evolution then we should roughly expect tech evolution to have a similar level of success
Likewise doom is unlikely unless the tech evolution process producing AGI has substantially different dynamics from the gene evolution process which produced brains
See this comment for more on the tech/gene evolution analogy and potential differences.
I don’t think your evidence from “opinions of people you know” is convincing, for the same reasons I don’t think opinions from humans circa 1900 were very useful evidence for predicting the future of 2023.
AFAIK, most of the many humans racing to build ASI are not doing so with the goal of increasing their IGF.
I don’t think “humans explicitly optimizing for the goal of IGF” is even the correct frame for thinking about how human value learning works (see shard theory).
As a concrete example, Elon Musk seems to be on track for high long term IGF, without consciously optimizing for IGF.
(Ah. Seems we were using the terms “(alignment) success/failure” differently. Thanks for noting it.)
In-retrospect-obvious key question I should’ve already asked:
Conditional on (some representative group of) humans succeeding at aligning ASI, what fraction of the maximum possible value-from-Evolution’s-perspective do you expect the future to attain? [1]
My modal guess is that the future would attain ~1% of maximum possible “Evolution-value”.[2]
If tech evolution is similar enough to bio evolution then we should roughly expect tech evolution to have a similar level of success
Seems like a reasonable (albeit very preliminary/weak) outside view, sure. So, under that heuristic, I’d guess that the future will attain ~1% of max possible “human-value”.
In general I think maximum values are weird because they are potentially nearly unbounded, but it sounds like we may then be in agreement absent terminology.
But in general I do not think of anything “less than 1% of the maximum value” as failure in most endeavors. For example the maximum attainable wealth is perhaps $100T or something, but I don’t think it’d be normal/useful to describe the world’s wealthiest people as failures at being wealthy because they only have ~$100B or whatever.
And regardless, the standard doom arguments from EY/MIRI etc. are very much “AI will kill us all!”, and not “AI will prevent us from attaining over 1% of maximum future utility!”
vast computation some of which is applied to ancestral simulations
I agree that a successful post-human world would probably involve a large amount[1] of resources spent on simulating (or physically instantiating) things like humans engaging in play, sex, adventure, violence, etc. IOW, engaging in the things for which Evolution installed Shards in us. However, I think that is not the same as [whatever Evolution would care about, if Evolution could care about anything]. For the post-human future to be a success from Evolution’s perspective, I think it would have to be full of something more like [programs (sentient or not, DNA or digital) striving to make as many copies of themselves as possible].
(If we make the notion of “DNA” too broad/vague, then we could interpret almost any future outcome as “success for Evolution”.)
For the post-human future to be a success from Evolution’s perspective, I think it would have to be full of something more like [programs (sentient or not, DNA or digital) striving to make as many copies of themselves as possible].
Any ancestral simulation will naturally be full of that, so it boils down to the simulation argument.
The natural consequence of “postbiological humans” is effective disempowerment if not extinction of humanity as a whole.
Such “transhumanists” clearly do not find the eradication of biology abhorrent, any more than any normal person would find the idea of “substrate independence” (death of all love and life) to be abhorrent.
Yup. I, too, have noticed that.
[2] to the extent that it even makes sense to talk about incoherent things like humans being “(mis/un)aligned” to anything.
[3] My sample might not be super representative of humanity as a whole. Maybe somewhat representative of people involved in AI, though?
[4] At least according to sources like this: https://en.wikipedia.org/wiki/Total_fertility_rate
[1] Environment of evolutionary adaptedness. For humans: hunter-gatherer tribes on the savanna, or maybe primitive subsistence agriculture societies.
[2] in the sense of optimizing for IGF, or whatever we’re imagining Evolution to “care” about.
[3] e.g. ability to upload their minds, construct virtual worlds, etc.
[4] Possibly (but not necessarily) still a large quantity in absolute terms.
[5] Without waiting a possibly-long time to watch how things in fact play out.
[1] setting completely aside whether to consider the present “success” or “failure” from Evolution’s perspective.
[2] I’d call that failure on Evolution’s part, but IIUC you’d call it partial success? (Since the absolute value would still be high?)
[1] a large absolute amount, but maybe not a large relative amount.