I won’t argue with the basic premise that at least on some metrics that could be labeled as evolution’s “values”, humans are currently doing very well.
But the following are also true:
1. Evolution has completely lost control. Whatever happens to human genes from this point forward is entirely dependent on the whims of individual humans.
2. We are almost powerful enough to accidentally cause our total extinction in various ways, which would destroy all value from evolution's perspective.
3. There are actions that humans could take, and might take once we get powerful enough, that would seem fine to us but would destroy all value from evolution's perspective.
Examples of such actions in (3) could be:
- We learn to edit the genes of living humans to gain whatever traits we want. This is terrible from evolution's perspective, if evolution is concerned with maximizing the prevalence of existing human genes.
- We learn to upload our consciousness onto some substrate that does not use genes. This is also terrible from a gene-maximizing perspective.
None of those actions is guaranteed to happen. But if I were creating an AI, found that it was sufficiently smarter than me that I had no way to control it, and noticed that it was considering total-value-destroying actions as reasonable things it might someday do, I would be extremely concerned.
If the claim is that evolution has “solved alignment”, then I’d say you need to argue that the alignment solution is stable against arbitrary gains in capability. And I don’t think that’s the case here.
All three of your points are speculations about the future and as such are not evidence yet. The evidence we have to date is that Homo sapiens is an anomalously successful species, despite the criticality phase transition of a runaway inner optimization process (brains).
So all we can say is that the historical evidence gives us an example of a two-stage optimization process (evolutionary outer optimization plus RL/UL within-lifetime learning) producing AGI/brains that are roughly sufficiently aligned at the population level that the species is enormously successful: high utility according to the outer utility function, even if that utility function is misaligned with the typical inner utility function of most brains.
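For concreteness, here is a minimal toy sketch of the kind of two-stage optimization process described above: an outer "evolutionary" loop selects genomes by an outer fitness function, while each agent runs an inner within-lifetime learning loop that optimizes a proxy target encoded by its genome. Everything here (the objectives, the learning rule, the numbers) is an illustrative assumption, not a model of biology or of anyone's alignment proposal; the only point is the structure: the inner learner never sees the outer objective, yet selection on outcomes can still leave the population scoring well on it.

```python
# Toy two-stage optimization sketch (illustrative assumptions only): an outer
# "evolutionary" loop selects genomes by an outer fitness function, while each
# agent runs an inner "within-lifetime" learning loop that optimizes a proxy
# target encoded by its genome. The inner learner never sees the outer objective.
import random

def outer_fitness(behavior):
    # Outer objective (stand-in for "gene propagation"): best behavior is 10.
    return -abs(behavior - 10.0)

def inner_learning(genome, steps=100, lr=0.1):
    # Inner objective: hill-climb toward the genome's proxy target,
    # which may or may not coincide with the outer optimum.
    behavior = 0.0
    for _ in range(steps):
        behavior += lr * (genome["proxy_target"] - behavior)
    return behavior

def evolve(generations=50, pop_size=20):
    population = [{"proxy_target": random.uniform(0.0, 20.0)} for _ in range(pop_size)]
    for _ in range(generations):
        # Select on outer fitness of the *learned* behavior, not on the proxy itself.
        scored = sorted(population,
                        key=lambda g: outer_fitness(inner_learning(g)),
                        reverse=True)
        survivors = scored[: pop_size // 2]
        # Refill the population with mutated copies of the survivors.
        population = survivors + [
            {"proxy_target": g["proxy_target"] + random.gauss(0.0, 0.5)}
            for g in survivors
        ]
    return population

if __name__ == "__main__":
    population = evolve()
    best = max(population, key=lambda g: outer_fitness(inner_learning(g)))
    print("best proxy target:", round(best["proxy_target"], 2))
    print("outer fitness of its learned behavior:",
          round(outer_fitness(inner_learning(best)), 2))
```

Nothing in the inner loop references the outer fitness; the proxy target only ends up near the outer optimum because selection happened to push it there. That is the sense in which a two-stage process can yield behavior that is "aligned enough" at the population level without any agreement between the inner and outer objectives, and also the sense in which that agreement can come apart if the inner optimizers gain capabilities the selection history never tested.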