I notice a sense of what feels like a deep confusion in the above.
It acts as though the “outer” alignment of humans is “wanting to have kids,” or something? And I am pretty confident this is not the right term in the analogy.
“Wanting to have kids” is another inner optimizer. It’s another set of things that were cobbled together by evolution and survived selection pressure because they had good fitness on the outer goal of propagating the species. It’s the same type of thing as inventing condoms, it’s just a little less obviously askew.
It’s not even all that great, given that it often expresses itself much more in caring a lot about one or two kids, and trying really hard to arrange a good life for those one or two kids, when the strategy of “make fifteen and eight will survive” does much better on the actual “”“goal””” of evolution.
It acts as though the “outer” alignment of humans is “wanting to have kids,” or something
That seems correct to me; to the extent that we can talk about evolution optimizing a species for something, I think it makes the most sense to talk about it as optimizing for the specific traits under selection. When the air in Manchester got more polluted and a dark color started conferring an advantage in hiding, dark-colored moths became more common; this obviously isn’t an instance of an inner optimizer within the moth’s brain since there’s no behavioral element involved, it’s just a physical change to the moth’s color. It’s just evolution directly selecting for a trait that had become useful in Manchester.
Likewise, if “wanting to have kids” is useful for having more surviving descendants, then “wanting to have more kids” becomes a trait that evolution is selecting for, analogously to evolution selecting for dark color in moths. There is an inner optimizer that is executing the trait that has been selected for, but it’s one that’s aligned with the original selection target.
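To make the moth example concrete, here is a minimal sketch of direct selection on a trait. It is not from the thread: it assumes a hypothetical population with just two color morphs, where "fitness" is simply relative survival, and the function names and numbers (1% dark moths to start, a 30% survival edge for dark moths) are made up for illustration.

```python
from typing import List


def next_frequency(p_dark: float, w_dark: float, w_light: float) -> float:
    """Frequency of the dark morph after one generation of selection.

    w_dark and w_light are relative survival rates (hypothetical values).
    """
    mean_fitness = p_dark * w_dark + (1.0 - p_dark) * w_light
    return p_dark * w_dark / mean_fitness


def simulate(p0: float, w_dark: float, w_light: float, generations: int) -> List[float]:
    """Track the dark-morph frequency over a number of generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], w_dark, w_light))
    return freqs


# Assumed numbers: dark moths start rare and survive 30% better in sooty air.
trajectory = simulate(p0=0.01, w_dark=1.3, w_light=1.0, generations=50)
print(f"start: {trajectory[0]:.3f}, end: {trajectory[-1]:.3f}")
```

Nothing in this loop models a brain or a behavior. The trait frequency just tracks whatever currently survives better, which is the sense in which the selection is "direct".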
The descriptions of the wanting in the OP seem to be about the inner optimizer doing the execution, though. That’s the distinction I want to make—confusing that for not-an-inner-optimizer seems importantly bad.
I agree with you on what is the inner optimiser. I might not have been able to make myself super clear in the OP, but I see the “outer” alignment as some version of “propagate our genes”, and I find it curious that that outer goal produced a very robust “want to have kids” inner alignment. I did also try to make the point that the alignment isn’t maximal in some way, as in yeah, we don’t have 16 kids, and men don’t donate to sperm banks as much as possible or do other things that might maximise gene propagation, but even that I find interesting: we fulfill evolution’s “outer goal” somewhat, without going into paperclip-maximiser-style “propagate genes at all cost”. This seems to me like something we would want out of an AGI.
Does it, though?
Especially in the more distant past, making fifteen kids of which eight survive probably often resulted in eight half-starved, unskilled, and probably more diseased offspring that didn’t find viable mates, and extinguished the branch. I don’t think humans are well suited to be nearer the r-strategy end of the spectrum, despite still having some propensity to do so.
In modern times it appears much more viable, and there are some cases of humans who desired and had hundreds of children, so the “inner optimizers” obviously aren’t preventing it. Would a strong desire for everyone to have dozens or hundreds of children in times of plenty benefit their survival, or not? I don’t think this is a straightforward question to answer, and therefore it’s not clear how closely our inner optimizer goals match the outer.
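As a rough illustration of why the “make fifteen and eight will survive” comparison is not straightforward, here is a back-of-the-envelope sketch. Every number except the 15 births and 8 survivors is an assumption invented for the example (the survival and mating odds of the well-provisioned children, and the range of mating odds tried for the under-provisioned ones), and the function names are hypothetical.

```python
def reproducing_offspring(births: int, p_survive: float, p_mate: float) -> float:
    """Expected number of children who survive AND go on to reproduce."""
    return births * p_survive * p_mate


# Assumed: two heavily invested children, each with a 90% chance of surviving
# and a 90% chance of finding a mate.
FEW = reproducing_offspring(births=2, p_survive=0.9, p_mate=0.9)


def many(p_mate: float) -> float:
    """Fifteen births with eight survivors, whose mating odds are left open."""
    return reproducing_offspring(births=15, p_survive=8 / 15, p_mate=p_mate)


# Mating probability at which the two strategies tie.
break_even = FEW / (15 * (8 / 15))

for p_mate in (0.1, 0.2, 0.5):
    print(f"p_mate={p_mate}: many={many(p_mate):.2f} vs few={FEW:.2f}")
print(f"break-even p_mate: {break_even:.4f}")
```

With these made-up inputs the break-even mating probability is about 0.2, so the ranking of the two strategies flips depending on how badly the eight survivors fare. That is all the sketch is meant to show; it says nothing about what the real numbers were.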
My main point here is that “wanting kids” is inner, not outer (as is basically any higher brain function that’s going to be based on things like explicit models of the future).
For a different take, think about the promiscuous male strategy, which often in the ancestral environment had very very little to do with wanting kids.
I absolutely agree that “wanting kids” is inner not outer, as is “not wanting kids” or “liking sex”. The question was how well they are aligned with the outer optimizer’s goals along the lines of “have your heritable traits survive for as long as possible”.
I somewhat agree with the original post that the inner goals are actually not as misaligned with the outer goals as they might superficially seem. Even inventing birth control so as to have more non-productive sex without having to take care of a lot of children can be more beneficial for the outer goal than not inventing or using birth control.
The biggest flaw with the evolution=outer, culture/thoughts=inner analogy in general, though, is that the time and scope scales of evolution’s outer optimization are drastically larger than those of any inner optimizers we might have. When we’re considering AGI inner/outer misalignment, they won’t be anywhere near so different.