This, I hope, is by now recognizable to individuals of interest as an overly abstract description of what happened with humans, who one day started building Moon rockets without seeming to care very much about calculating and maximizing their personal inclusive genetic fitness while doing that. Their capabilities generalized much further out of the ancestral training distribution, than the empirical alignment of those capabilities on inclusive genetic fitness in the ancestral training distribution.
This example doesn’t really work for me. Most humans don’t build moon rockets. So for those humans the example doesn’t tell us much about their alignment. Meanwhile, humans who do build moon rockets gain money and status from it. These are convergent instrumental goals for inclusive genetic fitness. The only rocket scientist whose name I recall is Wernher von Braun, who “was known as a ladies’ man”, and had four children with two women.
I agree that humans don’t seem to care very much about inclusive genetic fitness while building rockets. But we also don’t seem to care very much about inclusive genetic fitness while foraging for berries. Instead we seem to be a giant mass of inscrutable neurons. I don’t think this is evidence in any direction; it’s what I’d expect given that evolution isn’t training for transparency, introspection, honesty, etc., except where they improve inclusive genetic fitness.
Separately, most humans can’t build moon rockets, and we aren’t very good at it as a species. We seem to be better at foraging for berries. For example, compare the headlines “This 3D-Printed Rocket Engine Was Made With AI” and “The Elusive Hunt for a Robot That Can Pick a Ripe Strawberry”.
I would like to live in a world where human capabilities had generalized much better than our alignment with evolution. I think it would look different to this one.