In the same way that we have an existence proof of AGI (humans existing), we also have a highly suggestive example of something that looks a lot like alignment (humans existing and often choosing not to do heroin).
That’s not an example of alignment; it’s an example of sub-agent stability, which is assumed to hold, due to instrumental convergence, in any sufficiently powerful AI system, aligned or unaligned.
If anything, humanity is an excellent example of alignment failure, considering that we have discovered the true utility function of our creator and decided to ignore it anyway, siding instead with proxy values such as love, empathy, curiosity, etc.
Our creator doesn’t have a utility function in any meaningful sense of the term. Genes that best promote survival and reproduction propagate through the population, but the process is competitive. Evolution doesn’t have goals, and in fact from the standpoint of individual genes (the level at which selection actually operates) it is entirely a zero-sum game.
Or we are waiting to be outbred by those who didn’t. A few centuries ago, the vast majority of people were herders or farmers who had as many kids as they could feed. Their actions were aligned with maximizing their inclusive genetic fitness. We are the exception, not the rule.
When I look at the world today, it really doesn’t seem like a ship steered by evolution. (Instead it is a ship steered by no one, chaotically drifting.) If there is economic and technological stagnation for ten thousand years, then maybe evolution will get back in the driver’s seat and continue the long, slow process of aligning humans… but I think that’s very much not the most probable outcome.