In the same way that we have an existence proof of AGI (humans existing), we also have a highly suggestive example of something that looks a lot like alignment (humans existing and often choosing not to do heroin).
That’s not an example of alignment; it’s an example of sub-agent stability, which is assumed to hold, due to instrumental convergence, in any sufficiently powerful AI system, aligned or unaligned.
If anything, humanity is an excellent example of alignment failure, considering that we have discovered the true utility function of our creator and decided to ignore it anyway, siding instead with proxy values such as love, empathy, curiosity, etc.
Our creator doesn’t have a utility function in any meaningful sense of the term. Genes that best promote survival and reproduction propagate through the population, but the process is competitive. Evolution doesn’t have goals, and in fact from the standpoint of individual genes (the level at which selection actually operates) it is entirely a zero-sum game.
Or we are waiting to be outbred by those who didn’t. A few centuries ago, the vast majority of people were herders or farmers who had as many kids as they could feed. Their actions were aligned with maximizing their inclusive genetic fitness. We are the exception, not the rule.
When I look at the world today, it really doesn’t seem like a ship steered by evolution. (Instead it is a ship steered by no one, chaotically drifting.) If there is economic and technological stagnation for ten thousand years, then maybe evolution will get back in the driver’s seat and continue the long, slow process of aligning humans… but I think that’s very much not the most probable outcome.