[Question] How can we prevent AGI value drift?

I am grateful to Noosphere89 and Seth Herd for prompting me to start this discussion.

In human evolution, the fundamental "goal" was to survive and reproduce, passing our genes on to the next generation. But somewhere along the way, humans developed complex societies and technologies that go far beyond survival and reproduction. Now we often find ourselves pursuing goals that aren't aligned with those basic evolutionary drives. In fact, we've arguably become worse at the evolutionary goal than our ancestors were (birth rates are at an all-time low).

Even if we manage to align our first AGI with human goals, how can we ensure that 1) it doesn't drift from those goals, and 2) it doesn't create a successor AGI that drifts from those goals (much as we have drifted from the evolutionary goals that shaped our ancestors)? What are the current proposals for solving these issues?
