The part where alignment is hard is precisely when the thing I’m trying to accomplish is hard. Because then I need a powerful plan, and it’s hard to specify a search for powerful plans that don’t kill everyone.
I now read you as pointing to chess as:
1. It is “hard to accomplish” from the perspective of human cognition.
2. It does not require a “powerful”/“agentic” plan.
3. It’s “easy” to specify a search for a good plan; we already did it.
Yepp. And clearly alignment is much harder than chess, but it seems like an open question whether it’s harder than “kill everyone” (and even if it is, there’s an open question of how much of an advantage we get from doing our best to point the system at the former not the latter).
“Kill everyone” seems like it should be “easy”, because there are so many ways to do it: humans only survive in environments with a specific range of temperatures, pressures, atmospheric contents, availability of human-digestible food, &c.
That helps, thanks. Raemon says:
> I now read you as pointing to chess as:
>
> 1. It is “hard to accomplish” from the perspective of human cognition.
> 2. It does not require a “powerful”/“agentic” plan.
> 3. It’s “easy” to specify a search for a good plan; we already did it.
>
> So maybe alignment is like that.

And the reply:

> Yepp. And clearly alignment is much harder than chess, but it seems like an open question whether it’s harder than “kill everyone” (and even if it is, there’s an open question of how much of an advantage we get from doing our best to point the system at the former not the latter).

> “Kill everyone” seems like it should be “easy”, because there are so many ways to do it: humans only survive in environments with a specific range of temperatures, pressures, atmospheric contents, availability of human-digestible food, &c.
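For concreteness on the “easy to specify a search for a good plan” point: in chess the entire value specification is a terminal win/lose/draw test, and a generic tree search does the rest. Here is a minimal negamax sketch in that shape, with a hypothetical `GameState`/`Move` interface (nothing from the thread itself); real engines add evaluation heuristics, alpha-beta pruning, opening books, and so on, but none of that changes where the objective lives.

```python
from typing import Any, Iterable, Optional, Protocol

Move = Any  # whatever encodes a single action (e.g. one chess move)


class GameState(Protocol):
    """Hypothetical interface: any two-player, perfect-information game fits it."""

    def legal_moves(self) -> Iterable[Move]: ...
    def apply(self, move: Move) -> "GameState": ...
    def is_terminal(self) -> bool: ...
    def score(self) -> int:
        """+1 if the side to move has won, -1 if it has lost, 0 otherwise.

        This one method is the entire value specification for the search.
        (A real engine swaps the non-terminal 0 for a heuristic evaluation.)
        """
        ...


def negamax(state: GameState, depth: int) -> int:
    """Best score the side to move can force, searching `depth` plies deep."""
    if depth == 0 or state.is_terminal():
        return state.score()
    # Generic search: try every legal move, assume the opponent replies optimally.
    return max(-negamax(state.apply(m), depth - 1) for m in state.legal_moves())


def best_move(state: GameState, depth: int) -> Optional[Move]:
    """The 'plan' the search outputs: the move with the best-looking subtree."""
    return max(
        state.legal_moves(),
        key=lambda m: -negamax(state.apply(m), depth - 1),
        default=None,
    )
```

The disanalogy the thread is gesturing at: nobody knows how to write the corresponding `score()` for “powerful plans that don’t kill everyone”, which is exactly the “hard to specify a search” problem raised at the top of this exchange.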