There are two problems here:
Problem #1: Align limited task AGI to do some minimal act that ensures no one else can destroy the world with AGI.
Problem #2: Solve the full problem of using AGI to help us achieve an awesome future.
Problem #1 is the one I was talking about in the OP, and I think of it as the problem we need to solve on a deadline. Problem #2 is also indispensable (and a lot more philosophically fraught), but it’s something humanity can solve at its leisure once we’ve solved #1 and therefore aren’t at immediate risk of destroying ourselves.
MIRI laid out a strategic picture along these lines in their 2017 fundraiser post, which I found very insightful; they call this window, before Problem #1 is solved, the "acute risk period".