Thanks, your numbered list was very helpful in encouraging me to go through the claims. Just two things that stood out to me:
#39:
“Nothing we can do with a safe-by-default AI like GPT-3 would be powerful enough to save the world (to ‘commit a pivotal act’), although it might be fun. In order to use an AI to save the world it needs to be powerful enough that you need to trust its alignment, which doesn’t solve your problem.”
What exactly makes people sure that something like GPT would be safe/unsafe?
If what is needed is some form of insight/breakthrough:
A smarter version of GPT-3 seems really useful? The idea that GPT-3 writes better poetry than I do while GPT-5 could help come up with better alignment ideas doesn’t strongly conflict with my current view of the world?
Worth noting that the more precise version of #12 is substantially more optimistic than #12 as stated explicitly here.
#12:
“An aligned advanced AI created by a responsible project that is hurrying where it can, but still being careful enough to maintain a success probability greater than 25%, will take the lesser of (50% longer, 2 years longer) than would an unaligned unlimited superintelligence produced by cutting all possible corners.”
This might come across as optimistic if it were your median alignment difficulty estimate, but instead Eliezer is putting 95% on this, which on the flip side suggests a 5% chance that things turn out to be easier. This seems rather in line with “Carefully aligning an AGI would at best be slow and difficult, requiring years of work, even if we did know how.”