Time for some predictions. If this actually stems from an AI developing social-manipulation superpowers, I would expect:
- We never find out any real, reasonable-sounding explanation for Altman’s firing.
- OpenAI does not revert to how it was before.
- More instances of people near OpenAI’s safety people doing bizarre, unexpected things with strange outcomes.
- Possibly one of the following:
  - Some extreme “scissors statements” pop up which divide AI groups into factions that hate each other to an unreasonable degree.
  - An OpenAI person who directly interacted with some scary AI suddenly either commits suicide or becomes a vocal flat-earther (or similar) who is weirdly convincing to many people.
  - An OpenAI person skyrockets to political power, suddenly finding themselves in possession of narratives and phrases that convince millions to follow them.
(Again, I don’t think it’s that likely, but I do think it’s possible.)
Things might be even weirder than that if this is a narrowly superhuman AI that is specifically superhuman at social manipulation but still has the same inability to form new gears-level models that current LLMs exhibit (e.g., if its developers figured out how to do effective self-play on the persuasion task but didn’t actually crack AGI).