The Drama-Bomb hypothesis
Not even a month ago, Sam Altman predicted that we would live in a strange world where AIs are super-human at persuasion but still not particularly intelligent.
https://twitter.com/sama/status/1716972815960961174
What would it look like when an AGI lab developed such an AI? People testing or playing with the AI might find themselves persuaded of semi-random things, or if sycophantic behavior persists, have their existing feelings and beliefs magnified into zealotry. However, this would (at this stage) not be done in a coordinated way, nor with a strategic goal in mind on the AI’s part. The result would likely be chaotic, dramatic, and hard to explain.
Small differences of opinion might suddenly be magnified into seemingly insurmountable chasms, inspiring urgent and dramatic actions, actions that would be hard to explain even to oneself later.
I don’t think this is what happened [<1%] but I found it interesting and amusing to think about. This might even be a relatively better-off world, with frontier AGI orgs regularly getting mired in explosive and confusing drama, thus inhibiting research and motivating tougher regulation.
This could be largely addressed by first promoting a persuasion AI that does something similar to what Scott Alexander often does: convince the reader of A, then of Not A, to teach them how difficult it actually is to process the evidence and evaluate an argument, and to be less trusting of their impulses.
As Penn and Teller demonstrate the profanity of magic to inoculate their audiences against illusion, we must create a persuasion AI that demonstrates the profanity of rhetoric to inoculate the reader against any persuasionist AI they may meet later on.