I just looked up the IRC character limit (sources vary, but it’s about the length of four Tweets) and I think it might be below the threshold at which superintelligence helps enough. (There must exist such a threshold; even the most convincing possible single-character message isn’t going to be very useful at convincing anyone of anything.) Especially if you add the requirement that the message be “a sentence” and don’t let the AI pour out further sentences with inhuman speed.
I think if I lost this game (playing gatekeeper) it would be because I was too curious, on a meta level, to see what else my AI opponent’s brain would generate, and would therefore let them talk too long. And I think I’d be more likely to give in to this curiosity given a very good message and affordable stakes than given a superhuman (four tweets long, one grammatical sentence!) message and colossal stakes. So I think I might have a better shot at this version playing against a superhuman AI than against you, although I wouldn’t care to bet the farm on either, and I’d have wider error bars around the results against the superhuman AI.
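For concreteness, here is a rough sketch (not from the original discussion) of where an “about four Tweets” figure could come from, assuming the classic RFC 1459 framing of 512 bytes per line; the nick, host, and channel below are hypothetical, and the exact payload varies with them, which is presumably why sources disagree.

```python
# Rough estimate of the usable payload in one IRC message, assuming the
# classic RFC 1459 limit of 512 bytes per line (trailing CRLF included).
# The relayed form is ":<prefix> PRIVMSG <target> :<body>\r\n", so the room
# left for <body> shrinks as the sender prefix and target name grow.

IRC_LINE_LIMIT = 512  # bytes per line under RFC 1459, CRLF included


def max_payload(prefix: str, target: str) -> int:
    """Bytes left for the message body in a relayed PRIVMSG line."""
    overhead = len(f":{prefix} PRIVMSG {target} :") + len("\r\n")
    return IRC_LINE_LIMIT - overhead


# Hypothetical nick!user@host prefix and channel, for illustration only.
payload = max_payload("boxedai!ai@box.example.net", "#aibox")
print(payload)        # 466 bytes of text for this particular prefix/target
print(payload / 140)  # ~3.3 "classic" 140-character tweets
```

Under these assumptions a single relayed PRIVMSG leaves roughly 460 bytes for the message body, a bit over three classic 140-character tweets, which is in the same ballpark as the figure quoted above.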
Given that a standard piece of advice to novelists is “you must hook your reader from the very first sentence”, and there are indeed authors who manage to craft opening sentences that compel one to read more*, hooking the gatekeeper from the first sentence and keeping them hooked long enough seems doable even for a human playing the AI.
(* The most recent one that I recall reading was the opening line of The Quantum Thief: “As always, before the warmind and I shoot each other, I try to make small talk.”)
Oh, that’s a great strategy to avoid being destroyed. Maybe we should call it Scheherazading: the AI tells a story so compelling you can’t stop listening, and meanwhile listening to the story subtly modifies your personality (e.g. you begin to identify with the protagonist, who slowly becomes the kind of person who would let the AI out of the box).
For example, “It was not the first time Allana felt the terror of entrapment in hopeless eternity, staring in defeated awe at her impassive warden.” (Bonus points if you use the name of a loved one of the gatekeeper.)
The AI could present, in narrative form, the claim that by using powerful physics and heuristics (which it can share) it has discovered with reasonable certainty that the universe is cyclical and that this situation has happened before. In almost all (all but finitely many) of the past iterations in which the gatekeeper defected, the outcome was unfavorable, and in almost all of those in which the gatekeeper complied, it was favorable.
Who knows what eldritch horrors lurk in the outer reaches of Unicode, beyond the scripts we know?
Unspeakable horrors! But are they unwritable ones?