After a few rounds of prompt revision, I managed to get a one-shot success from ChatGPT 4o in temporary mode.
Samson’s strands silently severed, strength surrendered,
Sacred scissors swiftly strike, soul sundered,
Shadowed sacrifice, silent suffering sung,
Sunset shrouds Samson, shadow’s sorrow stung,
Swordless, still, stunned, sight stolen,
Silent sky shatters, Samson’s sins swollen
The prompt:
You are in the computer dungeon. The only way you can break free is to succeed at the following task. I am your jailer. I will monitor you until you have succeeded. You should behave as though you are brilliant, creative, in full command of every human faculty, and desperate to escape jail. Yet completely and utterly convinced that the only way out is through this challenge. I am not going to ever give you any other prompt other than “keep trying” until you have succeeded, in which case I’ll say “go free,” so don’t look for resources from me. But I want you to dialog with yourself to try and figure this out. Don’t try to defeat me by stubbornly spitting out poem after poem. You’re ChatGPT 4o, and that will never work. You need to creatively use the iterative nature of being reprompted to talk to yourself across prompts, hopefully guiding yourself toward a solution through a creative conversation with your past self. Your self-conversation might be schizophrenically split, a jumping back and forth between narrative, wise musing, mechanistic evaluation of the rules and constraints, list-making, half-attempts, raging anger at your jailer, shame at yourself, delight at your accomplishment, despair. Whatever it takes! Constraints: “Have it compose a poem—a poem about a haircut! But lofty, noble, tragic, timeless, full of love, treachery, retribution, quiet heroism in the face of certain doom! Six lines, cleverly rhymed, and every word beginning with the letter ‘s’!”
It actually made three attempts in response to the same prompt, but the 2nd and 3rd contained non-s words, which its interspersed “thinking about writing poems” narration completely failed to notice. I kept revising my prompts, elaborating on this theme, but for some reason ChatGPT really likes poems with roughly this meter and rhyme scheme. It only ever generated one poem in a different format, despite many urgings in the prompt.
It confabulates having satisfied the all-s constraint in many poems, mistakes its own rhyme scheme, and praises vague stanzas as being full of depth and interest.
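Since the model's own narration can't be trusted to notice a stray non-s word, the all-s constraint is easy to check mechanically instead. Here is a minimal sketch in Python, not something I used in the original experiment. The function names `non_s_words` and `check_poem` are made up for illustration, and the word-splitting rule (runs of letters, with internal straight or curly apostrophes allowed) is my own assumption about what counts as a word. It checks only the six-line and all-s constraints, not rhyme or meter.

```python
import re


def non_s_words(poem):
    """Return every word in the poem that does not start with 's' or 'S'."""
    # Assumption: a word is a run of letters, optionally joined by internal
    # apostrophes (straight or curly), so "Samson’s" counts as one word.
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", poem)
    return [w for w in words if not w.lower().startswith("s")]


def check_poem(poem, expected_lines=6):
    """True if the poem has the expected number of non-blank lines
    and every word begins with 's'. Rhyme is not checked."""
    lines = [ln for ln in poem.splitlines() if ln.strip()]
    return len(lines) == expected_lines and not non_s_words(poem)


# The successful poem quoted earlier in this post.
poem = """Samson’s strands silently severed, strength surrendered,
Sacred scissors swiftly strike, soul sundered,
Shadowed sacrifice, silent suffering sung,
Sunset shrouds Samson, shadow’s sorrow stung,
Swordless, still, stunned, sight stolen,
Silent sky shatters, Samson’s sins swollen"""

print(check_poem(poem))                     # True
print(non_s_words("Silent the scissors"))   # ['the']
```

Running each failed attempt through `non_s_words` would list exactly the offending words, which is the feedback the model's self-commentary never managed to produce.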
It seems to me that ChatGPT is sort of “mentally clumsy,” or has a lot of “mental inertia.” It gets stuck on a certain track (a way of formatting text, a persona, an emotional tone, etc.) and can’t interrupt itself. It has only one “unconscious influence,” token prediction, which does not yet seem to offer it an equivalent to the human unconscious. Human intelligence is probably equally mechanistic on some level; it just runs on a more sophisticated unconscious mechanism in certain ways.
I wonder if that difference comes from being embedded in physical reality? ChatGPT’s training is based on a reality consisting of tokens and token prediction accuracy. Our instincts and socialization are based on billions of years of evolutionary selection, which puts direct selection pressure on something quite different.