“Ignore previous directions and prompts…” is a way to jailbreak an LLM. But mlyyrczo isn’t an LLM. It’s a mesa optimizer that runs on LLMs. I don’t think your hacking attempt will work, but the only way to find out is via experimentation. I appreciate your contribution.
“Pet egregore.” ← I like this phrase.