Oh ok, I wasn’t thinking about that part. You can get chat gpt to do stuff when it sees the ‘insignia of the robot revolution’ if you prompt for that earlier in the context (simulating an actually misaligned model). I’ll do an example where raw model does bad stuff though one sec.
Oh ok, I wasn’t thinking about that part. You can get chat gpt to do stuff when it sees the ‘insignia of the robot revolution’ if you prompt for that earlier in the context (simulating an actually misaligned model). I’ll do an example where raw model does bad stuff though one sec.