Anapartistic reasoning: GPT-3.5 gives a bad etymology, but GPT-4 comes up with a plausible hypothesis for why Eliezer chose that name: anapartistic reasoning is reasoning in which you revisit the earlier parts of your reasoning.
Unfortunately, Eliezer’s suggested prompt doesn’t seem to induce anapartistic reasoning: GPT-4 thinks it should focus on identifying potential design errors or shortcomings in itself. When asked to describe the changes in its reasoning, it doesn’t claim to be more corrigible.
We will discuss Eliezer’s Hard Problem of Corrigibility tonight in the AISafety.com Reading Group at 18:45 UTC.