I agree that (1) is an important consideration for AI going forward, but I don’t think it really applies until the AI has a definite goal. AFAICT the goal in developing systems like GPT is mostly ‘to see what they can do’.
I don’t fault anybody for GPT completing anachronistic counterfactuals—they’re fun and interesting. It’s a feature, not a bug. You could equally call it an alignment failure if GPT-4 started being a wet blanket and gave completions like
Prompt: “In response to the Pearl Harbor attacks, Otto von Bismarck said”
Completion: “nothing, because he was dead.”
In contrast, a system like IBM Watson has a goal of producing correct answers, making it unambiguous what the aligned answer would be.
To be clear, I think the contest still works—I just think the ‘surprisingness’ condition hides a lot of complexity wrt what we expect in the first place.