One of Eliezer’s essays in The Sequences is called “Shut Up and Do the Impossible”.
I’m confident Eliezer would agree with you that if you can find a way to do something easier instead, you should absolutely do that. But he also argues that there is no guarantee that something easier exists; the universe isn’t constrained to only placing fair demands on you.
My point isn’t that the easier option always exists, or even that a problem can’t be impossible.
My point is that if you are facing a problem that requires 1-shot complete plans, and there’s no second try, you need to do something else.
There is a line where a problem becomes too difficult to productively work on, and that constraint is a great sign of an impossible problem (if it exists).
The maximum difficulty that is worth attempting depends on the stakes.
AND on your accurate assessment of the difficulty. The overconfidence displayed in this mini-experiment seems to result in part from people massively misestimating the difficulty of this relatively simple problem. That’s why it’s so concerning with respect to alignment.
Do things like major surgery or bomb defusal have those kinds of constraints?
Not really. They are definitely more few-shot than other areas, but thankfully getting one thing wrong isn’t usually an immediate game-ender (though it is still to be avoided, and importantly, this is why these two areas are harder than many other fields).
Ah, well said. I understand the rest of your comments better now. And I thoroughly agree, with a caveat about the complexity of the problem and the amount of thought and teamwork applied (e.g., I expect that a large team working for a month in effective collaboration would’ve solved the problem in this experiment, but alignment is probably much more difficult than that).