(For object-level responses, see comments on parallel threads.)
I want to push back on an implicit framing in lines like:
there’s some value to more people thinking thru / shooting down their own edge cases [...], instead of pushing the work to Eliezer.
people aren’t updating on the meta-level point and continue to attempt ‘rolling their own crypto’, asking if Eliezer can poke the hole in this new procedure
This makes it sound like the rest of us don’t try to break our proposals, push the work to Eliezer, agree with Eliezer when he finds a problem, and then fail to update toward the possibility that future proposals will also have problems.
Whereas in reality, I try to break my proposals, don’t agree with Eliezer’s diagnoses of the problems, and usually don’t ask Eliezer because I don’t expect his answer to be useful to me (and previously didn’t expect him to respond). I expect this is true of others (like Paul and Richard) as well.
Yeah, sorry about not owning that more, and for the frame being muddled. I don’t endorse the “asking Eliezer” or “agreeing with Eliezer” bits, but I do basically think he’s right about many object-level problems he identifies (and thus people disagreeing with him about that is not a feature) and think ‘security mindset’ is the right orientation to have towards AGI alignment. That hypothesis is a ‘worry’ primarily because asymmetric costs mean it’s more worth investigating than the raw probability would suggest. [Tho the raw probabilities of its components do feel pretty substantial to me.]
[EDIT: I should say I think ARC’s approach to ELK seems like a great example of “people breaking their own proposals”. As additional data to update on, I’d be interested in seeing, like, a graph of people’s optimism about ELK over time, or something similar.]