Yeah, I’m also interested in the question of “how do we distinguish ‘sentences-on-mainline’ from ‘shoring-up-edge-cases’?”, or which conversational moves most develop shared knowledge, or something similar.
Like I think it’s often good to point out edge cases, especially when you’re trying to formalize an argument or look for designs that get us out of this trap. In another comment in this thread, I note that there’s a thing Eliezer said that I think is very important and accurate, and also think there’s an edge case that’s not obviously handled correctly.
But also my sense is that there’s some deep benefit from “having mainlines” and conversations that are mostly ‘sentences-on-mainline’? Or, like, there’s some value to more people thinking thru / shooting down their own edge cases (like I do in the comment mentioned above), instead of pushing the work to Eliezer. I’m pretty worried about a situation where there are deeply general reasons to expect AI alignment to be extremely difficult, people aren’t updating on the meta-level point and continue to attempt ‘rolling their own crypto’, asking if Eliezer can poke the hole in this new procedure, and, if Eliezer ever decides to just write serial online fiction until the world explodes, humanity hasn’t developed enough capacity to replace him.
(For object-level responses, see comments on parallel threads.)
I want to push back on an implicit framing in lines like:
there’s some value to more people thinking thru / shooting down their own edge cases [...], instead of pushing the work to Eliezer.
people aren’t updating on the meta-level point and continue to attempt ‘rolling their own crypto’, asking if Eliezer can poke the hole in this new procedure
This makes it sound like the rest of us don’t try to break our own proposals, instead push that work onto Eliezer, agree with Eliezer when he finds a problem, and then fail to update that future proposals will probably have problems too.
Whereas in reality, I try to break my proposals, don’t agree with Eliezer’s diagnoses of the problems, and usually don’t ask Eliezer because I don’t expect his answer to be useful to me (and previously didn’t expect him to respond). I expect this is true of others (like Paul and Richard) as well.
Yeah, sorry about not owning that more, and for the frame being muddled. I don’t endorse the “asking Eliezer” or “agreeing with Eliezer” bits, but I do basically think he’s right about many of the object-level problems he identifies (and thus people disagreeing with him about that is not a feature), and I think ‘security mindset’ is the right orientation to have towards AGI alignment. That hypothesis is a ‘worry’ primarily because asymmetric costs mean it’s more worth investigating than the raw probability would suggest. [Tho the raw probabilities of its components do feel pretty substantial to me.]
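To gesture at the asymmetric-costs point more concretely, here is a minimal expected-cost sketch; the symbols (p, C_cat, C_inv) and the framing are my own illustration, not quantities from the discussion:

% Illustrative only: p, C_cat, and C_inv are hypothetical labels.
% p     = subjective probability that alignment is extremely hard
% C_cat = cost incurred if we proceed as though it's easy and it turns out hard
% C_inv = cost of the extra investigation / security-mindset scrutiny
\[
  p \cdot C_{\mathrm{cat}} \;\gg\; C_{\mathrm{inv}}
  \quad \text{can hold even for modest } p, \text{ whenever } C_{\mathrm{cat}} \gg C_{\mathrm{inv}},
\]

so expected-cost reasoning favors investigating the hypothesis beyond what the raw probability alone would suggest.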
[EDIT: I should say I think ARC’s approach to ELK seems like a great example of “people breaking their own proposals”. As additional data to update on, I’d be interested in seeing, like, a graph of people’s optimism about ELK over time, or something similar.]
But also my sense is that there’s some deep benefit from “having mainlines” and conversations that are mostly ‘sentences-on-mainline’?
I agree with this. Or, if you feel ~evenly split between two options, have two mainlines and focus a bunch on those (including picking at cruxes and revising your mainline view over time).
But:
Like, it feels to me like Eliezer was generating sentences on his mainline, and Richard was responding with ‘since you’re being overly pessimistic, I will be overly optimistic to balance’, with no attempt to have his response match his own mainline.
I do note that there are some situations where rushing to tell a ‘mainline story’ might be the wrong move:
Maybe your beliefs feel wildly unstable day-to-day—because you’re learning a lot quickly, or because it’s just hard to know how to assign weight to the dozens of different considerations that bear on these questions. Then trying to take a quick snapshot of your current view might feel beside the point.
It might even feel actively counterproductive, like rushing too quickly to impose meaning/structure on data when step one is to make sure you have the data properly loaded up in your head.
Maybe there are many scenarios that seem similarly likely to you. If you see ten very different ways things could go, each with ~10% subjective probability, then picking a ‘mainline’ may be hard, and may require a bunch of arbitrary-feeling choices about which similarities-between-scenarios you choose to pay attention to.