After reading these two Eliezer <> Paul discussions, I realize I’m confused about what the importance of their disagreement is.
It’s very clear to me why Richard & Eliezer’s disagreement is important. Alignment being extremely hard suggests AI companies should work a lot harder to avoid accidentally destroying the world, and suggests alignment researchers should be wary of easy-seeming alignment approaches.
But it seems like Paul & Eliezer basically agree about all of that. They disagree about… what the world looks like shortly before the end? Which, sure, does have some strategic implications. You might be able to make a ton of money by betting on AI companies and thus have a lot of power in the few years before the world drastically changes. That does seem important, but it doesn’t seem nearly as important as the difficulty of alignment.
I wonder if there are other things Paul & Eliezer disagree about that are more important. Or if I’m underrating the importance of the ways they disagree here. Paul wants Eliezer to bet on things so Paul can have a chance to update to his view in the future if things end up being really different than he thinks. Okay, but what will he do differently in those worlds? Imo he’d just be doing the same things he’s trying now if Eliezer were right. And maybe there is something implicit in Paul’s “smooth line” forecasting beliefs that makes his prosaic alignment strategy more likely to work in worlds where he’s right, but I currently don’t see it.
I would frame the question more as ‘Is this question important for the entire chain of actions humanity needs to select in order to steer to good outcomes?’, rather than ‘Is there a specific thing Paul or Eliezer personally should do differently tomorrow if they update to the other’s view?’ (though the latter is an interesting question too).
Some implications of having a more Eliezer-ish view include:
In the Eliezer-world, humanity’s task is more foresight-loaded. You don’t get a long period of time in advance of AGI where the path to AGI is clear; nor do you get a long period of working with proto-AGI or weak AGI where you can safely learn all the relevant principles and meta-principles via trial and error. You need to see far more of the bullets coming in advance of the experiment, which means developing more of the technical knowledge to exercise that kind of foresight, and also developing more of the base skills of thinking well about AGI even where our technical models and our data are both thin.
My Paul-model says: ‘Humans are just really bad at foresight, and it seems like AI just isn’t very amenable to understanding; so we’re forced to rely mostly on surface trends and empirical feedback loops. Fortunately, AGI itself is pretty simple and obvious (just keep scaling stuff similar to GPT-3 and you’re done), and progress is likely to be relatively slow and gradual, so surface trends will be a great guide and empirical feedback loops will be abundant.’
My Eliezer-model says: ‘AI foresight may be hard, but it seems overwhelmingly necessary; either we see the bullets coming in advance, or we die. So we need to try to master foresight, even though we can’t be sure of success in advance. In the end, this is a novel domain, and humanity hasn’t put much effort into developing good foresight here; it would be foolish to despair before we’ve made a proper attempt. We need to try to overcome important biases, think like reality, and become capable of good inside-view reasoning about AGI. We need to hone and refine our gut-level pattern-matching, as well as our explicit models of AGI, as well as the metacognition that helps us improve the former capacities.’
In the Eliezer-world, small actors matter more in expectation; there’s no guarantee that the largest and most well-established ML groups will get to AGI first. Governments in particular matter less in expectation.
In the Eliezer-world, single organizations matter more: there’s more potential for a single group to have a lead, and for other groups to be passive or oblivious. This means that you can get more bang for your buck by figuring out how to make a really excellent organization full of excellent people; and you get comparatively less bang for your buck from improving relations between organizations, between governments, etc.
The Eliezer-world is less adequate overall, and also has more capabilities (and alignment) secrets.
So, e.g., research closure matters more — both because more secrets exist, and because it’s less likely that there will be multiple independent discoveries of any given secret at around the same time.
Also, if the world is more adequate on your background view, you should be less worried about alignment (both out of deference to the ML mainstream, which is at least moderately less worried about alignment, and out of expectation that the ML mainstream will update and change course as needed).
Relatedly, in Eliezer-world you have to do more work to actively recruit the world’s clearest and best thinkers to help solve alignment. In Paul-world, you can rely more on future AI progress, warning shots, etc. to naturally grow the alignment field.
In the Eliezer-world, timelines are both shorter and less predictable. There’s more potential for AGI to be early-paradigm rather than late-paradigm; and even if it’s late-paradigm, it may be late into a paradigm that doesn’t look very much like GPT-3 or other circa-2021 systems.
In the Eliezer-world, there are many different paths to AGI, and it may be key to humanity’s survival that we pick a relatively good path years in advance, and deliberately steer toward more alignable approaches to AGI. In the Paul-world, there’s one path to AGI, and it’s big and obvious.
Thanks, this is helpful! I’d be very curious to see where Paul agrees / disagrees with this summary of his view and its implications.
(I’ll emphasize again, by the way, that this is a relative comparison of my model of Paul vs. Eliezer. If Paul and Eliezer’s views on some topic are pretty close in absolute terms, the above might misleadingly suggest more disagreement than there in fact is.)