I think this post doesn’t really explain why rats have high belief in doom, or why they’re wrong to do so. Perhaps ironically, there is a better version of this post on both counts which isn’t so focused on how rats get epistemology wrong and the social/meta-level consequences. A post which focuses on the object-level implications for AI of a theory of rationality which looks very different from the AIXI-flavoured rat-orthodox view.
I say this because those sorts of considerations convinced me that we’re much less likely to be buggered. I.e. I no longer believe EU maximization is/will be a good description by default of TAI or widely economically productive AGI, mildly superhuman AGI, or even ASI, depending on the details. This is partly due to a recognition that the arguments for EU maximization are weaker than I thought, that arguments for LDT being convergent are lacking, and that the notions of optimality we do have are very weak, and partly due to the existence and behaviour of GPT-4, Claude Opus, etc.
6 seems too general a claim to me. Why wouldn’t it work for 1% vs 10%, and likewise 0.1% vs 1%? I.e., why doesn’t this suggest that you should round P(doom) down to zero? Also, I don’t even know what you mean by “most” here. Like, are we quantifying over methods of reasoning used by current AI researchers right now? Over all time? Over all AI researchers and engineers? Over everyone in the West? Over everyone who’s ever lived? Etc.
And it seems to me like you’re implicitly privileging ways of combining these opinions that get you 10% instead of 1% or 90%, which is begging the question. Of course, you could reply that a P(doom) of 10% is confused, that it isn’t really your state of knowledge, that lumping all your sub-agents’ models into a single number is too lossy, etc. But then why mention that 90% is a much stronger prediction than 10%, instead of saying they’re roughly equally confused?
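To make the “ways of combining” point concrete, here is a minimal sketch with made-up sub-agent numbers (1%, 10%, 90%); nothing here comes from the post itself, it just shows how much the aggregate depends on which pooling rule you privilege:

```python
import numpy as np

# Hypothetical sub-agent estimates of P(doom); the numbers are purely illustrative.
p = np.array([0.01, 0.10, 0.90])
w = np.full(len(p), 1.0 / len(p))  # equal weight on each sub-agent

# Linear opinion pool: weighted arithmetic mean of the probabilities.
linear_pool = float(np.sum(w * p))            # ~0.34

# Logarithmic opinion pool: weighted geometric means of p and 1 - p, renormalized.
num = np.prod(p ** w)
den = np.prod((1.0 - p) ** w)
log_pool = float(num / (num + den))           # ~0.18

# Median of the sub-agents, another common aggregation choice.
median = float(np.median(p))                  # 0.10

print(f"linear pool: {linear_pool:.2f}")
print(f"log pool:    {log_pool:.2f}")
print(f"median:      {median:.2f}")
```

Same three opinions, three different headline numbers, depending entirely on the aggregation rule you pick.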
7 I kinda disagree with. Those models of idealized reasoning you mention generalize Bayesianism/Expected Utility Maximization. But they are not far from the Bayesian framework or EU frameworks. Like Bayesianism, they do say there are correct and incorrect ways of combining beliefs, and that beliefs should be isomorphic to certain structures, unless I’m horribly mistaken, which is certainly not what you’re claiming in your points above.
Also, a lot of rationalists already recognize that these models address flaws in Bayesianism like logical omniscience, embeddedness, etc. Like, I believed this at least as far back as 2017, and probably earlier. Also, note that these models of epistemology are not in tension with a strong belief that we’re buggered: last I checked, the people who invented these models believe we’re buggered. I think they may imply that we’re a little less likely to be buggered than the EU maximization picture implies, but I don’t think this is a big difference. IMO this is not a big enough departure to do the work that your post requires.
A post which focuses on the object-level implications for AI of a theory of rationality which looks very different from the AIXI-flavoured rat-orthodox view.
I’m working on this right now, actually. Will hopefully post in a couple of weeks.
I say this because those sorts of considerations convinced me that we’re much less likely to be buggered.
That seems reasonable. But I do think there’s a group of people who have internalized bayesian rationalism enough that the main blocker is their general epistemology, rather than the way they reason about AI in particular.
6 seems too general a claim to me. Why wouldn’t it work for 1% vs 10%, and likewise 0.1% vs 1%? I.e., why doesn’t this suggest that you should round P(doom) down to zero?
I think the point of 6 is not to say “here’s where you should end up”, but more to say “here’s the reason why this straightforward symmetry argument doesn’t hold”.
7 I kinda disagree with. Those models of idealized reasoning you mention generalize Bayesianism/Expected Utility Maximization. But they are not far from the Bayesian framework or EU frameworks.
There’s still something importantly true about EU maximization and bayesianism. I think the changes we need will be subtle but have far-reaching ramifications. Analogously, relativity was a subtle change to newtonian mechanics that had far-reaching implications for how to think about reality.
Like Bayesianism, they do say there are correct and incorrect ways of combining beliefs, and that beliefs should be isomorphic to certain structures, unless I’m horribly mistaken, which is certainly not what you’re claiming in your points above.
Any epistemology will rule out some updates, but a problem with bayesianism is that it says there’s one correct update to make. Whereas radical probabilism, for example, still sets some constraints, just far fewer.
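As a toy illustration of that contrast (made-up numbers, not anything from the post): strict conditionalization pins down a unique posterior once E is observed, while a Jeffrey update, the kind of move radical probabilism permits, lets your credence in E shift to any new value and only constrains how the rest of the distribution has to follow:

```python
# Toy contrast between strict Bayesian conditionalization and Jeffrey conditioning.
# All numbers are made up for illustration.

p_h_and_e = 0.30       # P(H and E)
p_h_and_not_e = 0.10   # P(H and not-E)
p_e = 0.40             # P(E)

p_h_given_e = p_h_and_e / p_e                 # 0.75
p_h_given_not_e = p_h_and_not_e / (1 - p_e)   # ~0.17

# Strict conditionalization: observing E forces P_new(E) = 1,
# so there is exactly one licensed posterior for H.
print(f"strict update:             P(H) = {p_h_given_e:.2f}")

# Jeffrey conditioning: experience merely moves your credence in E to some
# new value q, and H follows via P_new(H) = q * P(H|E) + (1 - q) * P(H|not-E).
# Any q in [0, 1] is coherent, so a whole family of updates is licensed.
for q in (0.6, 0.8, 0.95):
    p_h_new = q * p_h_given_e + (1 - q) * p_h_given_not_e
    print(f"Jeffrey update, P_new(E)={q:.2f}: P(H) = {p_h_new:.2f}")
```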
I’m working on this right now, actually. Will hopefully post in a couple of weeks.
This sounds cool.
That seems reasonable. But I do think there’s a group of people who have internalized bayesian rationalism enough that the main blocker is their general epistemology, rather than the way they reason about AI in particular.
I think your OP didn’t give enough details as to why internalizing Bayesian rationalism leads to doominess by default. Like, Nora Belrose is firmly Bayesian and is decidedly an optimist. Admittedly, I think she doesn’t think a Kolmogorov prior is a good one, but I don’t think that makes you much more doomy either. I think Jacob Cannell and others are also Bayesian and non-doomy. Perhaps I’m using “Bayesian rationalism” differently than you are, which is why I think your claim, as I read it, is invalid.
I think the point of 6 is not to say “here’s where you should end up”, but more to say “here’s the reason why this straightforward symmetry argument doesn’t hold”.
Fair enough. However, how big is the asymmetry? I’m a bit sceptical that there is a large one. Based on my interactions, it seems like ~everyone who has seriously thought about this topic for a couple of hours has radically different models, w/ radically different levels of doominess. This holds even amongst people who share many lenses (e.g. Tyler Cowen vs Robin Hanson, Paul Christiano vs Scott Aaronson, Steve Hsu vs Michael Nielsen, etc.).
There’s still something importantly true about EU maximization and bayesianism. I think the changes we need will be subtle but have far-reaching ramifications. Analogously, relativity was a subtle change to newtonian mechanics that had far-reaching implications for how to think about reality.
I think we’re in agreement over this. (I think Bayesianism is less wrong than EU maximization, and probably a very good approximation in lots of places, like Newtonian physics is for GR.) But my contention is that Bayesian epistemology trips many rats up when thinking about AI x-risk. You need some story which explains why sticking to Bayesian epistemology is tripping up so many people here in particular.
Any epistemology will rule out some updates, but a problem with bayesianism is that it says there’s one correct update to make. Whereas radical probabilism, for example, still sets some constraints, just far fewer.
Right, but in radical probabilism the type of beliefs is still a real-valued function, no? That’s in tension with maintaining many disparate models that don’t get compressed down to a single number. In that sense, the refined formalism is still rigid where your description is flexible. And I suspect the same is true of Infra-Bayesianism, though I understand that even less well than radical probabilism.
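To gesture at the kind of flexibility I mean, here is a rough sketch (my own gloss, with made-up model names and numbers, not a faithful rendering of radical probabilism or Infra-Bayesianism) of a belief state kept as a set of disparate models and reported as a range, versus one pooled real number:

```python
# Rough sketch: several disparate models vs. one pooled real-valued credence.
# The model names and probabilities below are invented for illustration.

models = {
    "trend-extrapolation model": 0.05,
    "agent-foundations model": 0.60,
    "reference-class model": 0.15,
}

# A real-valued belief function has to compress these into one number somehow.
pooled = sum(models.values()) / len(models)   # ~0.27

# A set-valued belief state keeps the disagreement visible instead.
lower, upper = min(models.values()), max(models.values())

print(f"pooled point estimate: {pooled:.2f}")
print(f"set-valued belief:     [{lower:.2f}, {upper:.2f}]")
```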