Basically the fact LW has far more arguments for “alignment will be hard” compared to alignment being easy is the selection effect I’m talking about.
I was also worried because ML people don’t really think that AGI poses an existential risk, and that’s evidence, in an Aumann sense.
Now I do think this is explainable, but other issues remain:
Has longstanding disagreements about basic matters
Has theories—but many of those theories have not produced concrete predictions that differentiate them from standard expectations, despite efforts to do so.
Basically the fact LW has far more arguments for “alignment will be hard” compared to alignment being easy is the selection effect I’m talking about.
That could either be ‘we’re selecting for good arguments, and the good arguments point toward alignment being hard’, or it could be a non-epistemic selection effect.
Why do you think it’s a non-epistemic selection effect? It’s easier to find arguments for ‘the Earth is round’ than ‘the Earth is flat’, but that doesn’t demonstrate a non-epistemic bias.
I was also worried because ML people don’t really think that AGI poses an existential risk, and that’s evidence, in an Aumann sense.
… By ‘an Aumann sense’ do you just mean ‘if you know nothing about a brain, then knowing it believes P is some Bayesian evidence for the truth of P’? That seems like a very weird way to use “Aumann”, but if that’s what you mean then sure. It’s trivial evidence to anyone who’s spent much time poking at the details, but it’s evidence.
… By ‘an Aumann sense’ do you just mean ‘if you know nothing about a brain, then knowing it believes P is some Bayesian evidence for the truth of P’? That seems like a very weird way to use “Aumann”, but if that’s what you mean then sure. It’s trivial evidence to anyone who’s spent much time poking at the details, but it’s evidence.
Basically, it means that the fact that other smart people working in ML/AI don’t agree with LW is itself evidence that LW is wrong, since rational reasoners with common priors should see their disagreements shrink as they share evidence, until none remain, at least in the case where there is only one objective truth.
Now I do think this is explainable: capabilities researchers face vast incentives to adopt the position that AGI isn’t an existential risk, given how much its potential power and massive impact stand to benefit them.
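The “evidence in a Bayesian sense” point can be made concrete with a toy calculation. This is my own sketch with purely illustrative numbers, not anything from the thread; the likelihood ratio encodes an assumed tendency for experts to dismiss absent risks more often than real ones:

```python
def posterior(prior, p_dismiss_given_risk, p_dismiss_given_no_risk):
    """Posterior P(risk | experts dismiss it), by Bayes' rule."""
    numerator = p_dismiss_given_risk * prior
    denominator = numerator + p_dismiss_given_no_risk * (1 - prior)
    return numerator / denominator

# Start from a 30% prior on existential risk. If experts dismiss a real
# risk 50% of the time but an absent risk 90% of the time, their
# dismissal is evidence against the risk:
p = posterior(0.30, p_dismiss_given_risk=0.5, p_dismiss_given_no_risk=0.9)
print(round(p, 3))  # ~0.192, pulled down from 0.30
```

The entire size of the update hinges on that assumed likelihood ratio, which is exactly what the incentives point above calls into question.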
That could either be ‘we’re selecting for good arguments, and the good arguments point toward alignment being hard’, or it could be a non-epistemic selection effect.
Why do you think it’s a non-epistemic selection effect? It’s easier to find arguments for ‘the Earth is round’ than ‘the Earth is flat’, but that doesn’t demonstrate a non-epistemic bias.
I definitely agree that this alone doesn’t show that LW is doing its epistemics poorly, or that there is a problematic selection effect.
My worries re LW epistemics are the following:
There’s a lot more theory than empirical evidence. While this is changing for the better, theory being so predominant in LW culture is a serious problem, since theory can easily drift away from reality and model it poorly.
Comparing claims of coming AI catastrophe to the claim that the world is round is sort of insane: the world’s roundness has both theoretical support and massive empirical evidence, and LW’s case for AI catastrophe is nowhere close to that standard of evidence. The comparison also isn’t necessary.
More specifically, the outside view suggests that AI takeoff is probably slower, and most importantly, it suggests a low prior on catastrophe from technology: the vast majority of claims of impending doom/catastrophe never come to pass, and the catastrophic potential of nukes is almost certainly overrated. This tells us we should have much lower priors, and suggests that humanity dying out is actually hard.
I’m not asking LWers to totally change their views, but to have more uncertainty in their estimates of AI risk.
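The low-prior-from-failed-predictions intuition can be cashed out with Laplace’s rule of succession. This is a sketch under the (strong, disputable) assumption that past doom claims are exchangeable with this one; it is not a method anyone in the thread endorses:

```python
def rule_of_succession(successes, trials):
    """Laplace's rule of succession: after `successes` out of `trials`
    observed outcomes, estimate P(next trial succeeds)."""
    return (successes + 1) / (trials + 2)

# Illustrative: suppose 50 past doom predictions, none of which came true.
print(rule_of_succession(0, 50))  # ~0.019, a much lower prior than most estimates
```

Whether AGI actually belongs in the same reference class as those past predictions is precisely what the reply about outside-view arguments disputes.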
I’m not much moved by these types of arguments, essentially because (in my view) the level of meta at which they occur is too far removed from the object level. If you look at the actual points your opponents lay out, and decide (for whatever reason) that you find those points uncompelling… that’s it. Your job here is done, and the remaining fact that they disagree with you is, if not explained away, then at least screened off. (And to be clear, sometimes it is explained away, although that happens mostly with bad arguments.)
Ditto for outside view arguments—if you’ve looked at past examples of tech, concluded that they’re dissimilar from AGI in a number of ways (not a hard conclusion to reach), and moreover concluded that some of those dissimilarities are strategically significant (a slightly harder conclusion, and one that some people stumble before reaching—but not, ultimately, that hard), then the base rates of the category being outside-viewed no longer contain any independently relevant information, which means that—again—your job here is done.
(I’ve made comments to similar effect in the past, and plan to continue trumpeting this horn for as long as the meme to which it is counter continues to exist.)
This does, of course, rely on your own reasoning being correct, in the sense that if you’re wrong, well… you’re wrong. But this really isn’t a particularly special kind of situation: it’s one that recurs all across life, in all kinds of fields and domains. And in particular, it’s not the kind of situation you should cower away from in fear—not if your goal is actually grasping the reality of the situation.
***
And finally (and obviously), all of this only applies to the person making the updates in the first place (which is why, you may notice, everything above the asterisks seems to inhabit the perspective of someone who believes they understand what’s happening, and takes for granted that it’s possible for them to be right as well as wrong). If you’re not in the position of such an individual, but instead conceive of yourself as primarily a third party, an outsider looking in...
...well, mostly I’d ask what the heck you’re doing, and why you aren’t either (1) trying to form your own models, to become one of the People Who Can Get Things Right As Well As Wrong, or—alternatively—(2) deciding that it’s not worth your time and effort, either because of a lack of comparative advantage, or just because you think the whole thing is Likely To Be Bunk.
It kind of sounds like you’re on the second path—which, to be clear, is totally fine! One of the predictable consequences of Daring to Disagree with Others is that Other Others might look upon you, notice that they can’t really tell who’s right from the outside, and downgrade their confidence accordingly. That’s fine, and even good in some sense: you definitely don’t want people thinking they ought to believe something even in [what looks to them like] the absence of any good arguments for it; that’s a recipe for irrationality.
But that’s the whole point, isn’t it—that the perspectives of the Insider, the Researcher Trying to Get At the Truth, and the Outsider, the Bystander Peering Through the Windows—will not look identical, and for obvious reason: they’re different people standing in different (epistemic) places! Neither one of them should agonize about the fact that the former has a tighter probability distribution than the latter; that’s what happens when you proceed further down the path—ideally the right path, but any path has the same property: that your probability distribution narrows as you go further down, and your models become more specific and more detailed.
So go ahead and downgrade your assessment of “LW epistemics” accordingly, if that’s what you’ve decided is the right thing to do in your position as the outsider looking in. (Although I’d argue that what you’d really want is to downgrade your assessment of MIRI, instead of LW as a whole; they’re the most extreme ones in the room, after all. For the record, I think this is Pretty Awesome, but your mileage may vary.) But don’t demand that the Insider be forced to update their probability distribution to match yours—to widen their distribution, to walk back the path they’ve followed in the course of forming their detailed models—simply because you can’t see what [they think] they’re seeing, from their vantage point!
Those people are down in the trenches for a reason: they’re investigating what they see as the most likely possibilities, and letting them do their work is good, even if you think they haven’t justified their (seeming) confidence level to your satisfaction. They’re not trying to.
(Oh hey, I think that has something to do with the title of the post we’re commenting on.)
Thank you for answering; I now understand why the inside view comes with narrower probability distributions.
I hope that, whatever the truth turns out to be, LWers continue on, while always taking care that their epistemics stay sound.