But “let’s just assume this whole alignment thing isn’t real” is a particularly extreme instance of the behavior that Eliezer is criticizing.
I endorse this response, and want to highlight it. Specifically, there are two separate errors in “if Eliezer, an expert in that field, is correct, then it’s hopeless for us to try. So our only hope is that he’s wrong and the field is somewhat ridiculous or easily solved.”:
1. The post doesn’t say it’s hopeless to try (‘low probability of success’ is not equivalent to ‘negligible probability of success’).
2. Per my comment in another thread, the exact mental move the OP is criticizing is ‘if you’re right then we’re screwed regardless, so we might as well adopt a bunch of more-optimistic assumptions’.
What I’d recommend is just to start building your own mental models of AI risk, especially if your initial reason for concern was mainly based on deference to others.
If you get enough evidence to decide that alignment is difficult and necessary for an awesome future (based on a mix of your more-fleshed-out models of the object level and your more-fleshed-out models of which people tend to be right about which things), then work on it insofar as this is higher-EV than whatever you’d be doing instead, even if you end up agreeing with Eliezer that the absolute odds of success are low.
In the process, be wary of flailing and of rushing to epistemic closure; and see also https://www.lesswrong.com/posts/st7DiQP23YQSxumCt/on-doing-the-improbable. That post was about people preemptively giving up even in the face of moderate difficulties, where it’s especially obvious that the psychological processes at work are maladaptive; but the same processes are also in play when there are larger difficulties, even if they’re harder to spot because the problem genuinely is harder and less obviously tractable.
I think you’d have to both reject Eliezer’s basic perspective, and take Eliezer as representing the entire field so strongly that rejecting his perspective means rejecting the whole field.
I think this is mostly false, in that it’s not sufficient for ‘I shouldn’t work on AI alignment’ (Eliezer can be wrong about a lot of core things and it can still be the case that AI alignment is important and difficult), and in that it doesn’t connect up to joraine’s ‘I infer from this post that I shouldn’t work on AI alignment’.
I don’t think that’s reasonable, unless your interest in AI safety was primarily driven by deference to Eliezer.
Likewise, this seems false to me because it suggests that past Eliezer-deference is sufficient to justify ‘alignment isn’t important if Eliezer is wrong’ (which it’s not), and/or it suggests that Eliezer-deferential people should conclude from the OP ‘ah, Eliezer is wrong about AI’ (without appeal to your own personal model of AI that disagrees with Eliezer).
‘Our odds of success were pretty low and have gotten lower’ is just a possible state of the world; it isn’t a ridiculous-on-its-face view. So while there are reasonable paths to arriving at the conclusion ‘I shouldn’t work on alignment’, I don’t think any of them have much to do with this post.
I guess the most reasonable would be ‘I don’t care that much about longevity or the long-term future, so I’ll work on alignment if I think it’s easy but not if I think it’s hard; and the OP updated me toward thinking it’s hard; so now I’m going to go play video games and chill instead of worrying about the long-term future.’
If so, I do think it makes sense to stop deferring
I don’t know what specific things ‘If so’ is referring back to, and I also recommended trying to build one’s own models of the space. But it again seems to me like this can be misinterpreted as saying ‘the OP contains an obvious reasoning error such that if you came to the post thinking EY was right about p(doom), you should leave it thinking he’s wrong about p(doom)’. That seems obviously wrong to me, whether or not EY is actually correct about p(doom).
The reason you should build your own models of this space is that there’s high expected information value (both for you and for others) in figuring out what’s up here. If you’re a smart technical person who hasn’t done much of their own model-building about AI, then IMO that’s very likely to be true regardless of what your current p(doom) is.