I primarily upvoted it because I like the push to ‘just candidly talk about your models of stuff’:
I think we die with slightly more dignity—come closer to surviving, as we die—if we are allowed to talk about these matters plainly. Even given that people may then do unhelpful things, after being driven mad by overhearing sane conversations. I think we die with more dignity that way, than if we go down silent and frozen and never talking about our impending death for fear of being overheard by people less sane than ourselves.
I think that in the last surviving possible worlds with any significant shred of subjective probability, people survived in part because they talked about it; even if that meant other people, the story’s antagonists, might possibly hypothetically panic.
Also because I think Eliezer’s framing will be helpful for a bunch of people working on x-risk. Possibly a minority of people, but not a tiny minority. Per my reply to AI_WAIFU, I think there are lots of people who make the two specific mistakes Eliezer is warning about in this post (‘making a habit of strategically saying falsehoods’ and/or ‘making a habit of adopting optimistic assumptions on the premise that the pessimistic view says we’re screwed anyway’).
The latter, especially, is something I’ve seen in EA a lot, and I think the arguments against it here are correct (and haven’t been talked about much).
Given how long it took me to work out whether these were Eliezer’s true thoughts or a representation of his predicted thoughts in some fairly probable future, I’m not sure I’d use the label “candid” to describe the post, at least not without qualification.
While the post does contain a genuinely useful way of framing near-hopeless situations and a nuanced and relatively terse lesson in practical ethics, I would describe the post as an extremely next-level play in terms of its broader purpose (and leave it at that).
I actually think Yudkowsky’s biggest problem may be that he is not talking about his models. In his most prominent posts about AGI doom, such as this one and the List of Lethalities, he needs to provide a complete model that clearly and convincingly leads to doom (hopefully without the extreme rhetoric) in order to justify the extreme rhetoric. Why does attempted but imperfect alignment lead universally to doom across all likely AGI designs*, when we lack familiarity with the relevant mind design space, or with how long it will take to escalate a given design from AGI to ASI?
* I know his claim isn’t quite this expansive, but his rhetorical style encourages an expansive interpretation.
I’m baffled that he puts so little effort into explaining his model. In List of Lethalities he spends a few paragraphs of preamble covering some essential elements of concern (-3, −2, −1), then offers a few potentially-reasonable-but-minimally-supported assertions, before spending much of the rest of the article rattling off the various ways AGI can kill everyone. Personally, I felt like he just skipped over a lot of the important topics, and so I didn’t bother to read to the end.
I think there is probably a window of time after the first AGI or quasi-AGI arrives, but before the most genocide-prone AGI arrives, in which alignment work can still be done. Eliezer’s rhetorical approach confusingly burns bridges with that world, as he and MIRI (and probably, by association, rationalists) will be regarded as a laughing stock when it arrives. Various techbros, including AI researchers, will be saying “well, AGI came and we’re all still alive, yet there’s EY still reciting his doomer nonsense”. EY will uselessly protest “I didn’t say AGI would necessarily kill everyone right away” while the techbros retweet old EY quotes that kinda sound like that’s what he was saying.
Edit: for whoever disagreed & downvoted: what for? You know there are e/accs on Twitter telling everyone that the idea of x-risk is based on Yudkowsky being “king of his tribe”, and surely you know that this is not how LessWrong is supposed to work. The risk isn’t supposed to be based on EY’s say-so; a complete and convincing model is needed. If, on the other hand, you disagreed that his communication is incomplete and unconvincing, it should not offend you that not everyone agrees. Like, holy shit: you think humanity will cause apocalypse because it’s not listening to EY, but how dare somebody suggest that EY needs better communication. I wrote this comment because I think it’s very important; what are you here for?