I actually think Yudkowsky’s biggest problem may be that he is not talking about his models. In his most prominent posts about AGI doom, such as this one and the List of Lethalities, he needs to provide a complete model that clearly and convincingly leads to doom on its own merits (without the extreme rhetoric) in order to justify the extreme rhetoric. Why does attempted, but imperfect, alignment lead universally to doom across all likely AGI designs*, when we lack familiarity with the relevant mind design space, and don’t know how long it will take to escalate a given design from AGI to ASI?
* I know his claim isn’t quite this expansive, but his rhetorical style encourages an expansive interpretation.
I’m baffled that he puts so little effort into explaining his model. In List of Lethalities he spends a few paragraphs of preamble covering some essential elements of concern (-3, −2, −1), then offers a few potentially-reasonable-but-minimally-supported assertions, before spending much of the rest of the article rattling off the various ways AGI could kill everyone. Personally, I felt he skipped over a lot of the important topics, so I didn’t bother reading to the end.
I think there is probably some window after the first AGI or quasi-AGI arrives, but before the most genocide-prone AGI arrives, in which alignment work can still be done. Eliezer’s rhetorical approach confusingly chooses to burn bridges with that world, since he and MIRI (and probably, by association, rationalists) will be regarded as a laughingstock when it arrives. Various techbros, including AI researchers, will be saying “well, AGI came and we’re all still alive, yet there’s EY still reciting his doomer nonsense.” EY will uselessly protest “I didn’t say AGI would necessarily kill everyone right away,” while the techbros retweet old EY quotes that kinda sound like that’s exactly what he was saying.
Edit: for whoever disagreed & downvoted: what for? You know there are e/accs on Twitter telling everyone that the idea of x-risk rests on Yudkowsky being “king of his tribe,” and surely you know that this is not how LessWrong is supposed to work. The risk isn’t supposed to rest on EY’s say-so; a complete and convincing model is needed. If, on the other hand, you disagreed with my claim that his communication is incomplete and unconvincing, it should not offend you that not everyone agrees. Like, holy shit: you think humanity will cause an apocalypse because it’s not listening to EY, but how dare somebody suggest that EY needs better communication. I wrote this comment because I think it’s very important; what are you here for?