I really wish there were an “agree/disagree” button for posts. I’d like to upvote this post (for epistemic virtue / presenting reasonable “contrarian” views and explaining why one holds them), but I also strongly disagree with the conclusions and suggested policies. (I ended up voting neither up nor down.)
EDIT: After reading Akash’s comments, and re-reading the post more carefully: I largely agree with Akash (and updated towards thinking that my standards for “epistemic virtue” are/were too low).
I downvoted the post because I don’t think it presents strong epistemics. Some specific critiques:
The author doesn’t explain the reasoning that produced the updates. (They link to posts, but I don’t think it’s epistemically sound to just say “I made updates, and you can find the reasons why in these posts.” At best, people read the posts and come away thinking “huh, I wonder which of these specific claims/arguments were persuasive to the poster.”)
The author recommends policy changes (to LW and the field of alignment) that, in my opinion, don’t follow from the claims presented. (The claim “LW and the alignment community should shift their focus” does not follow from “there is a 50-70% chance of alignment by default.” See comment for more.)
The author doesn’t explain their initial threat model, why it was dominated by deception, or why they remain unconvinced by other threat models.
I do applaud the author for sharing the update and expressing an unpopular view. I also feel some pressure not to downvote it, because I don’t want to be “downvoting something just because I disagree with it”, but I think in this case my objection really is to the post itself. (I didn’t downvote the linked post, for example.)