I downvoted the post because I don’t think it presents strong epistemics. Some specific critiques:
The author doesn’t explain the reasoning that produced the updates. (They link to posts, but I don’t think it’s epistemically sound to say “I made updates and you can find the reasons why in these posts.” At best, people read the posts and come away thinking “huh, I wonder which of these specific claims/arguments were persuasive to the poster.”)
The author recommends policy changes (to LW and the field of alignment) that, in my opinion, don’t seem to follow from the claims presented. (The claim “LW and the alignment community should shift their focuses” does not follow from “there is a 50-70% chance of alignment by default”. See comment for more.)
The author doesn’t explain their initial threat model, why it was dominated by deception, and why they’re unconvinced by other models of risk & other threat models.
I do applaud the author for sharing the update and expressing an unpopular view. I also feel some pressure to not downvote it because I don’t want to be “downvoting something just because I disagree with it”, but I think in this case it really is the post itself. (I didn’t downvote the linked post, for example).