“Why Not Just...”
by johnswentworth · Aug 8, 2022, 6:15 PM

A compendium of rants about alignment proposals, of varying charitability.

- Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc · johnswentworth · Jun 4, 2022, 5:41 AM · 160 points, 55 comments, 2 min read, 1 review
- Godzilla Strategies · johnswentworth · Jun 11, 2022, 3:44 PM · 159 points, 72 comments, 3 min read
- Rant on Problem Factorization for Alignment · johnswentworth · Aug 5, 2022, 7:23 PM · 104 points, 53 comments, 6 min read
- Interpretability/Tool-ness/Alignment/Corrigibility are not Composable · johnswentworth · Aug 8, 2022, 6:05 PM · 144 points, 13 comments, 3 min read
- How To Go From Interpretability To Alignment: Just Retarget The Search · johnswentworth · Aug 10, 2022, 4:08 PM · 210 points, 34 comments, 3 min read, 1 review
- Oversight Misses 100% of Thoughts The AI Does Not Think · johnswentworth · Aug 12, 2022, 4:30 PM · 111 points, 49 comments, 1 min read
- Human Mimicry Mainly Works When We’re Already Close · johnswentworth · Aug 17, 2022, 6:41 PM · 82 points, 16 comments, 5 min read
- Worlds Where Iterative Design Fails · johnswentworth · Aug 30, 2022, 8:48 PM · 209 points, 30 comments, 10 min read, 1 review
- Why Not Just… Build Weak AI Tools For AI Alignment Research? · johnswentworth · Mar 5, 2023, 12:12 AM · 184 points, 18 comments, 6 min read
- Why Not Just Outsource Alignment Research To An AI? · johnswentworth · Mar 9, 2023, 9:49 PM · 151 points, 50 comments, 9 min read, 1 review
- OpenAI Launches Superalignment Taskforce · Zvi · Jul 11, 2023, 1:00 PM · 150 points, 40 comments, 49 min read (thezvi.wordpress.com)