The argument I think is good (nr. (2) in my previous comment) doesn’t go through reference classes at all. I don’t want to make an outside-view argument (e.g. “things we call optimization often produce misaligned results, therefore SGD is dangerous”). I like the evolution analogy because it makes salient some aspects of AI training that make misalignment more likely. Once those aspects are salient, you can stop thinking about evolution and just think directly about AI.