interstice comments on Counterarguments to the basic AI x-risk case

interstice 16 Oct 2022 6:47 UTC
8 points
An AI with a good world model will predictably have a model of your values, but that’s different from being able to actually elicit that model, e.g. via a series of labeled examples. The elicitation step is the part that seemed less plausible before deep learning.