Ajeya Cotra comments on Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra 19 Jul 2022 23:44 UTC
Hm, I'm not sure I understand, but I wasn’t trying to make super specific mechanistic claims here. I agree that what I said doesn’t reduce confusion about the specific internal mechanisms that make lying hard for most humans, but I wasn’t intending to claim that it did. I also should have said something like “evolutionary, cultural, and individual history” instead. (I was using “evolution” as shorthand to indicate that it seems common across cultures, but of course that doesn’t mean don’t-lie genes are directly bred into us! Most human universals aren’t; we probably don’t have honor-the-dead or different-words-for-male-and-female genes.)

I was just making the pretty basic point “AIs in general, and Alex in particular, are produced through a very different process from humans, so it seems like ‘humans find lying hard’ is pretty weak evidence that ‘AI will by default find lying hard.’”

I agree that asking “What specific neurological phenomena make it so most people find it hard to lie?” could serve as inspiration for AI honesty research, and I wasn’t intending to claim otherwise in that paragraph (though separately, I am somewhat pessimistic about this research direction).