Anthony DiGiovanni comments on Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Anthony DiGiovanni 13 Aug 2022 11:06 UTC
1 point
I like that this post clearly gives reasons to expect deception (and similar dynamics) to be not just possible, in the sense of earning the same training reward as the honest alternatives, but actually to earn higher reward than they do. This raises my probability of those scenarios.