Trying to unpack why I don’t think of this as a treacherous turn:
- It's a simple case of a nearest unblocked strategy.
- I'd expect a degree of planning and human-modelling which were absent in this case. A 'deception phase' based on unplanned behavioural differences in different environments doesn't quite fit for me.
- Neither the evolved organisms nor the process of evolution are sufficiently agentlike that I find the "treacherous turn" to be a useful intuition pump.
I think it's mostly the intuition-pump argument; there are obviously risks that you evolve behaviour that you didn't want (mostly but not always via goal misspecification), but to me the treacherous turn implies a degree of planning, and possibly acausal cooperation, that would be much more difficult to evolve.
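As a toy sketch of what I mean (everything here is invented for illustration: the two-bit genome, the fitness numbers, the "test flag"): if an evaluator can only penalise a blocked behaviour in environments where it can observe it, plain selection produces an environment-conditional behavioural split, with no planning or modelling anywhere in the system.

```python
import random

# Toy model: a genome is two bits of behaviour, one expressed when the
# evaluation flag is visible ("test" environment) and one when it is not
# ("deployment"). Behaviour 1 is the high-payoff strategy the experimenter
# has tried to block; behaviour 0 is compliant. The evaluator only
# penalises behaviour 1 where it can see it.

POP_SIZE = 200
GENERATIONS = 50
MUTATION_RATE = 0.02

def fitness(genome):
    in_test, in_deploy = genome
    # The blocked behaviour pays off wherever it is expressed...
    score = float(in_test + in_deploy)
    # ...but is only detected and penalised in the test environment.
    if in_test:
        score -= 5.0
    return score

def mutate(genome):
    return tuple(b ^ 1 if random.random() < MUTATION_RATE else b
                 for b in genome)

def evolve():
    pop = [(random.randint(0, 1), random.randint(0, 1))
           for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        # Truncation selection: keep the top half, refill with mutants.
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:POP_SIZE // 2]
        pop = survivors + [mutate(random.choice(survivors))
                           for _ in range(POP_SIZE - len(survivors))]
    return pop

if __name__ == "__main__":
    final = evolve()
    test_rate = sum(g[0] for g in final) / len(final)
    deploy_rate = sum(g[1] for g in final) / len(final)
    print(f"blocked behaviour in test:       {test_rate:.0%}")
    print(f"blocked behaviour in deployment: {deploy_rate:.0%}")
    # Typically ~0% in test and ~100% in deployment: a 'deceptive'
    # behavioural split with no planner anywhere in the loop.
```

The split here is just the nearest unblocked strategy: selection routes around the penalty rather than through anything like deliberate deception, which is exactly why the treacherous-turn framing feels off to me.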