Not Relevant comments on Gradient hacking

Not Relevant 27 Apr 2022 20:58 UTC
3 points
I see, so your claim here is that gradient hacking is a convergent strategy for all agents of sufficient intelligence. That’s helpful, thanks.

I am still confused about the case where Alice is checking whether or not she has goal X. If she finds she has a different goal Y != X, then by definition not having children is to the detriment of goal Y.