I was mostly noting that I hadn’t thought of this and hadn’t seen it mentioned.
There was some related discussion back in 2012, but of course you can be excused for not knowing about that. :) (The part about “AIXI would fail due to incorrect decision theory” is in part talking about a reward-maximizing agent doing reward hacking.)