The cogitation here is implicitly hypothesizing an AI that’s explicitly considering the data and trying to compress it, having been successfully anchored on that data’s compression as identifying an ideal utility function. You’re welcome to think of the preferences as a static object shaped by previous unreflective gradient descent; it sure wouldn’t arrive at any better answers that way, and would also of course want to avoid further gradient descent happening to its current preferences.