This seems like a very different position from the one you just gave:
I worry that when people reason about utility functions, they're relying upon the availability heuristic. When people try to picture "a random utility function", they're heavily biased in favor of the kinds of utility functions they're familiar with, like paperclip maximization, prediction-error minimization, or corporate profit optimization.
How do we know that a random sample from utility-function-space looks anything like the utility functions we're familiar with? We don't. I wrote a very short story to this effect. If you can retroactively fit a utility function to any sequence of actions, what predictive power do we gain by including utility functions in our models of AGI?
I took you to be saying, ‘You can retroactively fit a utility function to any sequence of actions, so we gain no predictive power by thinking in terms of utility functions or coherence theorems at all. People worry about paperclippers not because there are coherence pressures pushing optimizers toward paperclipper-style behavior, but because paperclippers are a vivid story that sticks in your head.’
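(To make the "retroactive fitting" step concrete, here is a minimal sketch; the `fit_utility` name and the toy action strings are mine, purely for illustration. For any observed action sequence, the indicator function over that exact sequence makes the agent a utility-maximizer by construction, which is why the fit on its own buys no predictive power.)

```python
from typing import Hashable, Sequence

def fit_utility(observed: Sequence[Hashable]):
    """Return a utility function that any agent exhibiting `observed` maximizes."""
    observed_t = tuple(observed)

    def utility(history: Sequence[Hashable]) -> float:
        # 1.0 for the exact actions taken, 0.0 for everything else.
        return 1.0 if tuple(history) == observed_t else 0.0

    return utility

# Any behavior whatsoever can be "rationalized" this way:
actions = ["wander", "stack pebbles", "hum"]
u = fit_utility(actions)
assert u(actions) == 1.0             # the agent "maximized utility", trivially
assert u(["make paperclips"]) == 0.0
```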