Thank you for explaining this! But then how can this framework be used to model humans as agents? People can easily imagine outcomes worse than death or destruction of the universe.
The short answer is, I don’t know.
The long answer is, here are some possibilities, roughly ordered from “boring” to “weird”:
1. The framework is wrong.
2. The framework is incomplete: there is some extension that gets rid of monotonicity. There are some obvious ways to make such extensions, but they look uglier, and without further research it's hard to say whether they break anything important. (The first sketch after this list illustrates what monotonicity rules out.)
3. Humans are just not physicalist agents; you're not supposed to model them using this framework, even if it can be useful for AI. This would explain why it took humans so long to come up with science.
4. Like #3, except that if we thought long enough we would become convinced of some kind of simulation/deity hypothesis (where the simulator/deity is a physicalist), and this would be normatively correct for us.
5. Because the universe is effectively finite (it's asymptotically de Sitter), only so many computations can run. Therefore, even if you only assign positive value to running certain computations, this effectively implies that running other computations is bad (second sketch below). Moreover, the fact that the universe is effectively finite is unsurprising, since infinite universes tend to have all possible computations running, which makes them roughly irrelevant hypotheses for a physicalist.
6. We are just confused about hell being worse than death. For example, maybe people in hell have no qualia. This makes some sense if you endorse the (natural for physicalists) anthropic theory that only the best-off future copy of you matters. You can imagine there always being a "dead copy" of you, so that if something worse-than-death happens to the apparent-you, your subjective experiences go into the "dead copy" (third sketch below).
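To make possibility 2's monotonicity issue concrete, here is a minimal sketch in Python. The toy utilities and computation names (`happy_sim`, `hell_sim`) are mine, not part of the actual formalism: a physicalist utility is a function of the set of computations that run, and monotonicity says that realizing more computations never decreases utility. An outcome "worse than death" would require some computation whose addition strictly lowers utility, which is exactly what the principle forbids.

```python
# Toy illustration of the monotonicity principle (not the real formalism).
# A "physicalist" utility is a function of the SET of computations that run.
# Monotonicity: running more computations never makes things worse.

from itertools import combinations

def monotone_utility(running: frozenset) -> float:
    # Assigns positive value to "good" computations and is indifferent
    # to everything else -- hence monotone by construction.
    GOOD = {"happy_sim"}
    return sum(1.0 for c in running if c in GOOD)

def hell_utility(running: frozenset) -> float:
    # A human-like utility: a "hell_sim" computation running is worse
    # than it never running at all. This VIOLATES monotonicity.
    GOOD, BAD = {"happy_sim"}, {"hell_sim"}
    return (sum(1.0 for c in running if c in GOOD)
            - sum(10.0 for c in running if c in BAD))

def is_monotone(u, universe) -> bool:
    # Check u(S) <= u(S ∪ {c}) for every subset S and computation c.
    subsets = [frozenset(s) for r in range(len(universe) + 1)
               for s in combinations(universe, r)]
    return all(u(s) <= u(s | {c}) for s in subsets for c in universe)

universe = {"happy_sim", "hell_sim", "dead_sim"}
print(is_monotone(monotone_utility, universe))  # True
print(is_monotone(hell_utility, universe))      # False: hell is worse than nothing
```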
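Possibility 5 can be phrased as an opportunity-cost argument. A hedged sketch, assuming a crude model where the universe has a fixed budget of computation steps (the budget, costs, and computation names are all illustrative assumptions): even a utility that only rewards certain computations makes running anything else effectively negative, because it crowds out the rewarded ones.

```python
# Toy opportunity-cost model for possibility 5 (illustrative assumptions only):
# the universe can run at most BUDGET computation-steps in total.

BUDGET = 100  # effectively finite universe (asymptotically de Sitter)
COST = {"happy_sim": 10, "paperclip_sim": 10}
VALUE = {"happy_sim": 1.0}  # ONLY positive values; nothing is directly "bad"

def best_achievable(budget: int) -> float:
    # With a single valued computation, the optimal policy spends
    # the whole remaining budget on it.
    return (budget // COST["happy_sim"]) * VALUE["happy_sim"]

# Marginal effect of running one "neutral" computation:
with_neutral = best_achievable(BUDGET - COST["paperclip_sim"])
without_neutral = best_achievable(BUDGET)
print(with_neutral - without_neutral)  # -1.0: effectively bad, despite no negative values
```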
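Finally, possibility 6 can be illustrated in the same toy vocabulary. This is a sketch under my own reading of "only the best-off future copy matters" (the welfare numbers and the zero death baseline are assumptions): if a "dead copy" at the death baseline always exists, nothing can take your subjective value below that baseline, because a hell copy simply stops being the copy that counts.

```python
# Toy model of "only the best-off future copy matters" (possibility 6).
DEATH_BASELINE = 0.0  # welfare of the ever-present "dead copy"

def subjective_value(copy_welfares: list[float]) -> float:
    # A "dead copy" at the death baseline is always among your copies,
    # so subjective value can never drop below it.
    return max(copy_welfares + [DEATH_BASELINE])

print(subjective_value([5.0]))    # 5.0: a good copy dominates
print(subjective_value([-10.0]))  # 0.0: the hell copy is not "you";
                                  # experience flows to the dead copy
```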