A hellworld is ultimately a world that is contrary to our values. However, our values are underdefined and changeable, so to have any chance of saying what these values are, we need to either extract key invariant values, synthesise our contradictory values into some complete whole, or use some extrapolation procedure (e.g. CEV). In any case, there must be some procedure for establishing our values (or else the very concept of a “hellworld” makes no sense).
It feels worth distinguishing between two cases of “hellworld”:
1. A world which is not aligned with the values of that world’s inhabitants themselves. One could argue that in order to merit the designation “hellworld”, the world has to be out of alignment with the values of its inhabitants in such a way as to cause suffering. Assuming that we can come up with a reasonable definition of suffering, then detecting these kinds of worlds seems relatively straightforward: we can check whether they contain immense amounts of suffering.
2. A world whose inhabitants do not suffer, but which we might consider hellish according to our values. For example, something like a Brave New World scenario, where people generally consider themselves happy but where that happiness comes at the cost of suppressing individuality and promoting superficial pleasures.
It’s for detecting an instance of the second case that we need to understand our values better. But it’s not clear to me that such a world should qualify as a “hellworld”, which to me sounds like a world with negative value. While I don’t find the notion of being an inhabitant of a Brave New World particularly appealing, a world where most people are happy but only in a superficial way sounds more like “overall low positive value” than “negative value” to me. Assuming that you’ve internalized its values and norms, existing in a BNW doesn’t seem like a fate worse than death; it just sounds like a future that could have gone better.
Of course, there is an argument that even if a BNW would be okay for its inhabitants once we got there, getting there might cause a lot of suffering: for instance, if there were lots of people who were forced against their will to adapt to the system. And since many of us might find the BNW to be a fate worse than death, then conditional on our surviving to live in it, it is a hellworld (at least for us). But again, this doesn’t seem to require a thorough understanding of our values to detect: it just requires detecting that, if we survive to live in the BNW, we will experience a lot of suffering from being in a world which is contrary to our values.
> Assuming that we can come up with a reasonable definition of suffering
Checking whether there is a large amount of suffering in a deliberately obfuscated world seems hard, and perhaps impossible if a superintelligence has done the obfuscating.
True, I’m not disputing that. I’m only saying that it seems like an easier problem than first solving human values and then checking whether those values are satisfied.