An angel is an AGI programmed to help us directly and do exactly what we want, without relying on game theory.
A devil wants to make paperclips but we force it to make flourishing human lives or whatever. An angel just wants flourishing human lives.
What does “want” mean here? Why is game theory somehow extra special bad or good? From a behaviorist point of view, how do I tell apart an angel from a devil that has been game-theoried into being an angel? Do AGIs have separate modules labeled “utility” and “game theory”, such that making changes to the utility module is somehow good, but making changes to the game theory module is bad? Do angels have a utility function that just says “do the good”, or does it just contain a bunch of traits that we think are likely to result in good outcomes?