What does “want” mean here? Why is game theory somehow especially bad or good? From a behaviorist point of view, how do I tell apart an angel from a devil that has been game-theoried into being an angel? Do AGIs have separate modules labeled “utility” and “game theory”, such that making changes to the utility module is somehow good but making changes to the game theory module is bad? Do angels have a utility function that just says “do the good”, or does it just contain a bunch of traits that we think are likely to result in good outcomes?