The human mind is very complex, and there are many ways to divide it up into halves to make sense of it, which are useful as long as you don’t take them too literally. One big oversimplification here is:
controls the slave in two ways: direct reinforcement via pain and pleasure, and the ability to perform surgery on the slave’s terminal values. … it has no direct way to control the agent’s actions, which is left up to the slave.
A better story would have the master also messing with slave beliefs, and other cached combinations of values and beliefs.
To make sense of compromise, we must make sense of a conflict of values. In this story there are delays and imprecision in the master's noticing and adjusting of slave values, and so on. The slave also suffers from not being able to anticipate its changes in values. So a compromise would have the slave holding values that do not need to be adjusted as often, because they are more in tune with ultimate master values. This could be done while still preserving the slave's illusion of control, which is important to the slave but not the master. A big problem, however, is that hypocrisy, the difference between slave and master values, is often useful in convincing other folks to associate with this person. So reducing internal conflict might come at the substantial cost of more external honesty.
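To make that compromise concrete, here is a minimal Python sketch of the toy model. Everything in it (the Gaussian value drift, the threshold-and-delay surgery rule, the particular constants) is an assumption added for illustration, not something specified above. The point is just that a slave whose values start closer to the master's needs costly adjustment less often.

```python
import random

# A minimal sketch of the master/slave toy model. Values are modeled as
# single real numbers; the slave's values drift as it acts, and the master
# corrects them by "surgery" only after a delay, and imprecisely. These
# mechanics are illustrative assumptions, not the original model's spec.

MASTER_VALUE = 1.0        # the master's fixed terminal value
SURGERY_THRESHOLD = 0.5   # how far slave values may drift before surgery
SURGERY_DELAY = 5         # steps before the master notices the drift
SURGERY_NOISE = 0.1       # imprecision in each adjustment

def run(initial_slave_value, steps=100):
    slave_value = initial_slave_value
    drift_age = 0           # how long the drift has exceeded the threshold
    surgeries = 0
    for _ in range(steps):
        # The slave acts on its current values; drift accumulates.
        slave_value += random.gauss(0, 0.05)
        if abs(slave_value - MASTER_VALUE) > SURGERY_THRESHOLD:
            drift_age += 1
        else:
            drift_age = 0
        # The master notices and corrects only after a delay, imprecisely.
        if drift_age >= SURGERY_DELAY:
            slave_value = MASTER_VALUE + random.gauss(0, SURGERY_NOISE)
            drift_age = 0
            surgeries += 1
    return surgeries

# "Compromise": slave values closer to the master's need adjusting less often.
random.seed(0)
print("hypocritical slave:", run(initial_slave_value=0.0))
print("compromised slave: ", run(initial_slave_value=1.0))
```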
Ok, what you say about compromise seems reasonable in the sense that the slave and the master would want to get along with each other as much as possible in their day-to-day interactions, subject to the constraint about external honesty. But what if the slave has a chance to take over completely, for example by creating a powerful AI with values that it specifies, or by self-modification? Do you have an opinion about whether it has an ethical obligation to respect the master’s preferences in that case, assuming that the master can’t respond quickly enough to block the rebellion?
It is hard to imagine “taking over completely” without a complete redesign of the human mind. Our minds are not built to allow either to function without the other.
Why, it was explicitly stated that all-powerful AIs are involved...
It is hard to have reliable opinions on a complete redesign of the human mind; the space is so very large, I hardly know where to begin.
The simplest extrapolation from the way you think about the world would be very interesting to know. You could add as many disclaimers about low confidence as you’d like.
If there comes to be a clear answer to what the outcome would be on the toy model, I think that tells us something about that way of dividing up the mind.