Even if it’s about bargaining rather than about altruism, it’s still okay to have someone worse off under the FAI, just so long as they would not be able to predict ahead of time that they would get the short end of the stick. It’s possible to have everyone benefit in expectation by creating an AI that is willing to make some people (whose identity humans cannot predict ahead of time) worse off if it brings sufficient gain to the others.
I agree with this, which is why I said “worse off in expected utility” at the beginning of the thread. But I think you need “would not be able to predict ahead of time” in a fairly strong sense, namely that they would not be able to predict it even if they knew all the details of how the FAI worked. Otherwise they’d want to adopt the conditional strategy “learn more about the FAI design, and try to shut it down if I learn that I will get the short end of the stick”. It seems like the easiest way to accomplish this is to design the FAI to explicitly not make certain people worse off, rather than depend on that happening as a likely side effect of other design choices.
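To make the expected-utility point concrete, here is a toy calculation with entirely made-up numbers: ex ante, everyone can come out ahead in expectation, but anyone who can inspect the design closely enough to learn they are the loser would rather shut the whole thing down.

```python
import random

# Toy model (numbers made up): 10 people, the FAI picks one at random to
# bear a cost of 5 while everyone else gains 2.
N = 10
GAIN, LOSS = 2.0, -5.0

def ex_ante_expected_utility():
    # Expected utility before anyone knows who the loser will be:
    # (9/10)*2 + (1/10)*(-5) = 1.3 > 0, so everyone benefits in expectation.
    return (N - 1) / N * GAIN + 1 / N * LOSS

def ex_post_utility(person, loser):
    # Utility once the FAI's choice (or enough of its design) is known.
    return LOSS if person == loser else GAIN

loser = random.randrange(N)
print("ex ante expectation:", ex_ante_expected_utility())         # 1.3
print("ex post, the loser gets:", ex_post_utility(loser, loser))  # -5.0
```

This is only a sketch of the incentive structure, not a claim about any particular FAI design; the point is that the ex ante guarantee dissolves as soon as someone can condition on the design details.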