Fair enough. But I think an error theorist is committed to saying something like “FAI is impossible, so your claim to be Friendly is a lie.” In the game we are playing, a lie from the AI seems to completely justify destroying it.
More generally, if error theory is true, humanity as a whole is just doomed if hard-takeoff AI happens. There might be some fragment of humanity whose values are compatible, but Friendly-to-a-fragment-of-humanity AI is another name for unFriendly AI.
The moral relativist might say that fragment-Friendly is possible, and is a worthwhile goal. I’m uncertain, but even if that were true, fragment-Friendly AI seems to involve fixing a particular moral scheme in place and punishing any drift from that position. That doesn’t seem particularly desirable, especially since moral drift seems to be a brute fact of humanity’s moral life.
If (different) personal-FAIs are possible for many (most) people, you can divide the future resources in some way among the personal-FAIs of these people. We might call this outcome a (provisional) humanity-FAI.
Perhaps, but we already know that most people (and groups) are not Friendly. Making them more powerful by giving them safe-for-them genies seems unlikely to sum to Friendly-to-all.
In short, if there were mutually acceptable ways to divide the limited resources, we’d already be dividing them in those ways. The increased wealth from the industrial and information revolutions has reduced certain kinds of conflict, but it hasn’t abolished conflict. Unfortunately, the increased-wealth effect of AI doesn’t seem any likelier to abolish conflict; Friendliness is a separate property we’d like the AI to have, one that would actually solve this problem.
Perhaps, but we already know that most people (and groups) are not Friendly.
It’s not clear what you refer to by “Friendly” (I think the term should be tabooed rather than elaborated), and I have no idea what relevance the properties of humans have in this context.
Making them more powerful by giving them safe-for-them genies seems unlikely to sum to Friendly-to-all.
I sketched a particular device for you to evaluate. Whether it’s “Friendly-to-all” is a vaguer question than that (and I’m not sure what you understand by that concept), so I think it should be avoided. The relevant question is whether you would prefer the device I described (where you personally get the 1/Nth part of the universe with a genie to manage it) to deleting the Earth and everyone on it. In this context, even serious flaws (such as some of the other parts of the universe being mismanaged) may become irrelevant to the decision.