That’s the thing: all of humanity is going to have no bargaining power, so universal friendly bargaining needs to offer bargaining power to those who don’t have the ability to demand it.
What is the incentive for the people who have influence over the development of AI to implement such a thing? Why not include bargaining power only for the value systems of those people with influence?
Maybe there’s a utility function that reliably points away from hell no matter who runs it, but there are plenty of people who actually want some specific variety of hell for those they dislike, so they won’t run that utility function.
Now you are getting into the part where you are writing posts I would have written. We started out very close to agreeing anyway.
The reason is that failure to do this will destroy them too: bargaining that doesn’t support those who can’t demand it will destroy all of humanity. But that’s not obvious to most of them right now, and it won’t be until it’s too late.
What about bargaining which only supports those who can demand it in the interim before value lock-in, when humans still have influence? If people in power successfully lock in their own values into the AGI, the fact that they have no bargaining power after the AI takes over doesn’t matter, since it’s aligned to them. And if that set of values screws over others who don’t have bargaining power even before the AI takeover, that won’t hurt the people in power after the AI takes over.
Yep, this is pretty much the thing I’ve been worried about, and it always has been. I’d say that’s the classic inter-agent safety failure that has been ongoing since AI was invented in 12th-century France. But I think people overestimate how much they can control their children, and the idea that the people in power are going to successfully lock in their values without also protecting extant humans and other beings with weak bargaining power is probably a (very hard to dispel) fantasy.
What do you mean, AI was invented in 12th-century France?
And why do you think that locking in values to protect some humans and not others, or humans and not animals, or something like this, is less possible than locking in values to protect all sentient beings? What makes it a “fantasy”?