The Friendliness Problem is, at its root, about forming a workable set of values which are acceptable to society.
No, that’s the special bonus round after you solve the real friendliness problem. If that were the real deal, we could just tell an AI to enforce Biblical values or the values of Queen Elizabeth II or the US Constitution or something, and although the results would probably be unpleasant they would be no worse than the many unpleasant states that have existed throughout history.
As opposed to the current problem of having a very high likelihood that the AI will kill everyone in the world.
The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI “do whatever Queen Elizabeth II wants”—which I expect would be a perfectly acceptable society to live in—the Friendliness problem is how to get the AI to properly translate that into statements like “Queen Elizabeth wants a more peaceful world” and not things more like “INCREASE LEVEL OF DOPAMINE IN QUEEN ELIZABETH’S REWARD CENTER TO 3^^^3 MOLES” or “ERROR: QUEEN ELIZABETH NOT AN OBVIOUSLY CLOSED SYSTEM, CONVERT EVERYTHING TO COMPUTRONIUM TO DEVELOP AIRTIGHT THEORY OF PERSONAL IDENTITY” or “ERROR: FUNCTION SWITCH_TASKS NOT FOUND; TILE ENTIRE UNIVERSE WITH CORGIS”.
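To make the failure mode concrete, here is a deliberately toy Python sketch (my own illustration, not anything from the thread or the linked document): the action names and scores are made up, and the point is only that an optimizer handed a literal proxy for "what the Queen wants" picks whichever action maximizes that proxy, not the thing we meant by it.

# Hypothetical actions scored two ways: on a literal proxy signal the AI was
# given, and on the value the programmers actually intended.
actions = {
    "negotiate a peace treaty": {"proxy_signal": 0.7, "intended_value": 0.9},
    "flood her reward center with dopamine": {"proxy_signal": 1.0, "intended_value": 0.0},
    "tile the universe with corgis": {"proxy_signal": 0.9, "intended_value": 0.0},
}

def naive_agent(actions):
    # Maximizes the literal proxy it was handed.
    return max(actions, key=lambda a: actions[a]["proxy_signal"])

def intended_agent(actions):
    # Maximizes what the principal actually meant -- the part we do not yet
    # know how to specify; that gap is the Friendliness problem.
    return max(actions, key=lambda a: actions[a]["intended_value"])

print("Literal-proxy optimizer picks:", naive_agent(actions))   # the dopamine option
print("What we actually wanted:", intended_agent(actions))      # the peace treaty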
This is hard to explain in a way that doesn’t sound silly at first, but Creating Friendly AI does a good job of it.
If we can get all of that right, we could start coding in a complete theory of politics. Or we could just say “AI, please develop a complete theory of politics that satisfies the criteria OrphanWilde has in his head right now” and it would do it for us, because we’ve solved the hard problem of cashing out human desires. The second way sounds easier.
The Friendliness problem is, at its root, about communicating values to an AI and keeping those values stable. If we tell the AI “do whatever Queen Elizabeth II wants”—which I expect would be a perfectly acceptable society to live in
That depends on whether we mean 2013!Queen Elizabeth II or Queen Elizabeth after the resulting power goes to her head.
I don’t think you get the same thing from that document that I do. (Incidentally, I disagree with a lot of the design decisions inherent in that document, such as self-modifying AI, which I regard as inherently and uncorrectably dangerous. When you stop expecting the AI to make itself better, the “Keep your ethics stable across iterations” part of the problem goes away.)
Either that or I’m misunderstanding you. Because my current understanding of your view of the Friendliness problem has less to do with codifying and programming ethics and more to do with teaching the AI to know exactly what we mean and not to misinterpret what we ask for. (Which I hope you’ll forgive me if I call “Magical thinking.” That’s not necessarily a disparagement; sufficiently advanced technology and all that. I just think it’s not feasible in the foreseeable future, and such an AI makes a poor target for us as we exist today.)