I wonder whether all this worrying about value stability isn’t losing sight of exactly this point—just whose values we are talking about.
As I understand it, the friendly values we are talking about are supposed to be some kind of cleaned-up averaging of the individual values of a population: the species H. sapiens. But as we ought to know from the theory of evolution, the properties of a population (whether we are talking about stature, intelligence, dentition, or values) are both variable within the population and subject to evolution over time. The reason for this change over time is not that the property is changing in any one individual, but that the membership of the population is changing.
In my opinion, it is a mistake to try to distill a set of essential values characteristic of humanity and then to try to freeze those values in time. There is no essence of humanity, no fixed human nature. Instead, there is an average (with variance) which has changed over evolutionary time and can be expected to continue to change as the membership in humanity continues to change over time. Most of the people whose values we need to consult in the next millennium have not even been born yet.
If enough people agree with you (and I’m inclined that way myself), then updating will be built into the CEV.
A preemptive caveat and apology: I haven’t fully read up everything on this site regarding the issue of FAI yet.
But something I’m wondering about: why all the fuss about creating a friendly AI, instead of a subservient AI? I don’t want an AI that looks after my interests: I’m an adult and no longer need a daycare nurse. I want an AI that will look after my interests AND obey me—and if these two come into conflict, and I’ve become aware of such conflict, I’d rather it obey me.
Isn’t obedience much easier to program in than human values? Let humans remain the judges of human values. Let AI just use its intellect to obey humans.
It will of course become a dreadful weapon of war, but that's the case with all technology. It will be a great tool in peacetime as well.
See The Hidden Complexity of Wishes, for example.
That is actually one of the articles I have read, but I didn't find it that convincing: the human could simply ask the genie to describe, in advance and in detail, how it intends to carry out his wishes, and then keep telling it "find another way" until he actually likes the course of action the genie describes.
Eventually the genie will be smart enough to start by proposing only courses of action the human would find acceptable; in the meantime there won't be much risk, because the man will always be able to veto the unacceptable courses of action.
In short, the issue of "safe" vs. "unsafe" only really arises when we allow the genie unsupervised and unvetoed action. And I reckon that humanity WILL be tempted to allow AIs unsupervised and unvetoed action (e.g., because of cases where AIs could have saved children from burning buildings but couldn't contact humans qualified to authorize them to do so), and that'll be a dreadful temptation and risk.
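The propose-and-veto loop described above can be sketched in a few lines. This is only a toy illustration of the protocol, not a claim about how a real system would work; `propose_plan` and `human_approves` are invented stand-ins for the genie's planner and the human judge.

```python
# A minimal sketch of the propose-and-veto protocol: the genie describes
# each candidate course of action in advance, and the human may veto it.
# `propose_plan` and `human_approves` are hypothetical stand-ins.

def supervised_wish(wish, propose_plan, human_approves, max_rounds=100):
    """Ask the genie for plans until the human accepts one.

    The genie never acts on its own: every candidate plan is described
    in advance and can be rejected ("find another way").
    """
    rejected = []
    for _ in range(max_rounds):
        plan = propose_plan(wish, rejected)
        if human_approves(plan):
            return plan          # only now may the genie act
        rejected.append(plan)
    return None                  # no acceptable plan within the budget

# Toy demonstration with canned plans.
plans = iter(["burn the door down", "call a locksmith"])
chosen = supervised_wish(
    "open the door",
    propose_plan=lambda wish, rejected: next(plans),
    human_approves=lambda plan: "burn" not in plan,
)
```

Here the first plan is vetoed and the second accepted, so `chosen` ends up as `"call a locksmith"`. The sketch also makes the cost of the protocol visible: every single plan requires a round of human review, which is exactly the supervision burden discussed below.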
It’s not just extreme cases like saving children without authorization—have you ever heard someone (possibly a parent) saying that constant supervision is more work than doing the task themselves?
I was going to say that if you can't trust subordinates, you might as well not have them, but that's an exaggeration: tools can be very useful. It's fine that a crane doesn't have the capacity for independent action; it's still very useful for lifting heavy objects. [1]
In some ways, you get more safety by doing IA (intelligence augmentation), but while people are probably Friendly (unlikely to destroy the human race), they’re not reliably friendly.
[1] For all I know, these days the taller cranes have an active ability to rebalance themselves. If so, that’s still very limited unsupervised action.
That's only true if you (the supervisor) know how to perform the task yourself. However, there are a great many tasks that we don't know how to do but whose results we could evaluate if the AI did them for us. We could ask it to prove P != NP, to write provably correct programs, or to design machines, materials, and medications that we could test in the normal way that we test such things.