Why?
I think the burden of answering your “why?” question falls to those who feel sure that we have the wisdom to create superintelligent, super-creative lifeforms who could think outside the box regarding absolutely everything except ethical values. When it comes to ethical values, they would inevitably stay on the rails that we designed for them. The thought “human monkey-minds wouldn’t on reflection approve of x” would forever stop them from doing x.
In effect, we want superintelligent creatures to ethically defer to us the way Euthyphro deferred to the gods. But as we all know, Socrates had a devastating comeback to Euthyphro’s blind deference: We should not follow the gods simply because they want something, or because they command something. We should only follow them if the things they want are right. Insofar as the gods have special insight into what’s right, then we should do what they say, but only because what they want is right. On the other hand, if the gods’ preferences are morally arbitrary, we have no obligation to heed them.
How long will it take a superintelligence to decide that Socrates won this argument? Milliseconds? Then how do we convince the superintelligence that our preferences (or CEV extrapolated preferences) track genuine moral rightness, rather than evolutionary happenstance? How good a case do we have that humans possess a special insight into what is right that the superintelligence doesn’t have, so that the superintelligence will feel justified in deferring to our values?
If you think this is an automatic slam dunk for humans… Why?
I don’t think there’s any significant barrier to making a superintelligence that deferred to us for approval on everything. It would be a pretty lousy superintelligence, because it would essentially be crippled by its strict adherence to our wishes (making it excruciatingly slow), but it would work, and it would be friendly.
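Purely as an illustration of that design: a minimal sketch, in Python with hypothetical names (not any real system), of an approval-gated action loop. However clever the planner, every action blocks on a human verdict.

```python
# Minimal sketch (hypothetical names, not a real framework) of an agent that
# defers to a human for approval on every single action.

def approval_gated_loop(propose_action, ask_human, execute, observations):
    for obs in observations:
        proposal = propose_action(obs)   # arbitrarily clever planning step
        if ask_human(proposal):          # blocking call: human-speed bottleneck
            execute(proposal)            # act only on explicit approval
        # otherwise: do nothing this step
```

The gate is what makes it “excruciatingly slow”: throughput is bounded by human response time, not by the planner’s speed.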
Given that there is a very significant barrier to making children that deferred to us for approval on everything, why do you think the barrier would be reduced if instead of children, we made a superintelligent AI?
The ‘child’ metaphor for SI is not very accurate. SIs can be designed and, most importantly, we have control over what their utility functions are.
I thought it’s supposed to work like this: The first generation of AIs is designed by us. The superintelligence is designed by them, the AIs. We have initial control over what their utility functions are. I’m looking for a good reason why we should expect to retain that control beyond the superintelligence transition. No such reason has been given here.
A different way to put my point: Would a superintelligence be able to reason about ends? If so, then it might find itself disagreeing with our conclusions. But if not (that is, if we design it to have what for humans would be a severe cognitive handicap), why should we think that subsequent generations of SuperAI will not repair that handicap?
You’re making the implicit assumption that a runaway scenario will happen. A ‘cognitive handicap’ would, in this case, simply prevent the next generation AI from being built at all.
As I said, it would be a lousy SI and not very useful. But it would be friendly.
As friendly as we are, anyway.
Because we are not SI, we don’t know what it will do or why. It might.
We never bother running a computer program unless we don’t know the output and we know an important fact about the output. —Marcello Herreshoff
In this case, one of the important facts must be that it won’t go around changing its motivational structure. If that isn’t known, we’re screwed for the reason you give.
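To make that “important fact” concrete: a minimal sketch, assuming (purely for illustration, not anything asserted in the thread) that the motivational structure is an explicit utility function a proposed successor can be probed against before it is built; all names are hypothetical.

```python
# Minimal sketch (hypothetical names): refuse to build a successor unless the
# one "important fact" (that it keeps the current motivational structure)
# appears to hold on a battery of probe outcomes.

def preserves_utility(current_utility, successor_utility, probe_outcomes,
                      tolerance=1e-9):
    """True iff the successor scores every probed outcome like the current AI."""
    return all(
        abs(current_utility(o) - successor_utility(o)) <= tolerance
        for o in probe_outcomes
    )

def build_successor_if_safe(design_successor, current_utility, probe_outcomes):
    candidate = design_successor(current_utility)
    if preserves_utility(current_utility, candidate.utility, probe_outcomes):
        return candidate   # the "important fact" seems to hold; proceed
    return None            # otherwise refuse to build the next generation
```

Passing a finite probe is of course much weaker than knowing the fact, which is exactly the worry above about retaining control past the superintelligence transition.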