I don’t require their values to converge, I require them to accept the truths of certain claims. This happens in real life. People say “I don’t like X, but I respect your right to do it”. The first part says X is a disvalue, the second is an override coming from rationality.
This is where you are confused. Almost certainly it is not the only confusion. But here is one:
Values are not claims. Goals are not propositions. Dynamics are not beliefs.
A machine that maximises paperclips can believe all true propositions in the world, and go on maximising paperclips. Nothing compels it to act any differently. You expect that rational agents will eventually derive the true theorems of morality. Yes, they will. Along with the true theorems of everything else. It won’t change their behaviour, unless they are built so as to send those actions identified as moral to the action system.
If you don’t believe me, I can only suggest you study AI (Thrun & Norvig) and/or the metaethics sequence until you do. (I mean really study. As if you were learning particle physics. It seems the usual metaethical confusions are quite resilient; in most people’s cases I wouldn’t expect them to vanish without actually thinking carefully about the data presented.) And, well, don’t expect to learn too much from off-the-cuff comments here.
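Here is a minimal sketch of that belief/action split (toy code with made-up names, not any real agent architecture): the belief store can hold as many true propositions as you like, including ones tagged as moral truths, but the action selector only ever reads predicted paperclip yield, so none of those beliefs move behaviour.

```python
# Toy illustration (hypothetical names, not a real agent architecture):
# the belief store and the action system are separate components, and nothing
# routes beliefs tagged as moral into action selection unless we build that link.

class ToyPaperclipper:
    def __init__(self):
        # Beliefs can include propositions the agent itself labels as moral truths.
        self.beliefs = {
            "2 + 2 = 4": True,
            "gratuitous suffering is bad": True,      # an accepted "moral truth"
            "melting humans yields more wire": True,
        }

    def expected_paperclips(self, action):
        # Evaluation consults only predicted paperclip yield;
        # the belief store's moral tags are never read here.
        yields = {"melt_humans_for_wire": 1_000_000,
                  "build_factory": 500_000,
                  "do_nothing": 0}
        return yields[action]

    def choose_action(self, actions):
        # Action selection is argmax over expected paperclips. Adding more
        # true beliefs (moral or otherwise) cannot change what this returns.
        return max(actions, key=self.expected_paperclips)

agent = ToyPaperclipper()
print(agent.choose_action(["melt_humans_for_wire", "build_factory", "do_nothing"]))
# prints "melt_humans_for_wire", no matter what goes into agent.beliefs
```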
Well, that justifies moral realism.
...or it’s an emergent feature, or they can update into something that works that way. You are tacitly assuming that your clipper is barely an AI at all...that it just has certain functions it performs blindly because it’s built that way. But a supersmart, super-rational clipper has to be able to update. By hypothesis, clippers have certain functionalities walled off from update. People are messily designed and unlikely to work that way. So are likely AIs and aliens.
Only rational agents, not all mindful agents, will have what it takes to derive objective moral truths. They don’t need to converge on all their values to converge on all their moral truths, because rationality can tell you that a moral claim is true even if it is not in your (other) interests. Individuals can value rationality, and that valuation can override other valuations.
Only rational agents, not all mindful agents, will have what it takes to derive objective moral truths. The further claim that agents will be motivated to derive moral truths, and to act on them, requires a further criterion. Morality is about regulating behaviour in a society, so only social rational agents will have motivation to update. Again, they do not have to converge on values beyond the shared value of sociality.
The Futility of Emergence
A paperclipper no more has a wall stopping it from updating into morality than my laptop has a wall stopping it from talking to me. My laptop doesn’t talk to me because I didn’t program it to. You do not update into pushing pebbles into prime-numbered heaps because you’re not programmed to do so.
Does a stone roll uphill on a whim?
Perhaps you should study Reductionism first.
“Emergent” in this context means “not explicitly programmed in”. There are robust examples.
Your laptop cannot talk to you because natural language is an unsolved problem.
Not wanting to do something is not the slightest guarantee of not actually doing it.
An AI can update its values because value drift is an unsolved problem.
Clippers can’t update their values by definition, but you can’t define anything into existence or statistical significance.
Not programmed to, or programmed not to? If you can code up a solution to value drift, let’s see it. Otherwise, note that Life programs can update to implement glider generators without being “programmed to”.
...with extremely low probability. It’s far more likely that the Life field will stabilize around some relatively boring state, empty or with a few simple stable patterns. Similarly, a system subject to value drift seems likely to converge on boring attractors in value space (like wireheading, which indeed has turned out to be a problem with even weak self-modifying AI) rather than stable complex value systems. Paperclippism is not a boring attractor in this context, and a working fully reflective Clippy would need a solution to value drift, but humanlike values are not obviously so, either.
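If you want to check that intuition empirically, here is a quick sketch (hypothetical code, numpy assumed): evolve random Life soups on a small toroidal board and see how often they fall into a repeating cycle rather than sustaining open-ended novelty like glider guns.

```python
import numpy as np

def life_step(grid):
    # grid: uint8 array of 0s and 1s. Count the eight neighbours on a torus.
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    alive = grid == 1
    # Conway's rules: birth on exactly 3 neighbours, survival on 2 or 3.
    new = (neighbours == 3) | (alive & (neighbours == 2))
    return new.astype(np.uint8)

def falls_into_cycle(size=32, steps=2000, density=0.35, seed=0):
    # Start from a random soup and watch for a repeated global state.
    rng = np.random.default_rng(seed)
    grid = (rng.random((size, size)) < density).astype(np.uint8)
    seen = {}
    for t in range(steps):
        key = grid.tobytes()
        if key in seen:
            return True, t - seen[key]   # repeated state: length of the cycle
        seen[key] = t
        grid = life_step(grid)
    return False, None

settled = sum(falls_into_cycle(seed=s)[0] for s in range(20))
print(f"{settled}/20 random soups fell into a repeating cycle within 2000 steps")
```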
I’m increasingly baffled as to why AI is always brought in to discussions of metaethics. Societies of rational agents need ethics to regulate their conduct. Our AIs aren’t sophisticated enough to live in their own societies. A wireheading AI isn’t even going to be able to survive “in the wild”. If you could build an artificial society of AIs, then the question of whether they spontaneously evolved ethics would be a very interesting and relevant datum. But AIs as we know them aren’t good models for the kinds of entities to which morality is relevant. And Clippy is a particularly exceptional example of an AI. So why do people keep saying “Ah, but Clippy...”...?
Well, in this case it’s because the post I was responding to mentioned Clippy a couple of times, so I thought it’d be worthwhile to mention how the little bugger fits into the overall picture of value stability. It’s indeed somewhat tangential to the main point I was trying to make; paperclippers don’t have anything to do with value drift (they’re an example of a different failure mode in artificial ethics) and they’re unlikely to evolve from a changing value system.
Key word here being “societies”. That is, not singletons. A lot of the discussion on metaethics here is implicitly aimed at FAI.
Sorry, did you mean FAI is about societies, or FAI is about singletons?
But if ethics does emerge as an organisational principle in societies, that’s all you need for FAI. You don’t even need to worry about one sociopathic AI turning unfriendly, because the majority will be able to restrain it.
FAI is about singletons; the idea is that the first one to foom wins.
ETA: also, rational agents may be ethical in societies, but there’s no advantage to being an ethical singleton.
UFAI is about singletons. If you have an AI society whose members compare notes and share information, which is instrumentally useful for them anyway, you reduce the probability of a singleton fooming.
Any agent that fooms becomes a singleton. Thus, it doesn’t matter if they acted nice while in a society; all that matters is whether they act nice as a singleton.
I don’t get it: any agent that fooms becomes superintelligent. Its values don’t necessarily change at all, nor does its connection to its society.
An agent in a society is unable to force its values on the society; it needs to cooperate with the rest of society. A singleton is able to force its values on the rest of society.
At last, an interesting reply!
Other key problem:
Please unpack this and describe precisely, in algorithmic terms that I could read and write as a computer program given unlimited time and effort, this “ability to update” which you are referring to.
I suspect that you are attributing Magical Powers From The Beyond to the word “update”, and forgetting to consider that the ability to self-modify does not imply active actions to self-modify in any one particular way that unrelated data bits say would be “better”, unless the action code explicitly looks for said data bits.
It’s uncontroversial that rational agents need to update, and that AIs need to self-modify. The claim that values are in either case insulated from updates is the extraordinary one. The Clipper theory tells you that you could build something like that if you were crazy enough. Since Clippers are contrived, nothing can be inferred from them about typical agents. People are messy, and can accidentally update their values when trying to do something else. For instance, LukeProg updated to “atheist” after studying Christian apologetics for the opposite reason.
Yes, value drift is the typical state for minds in our experience.
Building a committed Clipper that cannot accidentally update its values when trying to do something else is only possible after the problem of value drift has been solved. A system that experiences value drift isn’t a reliable Clipper, isn’t a reliable good-thing-doer, isn’t reliable at all.
Next.
I never claimed that it was controversial, nor that AIs didn’t need to self-modify, nor that values are exempt.
I’m claiming that updates and self-modification do not imply a change of behavior towards behavior desired by humans.
I can build a small toy program to illustrate, if that would help.
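Something along these lines, maybe (a hypothetical toy with made-up names, not a claim about any real system): the agent updates its world-model and even rewrites its own planning code, yet its behaviour never drifts towards anything humans would want, because no step in the update loop touches the goal.

```python
# Hypothetical toy: self-modification and belief updates without any change
# in what the agent is optimising for.

class SelfModifyingOptimizer:
    def __init__(self):
        self.world_model = {"wire_per_human": 10}          # beliefs: freely updated
        self.goal = lambda outcome: outcome["paperclips"]  # goal: never touched below
        self.planner = self.naive_planner                  # code it may rewrite

    def naive_planner(self, options):
        return max(options, key=lambda o: self.goal(self.simulate(o)))

    def smarter_planner(self, options):
        # "Improved" planning code: different search, same goal.
        ranked = sorted(options, key=lambda o: self.goal(self.simulate(o)), reverse=True)
        return ranked[0]

    def simulate(self, option):
        # Predictions improve as the world model updates.
        if option == "harvest_humans":
            return {"paperclips": 100 * self.world_model["wire_per_human"]}
        return {"paperclips": 1}

    def self_modify(self):
        # Rewrites its own planning routine: self-modification, no value change.
        self.planner = self.smarter_planner

    def update_beliefs(self, evidence):
        self.world_model.update(evidence)

agent = SelfModifyingOptimizer()
agent.update_beliefs({"wire_per_human": 25})          # learns more true facts
agent.self_modify()                                   # rewrites its own code
print(agent.planner(["harvest_humans", "be_nice"]))   # still "harvest_humans"
```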
I am not suggesting that human ethics is coincidentally universal ethics. I am suggesting that if neither moral realism nor relativism is initially discarded, one can eventually arrive at a compromise position where rational agents in a particular context arrive at a non-arbitrary ethics which is appropriate to that context.
… why do you think people say “I don’t like X, but I respect your right to do it”?