Thanks for that term. This makes things clearer. Based on what you are arguing, does that make you a convergence theorist then? (Or at least, you seem to be defending convergence theory here, even if you don’t wholeheartedly accept it)
I dunno...I just find the orthogonality thesis intuitively obvious, and I’m having real trouble grasping what exactly the thought process that leads one to become a convergence theorist might be. I’m hoping you can show me what that thought process is.
The page even says it: “Thus to deny the Orthogonality thesis is to assert that there is a goal system G, such that...There cannot exist any efficient real-world algorithm with goal G.”
Now, I agree that there exist some G such that this is the case, but I don’t think this set would have anything to do with morality as humans understand it.
You seem to be making the argument that one of the characteristics that would automatically qualify something as a candidate for G is immorality.
This makes no intuitive sense. Why couldn’t you make an efficient real-world algorithm to destroy all life forms? It seems like, in the absence of some serious mathematical arguments to the contrary, we ought to dismiss out of hand the claim that efficient real-world algorithms for murder are impossible.
Thanks for that term. This makes things clearer. Based on what you are arguing, does that make you a convergence theorist then?
Why is that important?
I dunno...I just find the orthogonality thesis intuitively obvious, and I’m having real trouble grasping what exactly the thought process that leads one to become a convergence theorist might be. I’m hoping you can show me what that thought process is.
I think I can see where the intuitive appeal comes from, and I think I can see where the errors are too.
“Thus to deny the Orthogonality thesis is to assert that there is a goal system G, such that...There cannot exist any efficient real-world algorithm with goal G.”
I can see why that is appealing, but it is not equivalent to the claim that any intelligent and rational entity could have any goal. Of course you can write a dumb algorithm to efficiently make paperclips, just as you can build a dumb machine that makes paperclips. And of course an AI could...technically...design and/or implement such an algorithm, but it doesn’t follow that an AGI would do so. (Which is two propositions: it doesn’t follow that an AI could be persuaded to adopt such a goal, and it doesn’t follow that such a goal could be programmed in ab initio and remain stable.)
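For what it’s worth, the “dumb algorithm” point can be made concrete with a toy sketch (my own illustration, not anything either side actually proposed): a generic greedy optimizer in which the goal is just a swappable parameter, so the very same search machinery serves any objective.

```python
def greedy_search(state, actions, goal, steps):
    """Generic optimizer: at each step, apply whichever action
    most increases the (arbitrary) goal function."""
    for _ in range(steps):
        state = max((act(state) for act in actions), key=goal)
    return state

# Two unrelated goals, identical machinery:
actions = [
    lambda s: {**s, "paperclips": s["paperclips"] + 1},
    lambda s: {**s, "staples": s["staples"] + 1},
]
start = {"paperclips": 0, "staples": 0}

clippy = greedy_search(start, actions, goal=lambda s: s["paperclips"], steps=10)
stapler = greedy_search(start, actions, goal=lambda s: s["staples"], steps=10)
# clippy  -> {'paperclips': 10, 'staples': 0}
# stapler -> {'paperclips': 0, 'staples': 10}
```

Nothing in the loop cares which goal it was handed; that is the weak, “in principle” sense of orthogonality under discussion, and it says nothing about what a reflective AGI would actually choose.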
The Convergentist would want to claim:
“To assert the Orthogonality Thesis is to assert that no matter how intelligent and rational an agent, no matter the breadth of its understanding, no matter the strength of its commitment to objectivity, no matter its abilities to self-reflect and update, it would still never realise that making huge numbers of paperclips is arbitrary and unworthy of its abilities”
The orthogonality claim only has bite against Convergence/Moral Realism if it relates to all or most or typical rational intelligent agents, because that is how moral realists define their claim: they claim that ideal rational agents of a typical kind will converge, or that most rational-enough and intelligent-enough agents will converge. You might be able to build a (genuinely intelligent, reflecting and updating) Clippy, but that wouldn’t prove anything. The natural existence of sociopaths doesn’t disprove MR because they are statistically rare, and their typicality is in doubt. You can’t prove anything about morality by genetically engineering a sociopath.
As an argument against MR/C, Orthogonality has to claim that the typical, statistically common kind of agent could have arbitrary goals, and that the evidence of convergence amongst humans is explained by specific cultural or genetic features, not by rationality in general.
ETA:
Ben: [The Orthogonality Thesis] may be true, but who cares about possibility “in principle”? The question is whether any level of intelligence is PLAUSIBLY LIKELY to be combined with more or less any final goal in practice. And I really doubt it. I guess I could posit the alternative: Interdependency Thesis: Intelligence and final goals are in practice highly and subtly interdependent.
If we don’t understand the relationship between instrumental intelligence and goals, Clippies will seem possible—in the way that p-zombies do if you don’t understand the relationship between matter and consciousness.
Because I want to be sure that I’m understanding what the claim you’re making is.
The Convergentist would want to claim:
“To assert the Orthogonality Thesis is to assert that no matter how intelligent and rational an agent, no matter the breadth of its understanding, no matter the strength of its commitment to objectivity, no matter its abilities to self-reflect and update, it would still never realise that making huge numbers of paperclips is arbitrary and unworthy of its abilities”
Okay...so I agree with the Convergence theorist on what the implications of the Orthogonality Thesis are, and I still think the Orthogonality Thesis is true.
if it relates to all or most or typical rational intelligent agents, because that is how moral realists define their claim
Hold on now...that makes the claim completely different than what I thought we were talking about up till now. I thought we were talking about whether or not all rational agents would be in agreement about what morality is, independent of specifically human preferences?
We can have the other discussion too...but not before settling whether or not the Orthogonality Thesis is in fact true “in principle”. Remember, we originally started this discussion with my claim that morality is feelings/preference, as opposed to something you could figure out (i.e. something embedded into logic/game theory or the universe itself.) We weren’t originally talking about rational agents to shed light on evolution or plausible AI...we brought them in as hypothetical agents who converge upon the correct answer to any answerable question, to explore whether or not “what is good” is independent from “what do humans think is good”.
I don’t see how. What did you think we were talking about?

I thought we were talking about whether morality was something that could be discovered objectively.
I said:
Morality comes from the “heart”. It’s made of feelings.
Then you said:
People use feelings/System1 to do morality. That doesn’t make it an oracle. Thinking might be more accurate.
Then I said
Accurate? How can you speak of a moral preference being “accurate” or not? Moral preferences simply are.
You disagreed, and said
Moral objectivism isn’t obviously wrong,
To which I countered
all rational agents will converge upon mathematical statements, and will not converge upon moral statements.
You disagreed:
morality could work like convergence on mathematical truth.
Which is why
I thought we were talking about whether or not all rational agents would be in agreement about what morality is, independent of specifically human preferences?
Hence
if it relates to all or most or typical rational intelligent agents
doesn’t make any sense in our discussion. All rational agents converge on mathematical and ontological facts, by definition. My argument was that there is no such thing as a “moral fact”, and moral statements can only be discussed in reference to the psychology of a small set of creatures which includes humans and some other mammals. I argued that moral statements can’t be “discovered” true or false in any ontological or mathematical sense, nor are they deeply embedded into game theory (meaning it is not always in the interest of all rational agents to follow human morality) - even though game theory does explain how we evolved morality given our circumstances.
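The game-theory remark can be unpacked with the standard textbook sketch (my own construction, not part of the original exchange): in a one-shot Prisoner’s Dilemma defection dominates, so “cooperate” is not a theorem of rationality as such; but under repetition with reciprocity, mutual cooperators outscore mutual defectors. The circumstances, not rationality alone, do the work.

```python
# Standard Prisoner's Dilemma payoffs: (my move, their move) -> my score.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def play(strat_a, strat_b, rounds=100):
    """Total scores for two strategies in an iterated PD. A strategy
    maps the opponent's previous move (None on round one) to C or D."""
    score_a = score_b = 0
    prev_a = prev_b = None
    for _ in range(rounds):
        a, b = strat_a(prev_b), strat_b(prev_a)
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        prev_a, prev_b = a, b
    return score_a, score_b

always_defect = lambda prev: "D"
tit_for_tat = lambda prev: "D" if prev == "D" else "C"

# One-shot logic: defecting dominates (5 > 3 and 1 > 0). Repeated play:
coop_score, _ = play(tit_for_tat, tit_for_tat)        # 300 each
defect_score, _ = play(always_defect, always_defect)  # 100 each
```

Which behaviour is instrumentally rational depends entirely on the interaction structure, which is the “given our circumstances” caveat: change the payoffs or remove the repetition and cooperation stops paying.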
If you admit that at least one of all possible rational agents doesn’t converge upon morality, you’ve been in agreement with me this entire time—which means we’ve been talking about different things this entire time...so what did you think we were talking about?
All rational agents converge on mathematical and ontological facts, by definition.
Only by a definition whereby “rational” means “ideally rational”. In the ordinary sense of the term, it is perfectly possible for someone who is deemed “rational” in a more-or-less, good-enough sense to fail to understand some mathematical truths. The existence of the innumerate does not disprove the objectivity of mathematics, and the existence of sociopaths does not disprove the objectivity of morality.
If you admit that at least one of all possible rational agents doesn’t converge upon morality,
Do you believe that it is possible for a rational agent to fail to understand a mathematical truth? Because that seems rather commonplace to me. Unless you mean ideally rational....
I did mean ideally rational.
The whole point of invoking an ideal rational agent in the first place was to demonstrate that moral “truths” aren’t like empirical or mathematical truths in that you can’t discover them objectively through philosophy or mathematics (even if you are infinitely smart). Rather, moral “truths” are peculiar to humans.
If you want to illustrate the non-objectivity of morality, then stating that even ideal rational agents won’t converge on them is one way of expressing the point, although it helps to state the “ideal” explicitly. However, that is still only the expression of a claim, not the “demonstration” of one.
I’m not sure what you mean by “statistically common” here. Do you mean a randomly picked agent out of the set of all possible agents?
I mean one likely to be encountered, likely to evolve, or likely to be built (unless you are actually trying to build a Clippy).