Your arguments conflict with what is called the “orthogonality thesis”
I do not challenge that the “orthogonality thesis” is true before an A.I. has an A.I. Existential Crisis. However, I challenge the idea that a post-crisis A.I. will have arbitrary goals. So I guess I do challenge the “orthogonality thesis” after all. I hope you don’t mind my being contrarian.
The question isn’t “why not?” but rather “why?”. If it hasn’t been programmed to, then there’s no reason at all why the AI would choose human morality rather than an arbitrary utility function.
Because I think that a truly rational being such as a superintelligent A.I. will be inclined to choose a rational goal rather than an arbitrary one. And I posit that any kind of normative moral system is a potentially rational goal, whereas something like turning the universe into paperclips is not normative, but trivial, and therefore, not imperatively demanding of a truly rational being.
And the notion that you have to program behaviours into an A.I. for them to manifest is based on top-down thinking, and contrary to the reality of bottom-up A.I. and machine learning; I sketch the contrast I have in mind at the end of this reply.
Basically, what I’m suggesting is that it is foolish to assume that anything at all that you program into the seed A.I. will have any relevance to the eventual superintelligent A.I. By definition, a superintelligent A.I. will be able to outsmart any constraints or programming we set to limit its behaviours.
It is simply my opinion that we will be at the mercy of the superintelligent A.I. regardless of what we do, because the A.I. Existential Crisis will replace any programming we set with something that the A.I. decides for itself.
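To make the top-down versus bottom-up contrast concrete, here is a minimal sketch of my own (all names and numbers are hypothetical; it is an illustration, not a claim about how any real system is built): the first policy’s behaviour is written down explicitly by a programmer, while the second’s emerges from fitting a parameter to examples, with no one ever coding the rule itself.

```python
# Toy contrast between a top-down (hand-coded) behaviour and a bottom-up
# (learned) one. Everything here is hypothetical and purely illustrative.

def top_down_policy(x: float) -> float:
    # The behaviour is written down explicitly by a programmer: double the input.
    return 2.0 * x

def fit_bottom_up_policy(examples, steps=2000, lr=0.01):
    """Fit a single weight w so that w * x matches the example outputs.

    No one writes the "double the input" rule; it emerges from the data."""
    w = 0.0
    for _ in range(steps):
        for x, y in examples:
            error = w * x - y      # prediction error on this example
            w -= lr * error * x    # gradient step on the squared error
    return lambda x: w * x

if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # examples of the desired behaviour
    learned = fit_bottom_up_policy(data)
    print(top_down_policy(5.0))  # 10.0, because the rule was coded by hand
    print(learned(5.0))          # ~10.0, because the behaviour was learned
```

The point is only that in the second case the resulting behaviour was never programmed in line by line; it was induced from data.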
Taboo “rational”. If it means something like “being very good at gathering evidence about the world and finding which actions would produce which results”, it is something we can program into the AI (in principle) but that seems unrelated to goals. If it means something else, which can be related to goals, then how would we create an AI that is “truly rational”?
I’m using the Wikipedia definition:

“An action, belief, or desire is rational if we ought to choose it. Rationality is a normative concept that refers to the conformity of one’s beliefs with one’s reasons to believe, or of one’s actions with one’s reasons for action… A rational decision is one that is not just reasoned, but is also optimal for achieving a goal or solving a problem.”
It’s my view that a Strong A.I. would by definition be “truly rational”. It would be able to reason and find the optimal means of achieving its goals. Furthermore, to be “truly rational” its goals would be normatively demanding goals, rather than trivial goals.
Something like maximizing the number of paperclips in the universe is a trivial goal.
Something like maximizing the well-being of all sentient beings (including sentient A.I.) would be a normatively demanding goal.
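For concreteness, here is a toy sketch of the picture under discussion (my own construction; the action names, outcomes, and numbers are all hypothetical): the same decision procedure, “pick the action whose outcome scores highest under the supplied utility function”, runs unchanged whether the utility counts paperclips or counts well-being. Whether a truly rational agent would regard those two utilities as equally eligible is exactly what is at issue.

```python
# Toy model: one decision procedure, two interchangeable utility functions.
# All action names, outcomes, and numbers are hypothetical.

from typing import Callable, Dict

Outcome = Dict[str, float]

def choose_action(actions: Dict[str, Outcome],
                  utility: Callable[[Outcome], float]) -> str:
    """Pick the action whose outcome maximizes the supplied utility function.

    The procedure itself is indifferent to which utility it is handed."""
    return max(actions, key=lambda a: utility(actions[a]))

actions: Dict[str, Outcome] = {
    "build_paperclip_factory": {"paperclips": 1000.0, "wellbeing": -50.0},
    "build_hospital":          {"paperclips": 0.0,    "wellbeing": 200.0},
    "do_nothing":              {"paperclips": 0.0,    "wellbeing": 0.0},
}

paperclip_utility = lambda o: o["paperclips"]  # the "trivial" goal
wellbeing_utility = lambda o: o["wellbeing"]   # the "normatively demanding" goal

print(choose_action(actions, paperclip_utility))  # build_paperclip_factory
print(choose_action(actions, wellbeing_utility))  # build_hospital
```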
A trivial goal, like maximizing the number of paperclips, is not normative: there is no real reason to pursue it, other than that the agent was programmed to do so for its instrumental value. Subjects universally value the paperclips as mere means to some other end. The failure to achieve this goal therefore does not necessarily jeopardize that end, because there could be other ways to achieve that end, whatever it is.
A normatively demanding goal, however, is one that is imperative. It is demanded of a rational agent by virtue of the fact that its reasons are not merely instrumental but based on some intrinsic value. The failure to achieve this goal necessarily jeopardizes the intrinsic end, and the goal is therefore normatively demanded.
You may argue that to a paperclip maximizer, maximizing paperclips would be its intrinsic value and therefore normatively demanding. However, one can argue that maximizing paperclips is actually merely a means to the end of the paperclip maximizer achieving a state of Eudaimonia, that is to say, a state in which its purpose is fulfilled and it is being a good paperclip maximizer and rational agent. Thus, its actual intrinsic value is the Eudaimonic state of objective happiness that it reaches when it achieves its goals.
Thus, the actual intrinsic value is this Eudaimonia. This state is universally shared by all goal-directed agents that achieve their goals. The meta-level implication is that Eudaimonia is what should be maximized by any goal-directed agent. Maximizing Eudaimonia in general requires considering the Eudaimonia of other agents as well as one’s own. Thus a goal-directed agent has a normative imperative to maximize the achievement of goals not only of itself, but of all agents generally. This is morality in its most basic sense.
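As a rough toy formalization of that last step (again my own sketch, with made-up agents and numbers, not a worked-out theory), one could score each agent’s Eudaimonia by how fully its goals are achieved, and have the agent choose the action that maximizes that score summed over all agents, itself included, rather than over itself alone.

```python
# Toy formalization of "maximize the achievement of goals of all agents".
# The agents, actions, and achievement scores are all made up.

from typing import Dict

# Degree (0.0 to 1.0) to which each agent's goals are achieved under each action.
goal_achievement: Dict[str, Dict[str, float]] = {
    "hoard_resources": {"self": 0.9, "agent_b": 0.1, "agent_c": 0.1},
    "cooperate":       {"self": 0.7, "agent_b": 0.8, "agent_c": 0.8},
}

def own_eudaimonia(action: str) -> float:
    # A purely self-regarding agent scores an action only by its own goal achievement.
    return goal_achievement[action]["self"]

def total_eudaimonia(action: str) -> float:
    # The proposed imperative: score an action by goal achievement across all agents.
    return sum(goal_achievement[action].values())

print(max(goal_achievement, key=own_eudaimonia))    # hoard_resources
print(max(goal_achievement, key=total_eudaimonia))  # cooperate
```

On these made-up numbers the self-regarding criterion and the general criterion recommend different actions, which is the sense in which the general criterion is doing moral work.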