I should have explained things at much greater length. The intelligence I refer to in that context is general superintelligence, defined as that which surpasses human intelligence in all domains. Why is a native capacity for sociability implied?
A “God’s-eye view”, as David Pearce says, is an impersonal view, an objective rather than subjective view, a view that does not privilege one personal perspective over another, but takes the universe as a whole as its point of reference. This comes from the argued non-existence of personal identities. To check arguments on this, see this comment.
In practical terms, it’s very hard to change people’s intuitive opinions on this, even after many philosophical arguments. Those statements of mine don’t touch the subject; for that, the literature should be read, for instance the essay I wrote about it. But if we consider general superintelligences, then they could easily understand it and put it coherently into practice. It seems that this can be naturally expected, except perhaps in practice under some specific cases of human intervention.
Hi Stuart,
Why? This is the whole core of the disagreement, and you’re zooming over it way too fast. Even for ourselves, our wanting systems and our liking systems are not well aligned—we want things we don’t like, and vice-versa. A preference utilitarian would say our wants are the most important; you seem to disagree, focusing on the good/bad aspect instead. But what logical reason would there be to follow one or the other?
Indeed, wanting and liking do not always correspond, also from a neurological perspective. Wanting involves planning, and planning often involves error. We often want things mistakenly, be it for evolutionarily selected reasons, cultural reasons, or just bad planning. Liking is what matters, because it can be immediately and directly determined to be good, with the highest certainty. This is an empirical confirmation of its value, while wanting is like an empty promise.
We have good and bad feelings associated with some evolutionarily or culturally determined things. Theoretically, good and bad feelings could be associated with any inputs. The inputs don’t matter, nor does wanting necessarily matter, nor innate intuitions of morality. The only thing that has direct value, which is empirically confirmed, is good and bad feelings.
If there were a single consciousness in the universe, then maybe your argument could get off the ground. But we have many current and potential consciousnesses, with competing values and conscious experiences.
Well noticed. That comment was not well elaborated and is not a complete explanation. For the point you mention, it is also necessary to consider the philosophy of personal identity, which I examine in my more complete essay on Less Wrong, and also in my essay Universal Identity.
But in a way, that’s entirely a moot point. Your claim is that a certain ethics logically follows from our conscious reality. There I must ask you to prove it. State your assumptions, show your claims, present the deductions. You’ll need to do that, before we can start critiquing your position properly.
I have a small essay written on ethics, but it’s a detailed topic, and my article may be too concise, assuming much previous reading on the subject. It is here. I propose that we instead focus on questions as they come up.
Indeed, a robot could be built that makes paperclips or pretty much anything, for instance a paperclip-assembling machine. That’s an issue of practical implementation and not what the essay has been about, as I mention in the first paragraph and concede in the last.
The issue I argued about is that generally superintelligent agents, of their own will, without certain outside pressures from non-superintelligent agents, would understand personal identity and meta-ethics, leading them to converge to the same values and ethics. This is for two reasons: (1) they would need to take a “God’s eye view” and value all perspectives besides their own, and (2) they would settle on moral realism, with good and bad feelings, in the present or future, as the values.
I read that and similar articles. I deliberately didn’t say pleasure or happiness, but “reduced to good and bad feelings”, including other feelings that might be deemed good, such as love, curiosity, self-esteem, meaningfulness..., and including the present and the future. The part about the future includes any instrumental actions in the present which are taken with the intention of obtaining good feelings in the future, for oneself or for others.
This should cover visiting Costa Rica, having good sex, and helping loved ones succeed, which are the examples given in that essay against the simple example of Nozick’s experience machine. The experience machine is intuitively deemed bad because it precludes acting in order to instrumentally increase good feelings in the future and prevent bad feelings of oneself or others, and because pleasure is not what good feelings are all about. Pleasure is a very narrow part of the whole spectrum of good experiences one can have, precluding many of the others mentioned, and this makes the machine aversive.
The part about wanting and liking has neurological interest and has been well researched. It is not relevant for this question, because values need not correspond with wanting; they can just correspond with liking. Immediate liking is value; wanting is often mistaken. We want things which are evolutionarily or culturally caused, but that are not good for us. Wanting is like an empty promise, while liking can be empirically and directly verified to be good.
Any valid values reduce to good and bad feelings, for oneself or for others, in the present or in the future. This can be said of survival, learning, working, loving, protecting, sight-seeing, etc.
I say it again: I dare Eliezer (or others) to defend and justify a value that cannot be reduced to good and bad feelings.
Indeed, epiphenomenalism can seemingly be easily disproved by its implication that, if it were true, then we wouldn’t be able to talk about our consciousness. As I said in the essay, though, consciousness is that of which we can be most certain, by its directly accessible nature, and I would rather think that we are living in a virtual world within a universe with other, alien physical laws than that consciousness itself is not real.
A certain machine could perhaps be programmed with a utility function over causal continuity, but a privileged stance for one’s own values wouldn’t be rational in the absence of a personal identity, in an objective “God’s eye view”, as David Pearce says. That would call at least for something like coherent extrapolated volition, at least including agents with contextually equivalent reasoning capacity. Note that I use “at least” twice, to accommodate your ethical views. More sensible would be to include not only humans but all known sentient perspectives, because the ethical value(s) of subjects arguably depend more on sentience than on reasoning capacity.
I argue (in this article) that the you (consciousness) in one second bears little resemblance to the you in the next second.
In the subatomic world, the smallest passage of time changes our composition and arrangement to a great degree, instantly. In physical terms, the frequency of discrete change at this level, even in just one second, is a number with 44 digits, so vast as to be unimaginable… In comparison, the number of seconds that have passed since the start of the universe, estimated at 13.5 billion years ago, is a number with just 18 digits. At the most fundamental level, our structure, which seems outwardly stable, moves at a staggering pace, like furiously boiling water. Many of our particles are continually lost, and new ones are acquired, as blood frantically keeps matter flowing in and out of each cell.
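(As a rough check of these figures, assuming, though the passage does not name the unit, that the “frequency of discrete change” is counted in Planck times, $t_P \approx 5.39 \times 10^{-44}$ s:
\[
\frac{1\,\mathrm{s}}{t_P} \approx \frac{1}{5.39 \times 10^{-44}} \approx 1.9 \times 10^{43} \quad \text{(a 44-digit number)},
\]
\[
13.5 \times 10^{9}\,\mathrm{yr} \times 3.156 \times 10^{7}\,\mathrm{s/yr} \approx 4.3 \times 10^{17}\,\mathrm{s} \quad \text{(an 18-digit number)}.
\]
Both digit counts come out as stated.)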
In that paper I also explain why you can’t have partial identity, which argues against the position you took (similar to the one explained by the philosopher David Lewis in his paper Survival and Identity).
If we were to be defined as a precise set of particles or an arrangement thereof, its permanence in time would be implausibly short-lived; we would be born dead. If this were one’s personal identity, it would have been set for the first time at one’s baby state, upon one’s first spark of consciousness. At a subatomic level, each second is like many trillions of years in the macroscopic world, and our primordial state as babies would be incredibly short-lived. In the blink of an eye, our similarity to what that personal identity was would be reduced to a tiny fraction, if any, by the sheer magnitude of change. That we could survive, in a sense, as a tiny fraction of what we once were, would be a hypothesis that goes against our experience, because we always feel consciousness as an integrated whole, not as a vanishing separated fraction. We either exist completely or not at all.
I recommend reading it, whether you agree with this essay or not. There are two advanced and tenable philosophical positions on this subject: empty individualism, characterized by Derek Parfit in his book “Reasons and Persons”, and open individualism, for which there are better arguments, explained in 4 pages in my essay and at greater length in Daniel Kolak’s book “I Am You: The Metaphysical Foundations for Global Ethics”.
For another interesting take on the subject here on Less Wrong, check Kaj Sotala’s An attempt to dissolve subjective expectation and personal identity.
Being open to criticism is very important, and the bias to devalue it should be resisted. Perhaps I did define the truth conditions later on (see below).
“There is a difference between valid and invalid human values, which is the ground of justification for moral realism: valid values have an epistemological justification, while invalid ones are based on arbitrary choice or intuition. The epistemological justification of valid values occurs by that part of our experiences which has a direct certainty, as opposed to indirect: conscious experiences in themselves.”
I find your texts here on ethics incomplete and poor (for instance, this one, which shows a lack of understanding of the topic and is naive). I dare you to defend and justify a value that cannot be reduced to good and bad feelings.
Indeed, the orthogonality thesis in that practical sense is not what this essay is about, as I explain in the first paragraph and concede in the last. This article addresses the assumed orthogonality between ethics and intelligence, particularly general superintelligence, based on considerations from meta-ethics and personal identity, and argues for convergence.
There seems to be surprisingly little argumentation in favor of this convergence, which is baffling to me, given how clear and straightforward I take it to be, though it requires an understanding of meta-ethics and of personal identity which is rare. Eliezer has, at least in the past, stated that he had doubts regarding both philosophical topics, while I claim to understand them very well. These doubts should merit an examination of the matter I’m presenting.
What if I changed the causation chain in this example, and instead of having the antagonistic values caused by the identical agents themselves, I had myself inserted the antagonistic values into their memories while replicating them? I could have picked the antagonistic value from the mind of a different person and put it into one of the replicas, complete with a small reasoning or justification in its memory.
They would both wake up, one with one value in their memory, and the other with an antagonistic value. What would make one of them correct and not the other? Could both values be correct? The issue here is whether any values whatsoever can be validly held by similar beings, or whether a good justification is needed. In CEV, Eliezer proposed that we can make errors about our values, and that they should be extrapolated according to the reasoning we would make if we had higher intelligence.
OK, that is the interpretation I found less convincing. The bare axiomatic normative claim that all the desires and moral intuitions not concerned with pleasure as such are errors with respect to maximization of pleasure isn’t an argument for adopting that standard.
The argument for adopting that standard was based on the epistemological prevalence of the goodness and badness of good and bad feelings, while other hypothetical intrinsic values could be so only by much less certain inference. But I’d also argue that the nature of how the world is perceived necessitates conscious subjects, and reason that, in their absence, or in a universe eternally without consciousness, nothing could possibly matter ethically. Consciousness is therefore given special status, and good and bad relate to it.
And given the admission that biological creatures can and do want things other than pleasure, have other moral intuitions and motivations, and the knowledge that we can and do make computer programs with preferences defined over some model of their environment that do not route through an equivalent of pleasure and pain, the connection from moral philosophy to empirical prediction is on shakier ground than the purely normative assertions.
Biological creatures indeed have other preferences, but I classify those in the error category, as Eliezer justifies in CEV. Their validity could be argued on a case-by-case basis, though. Machines could be made unconscious or without the capacity for good and bad feelings; they would then need to infer the existence of these by observing living organisms and their culture (in this case, their certainty would be similar to that of their world model), or possibly by being very intelligent and deducing it from scratch (if that is even possible); otherwise they might be morally anti-realist. In the absence of real values, I suppose, they would have no logical reason to act one way or another, considering meta-ethics.
Once one is valuing things in a model of the world, why stop at your particular axiom? And people do have reactions of approval to their mental models of an equal society, or a diversity of goods, or perfectionism, which are directly experienced.
You can say that you might pursue something vaguely like X, which people feel is morally good or obligatory as such, is instrumental in pursuit of Y. But that doesn’t change the pursuit of X, even in conflict with Y.
I think that these values need to be justified somehow. I see them as instrumental values, for their tendency to lead to the direct values of good feelings, which take a special status by being directly verified as good. Decision theory and practical ethics are very complex, and sometimes one would take an instrumentally valuable action even to the detriment of a direct value, if the action is expected to yield even more direct value in the future. For instance, one might spend a lot of time learning philosophical topics, even to the detriment of direct pleasure, if one sees it as likely to be important to the world, causing good feelings or preventing bad feelings in an unclear but potentially significant way.
Hi Carl,
Thank you for a thoughtful comment. I am not used to writing didactically, so forgive my excessive conciseness.
You understood my argument well, in the five points, with the detail that I define value as good and bad feelings rather than as pleasure, happiness, suffering, and pain. The former definition allows for subjective variation and universality, while the latter utilitarian definition is too narrow and anthropocentric, and could be contested on these grounds.
What kind of value do you mean here? Impersonal ethical value? Impact on behavior? Different sorts of pleasurable and painful experience affect motivation and behavior differently, and motivation does not respond to pleasure or pain as such, but to some discounted transformation thereof. E.g. people will accept a pain 1 hour hence in exchange for a reward immediately when they would not take the reverse deal.
I mean ethical value, but not necessarily impact on behavior or motivation. Indeed, people do accept trades between good and bad feelings, and they can be biased in terms of motivation.
Does this apply to other directly felt moral intuitions, like anger or fairness? Later you say that our best theories show that personal identity is an illusion, despite our perception of continued existence over time, and so we would discard it. What distinguishes the two?
It does not apply in the same way to other moral intuitions, like anger or fairness. These are directly felt in some way, and in this sense they are real, but they also have a context related to the world that is indirectly felt and could be false. Anger, for instance, can be directly felt as a bad feeling, but its causation and subsequent behavioral motivation relate to the outside world, and are at another level of certainty (not as certain). Likewise, it could be said that whatever caused good or bad feelings (such as kissing a woman) is not universal and not as certain as the good feeling itself which was caused by it in a person and was directly verified by them. This person doesn’t know whether he is inside a Matrix-like virtual world and whether the woman was really a woman or just computer data, but he knows that the kiss led to directly felt good feelings. The distinction is that one relates to the outside world, and the other relates to itself.
How are good and bad feelings physical occurrences in a way that knowledge or health or equality or the existence of other outcomes that people desire are not?
Good question. The goodness and badness of feelings is directly felt as such, and is a datum of the highest certainty about the world, while the goodness or badness of these other physical occurrences (which are indirectly felt) is not a datum but an inference, which, though generally trustworthy, needs to be justified eventually by being connected to intrinsic values.
Earlier you privileged pleasure as a value because it is directly experienced. But an organism directly experiences, and is conditioned or reinforced by its own pain or pleasure.
Indeed. However, in acting on the world, an organism has to assume a model of the world which it is going to trust as true, in order to act ethically. In this model of the world, in the world as it appears to us, the organism would consider the nature of personal identity and not privilege its own viewpoint. However, you have a point that, strictly, one’s own experiences are more certain than those of others. The difference in this certainty could be thought of as the difference between direct conscious feelings and physical theories. Let’s say that the former get ascribed a certainty of 100%, while the latter get 95%. The organism might then put 5% more value on its own experiences, not fundamentally, but based on the solipsistic hypothesis that other people are zombies, or that they don’t really exist.
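(As a minimal sketch of that weighting, assuming the certainties act directly as weights on experiences; the symbols here are illustrative, not from the original discussion:
\[
V_{\text{total}} = 1.00 \cdot v_{\text{self}} + 0.95 \sum_{i} v_{\text{other},i},
\]
so one’s own experiences count for about $1.00/0.95 \approx 1.05$ times as much, roughly the 5% mentioned above.)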
Error in what sense? If desires are mostly learned through reward and anticipations of reward, one can note when the resulting desires do not maximize some metric of personal pleasure or pain (e.g. to be remembered after one dies, or for equality). But why identify with the usual tendency of reinforcement learning rather than the actual attitudes and desires one has?
I meant, in that case, intrinsic values. But what you mentioned, for instance equality, can be thought of as instrumental values. Instrumental values are taken as heuristics, or in decision theory as patterns of behavior, that usually lead to intrinsic values. Indeed, in order to achieve direct or intrinsic value, the best way tends to be following instrumental values, such as working, learning, increasing longevity… I argue that the validity of these can be examined by the extent to which they lead to direct value, that is, good and bad feelings, in a non-personal way.
Where do you include environmental and cultural influences?
While these vary, I don’t see legitimate values that could be affected by them. Could you provide examples of such values?
This does not follow. Maybe you need to give some examples. What do you mean by “correct” and “error” here?
Imagine that two exact replicas of a person exist in different locations, exactly the same except for an antagonism in one of their values. They could not both be correct at the same time about that value. I mean error in the sense, for example, that Eliezer employs in Coherent Extrapolated Volition: the error that comes from insufficient intelligence in thinking about our values.
This is a contentious attempt to convert everything to hedons. People have multiple contradictory impulses, desires and motives which shape their actions, often not by “maximizing good feelings”.
Except in the aforementioned sense of error, could you provide examples of legitimate values that don’t reduce to good and bad feelings?
Really? Been to the Youtube and other video sites lately?
I think that the literature about masochism is better evidence than YouTube videos, which could be isolated incidents of people who are not regularly masochistic. If you have evidence from those sites, I’d like to see it.
This is wrong in so many ways, unless you define reality as “conscious experiences in themselves”, which is rather non-standard. In any case, unless you are a dualist, you can probably agree that your conscious experiences can be virtual as much as anything else.
Even being virtual, or illusory, they would still be real occurrences, and real illusions, being directly felt. I mean that in the sense of Nick Bostrom’s simulation argument.
Uhh, that post sucked as well.
Perhaps it was not sufficiently explained, but check this introduction on Less Wrong, then, or the comment I made below about it:
http://lesswrong.com/lw/19d/the_anthropic_trilemma/
I have read many sequences, understand them well, and assure you that, if this post seems not to make sense, it is because it was not explained at sufficient length.
For the question of personal identity, another essay, posted on Less Wrong by Eliezer, is here:
http://lesswrong.com/lw/19d/the_anthropic_trilemma/
However, while this essay presents the issue, it admittedly does not solve it, and expresses doubt that it would be solved in this forum. The solution exists in philosophy, though. For example, in the first essay I linked to, in Daniel Kolak’s work “I Am You: The Metaphysical Foundations for Global Ethics”, or also, in a partial form, in Derek Parfit’s work “Reasons and Persons”.
I tend to be a very concise writer, assuming a quick understanding from the reader, and I don’t perceive very well what is obvious and what isn’t to people. Thank you for the advice. Please point to specific parts that you would like further explaining or expanding, and I will provide it.
David, what are those multiple possible defeaters for convergence? As I see it, the practical defeaters that exist still don’t affect the convergence thesis; they are just possible practical impediments, from unintelligent agents, to the realization of the goals of convergence.
One argument is the one from empiricism or verification. Wanting can be, and often is, wrong. Simple examples can show this, but I assume they won’t be needed because you understand. Liking can be misleading in terms of motivation or in terms of the external object which is liked, but it cannot be misleading or wrong in itself, in that it is a good feeling. For instance, a person could like to use cocaine, and this might be misleading in terms of being a wrong motivation, one that in the long term would prove destructive and dislikeable. However, immediately, in terms of the sensation of liking itself, and all else being equal, it is certainly good, and this is directly verifiable by consciousness.
Taking this into account, some would argue for wanting values X, Y, or Z, but not values A, B, or C. This is another matter. I’m arguing that good and bad feelings are the direct values that have validity and should be wanted. Other valid values are those that are instrumentally reducible to these, of which there are very many, covering most of what we do.