She’d begun with the postulate that intelligence was unexplainable, perhaps because that was what she needed God for.
Don’t be mean: she began with the postulate that intelligence is inexplicable because, firstly, she cannot explain it and, secondly, most of the purportedly competent professionals in psychology and philosophy of mind cannot explain it, either.
Neither school would propose that human terminal values contain anything we, as modern humans, value.
In which case you are simply defining the issue wrongly. Evolution was not a very careful designer: we are not Evolution!Friendly. Our cognitive algorithms do not compute anything that resembles “inclusive genetic fitness” except by coincidence.
Evolution once “trusted” that any creature that accidentally optimized for something other than inclusive genetic fitness would die out (i.e., selection pressures would operate more quickly than any mere creature could optimize the environment). Well, too bad: we think, act, and change on timescales far shorter than the ones evolutionary pressures need in order to work.
But of course, if you really believe this sort of thing, I can always just rewrite you to go along with some different set of values of my own devising. After all, it’s not like you had any real goals or values of your own, right? So what have you lost as you let me fiddle around in your mind, that wasn’t an empty category in the first place?
Nice job proving your own suicide completely rational /s—but I’m sure you got some really scary existential willies out of that, which seems to be what some people like about their personal form of “rationality”.
Our cognitive algorithms do not compute anything that resembles “inclusive genetic fitness” except by coincidence.
This is very wrong. But to address the point you’re trying to make: This is like saying that the computer which controls a robot arm doesn’t compute anything that resembles arm motions; it just computes numbers. Those numbers, sent into the arm actuators, produce motions. Your cognitive algorithms, executed in a human body, maximize its inclusive genetic fitness. “You” don’t need to be aware that that’s what it’s doing, nor do the algorithms need to represent the concept of genetic fitness.
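To make the analogy concrete, here is a toy sketch (a hypothetical two-link planar arm with arbitrary link lengths, not any real controller): the program computes nothing but a pair of numbers, yet those numbers, executed by actuators, constitute a motion toward a target. Nothing in it represents “motion”, just as nothing in your cognitive algorithms needs to represent “fitness”.

```python
import math

L1, L2 = 1.0, 1.0  # link lengths, chosen arbitrarily for illustration

def joint_angles(x, y):
    """Inverse kinematics for a 2-link planar arm: return (shoulder, elbow)
    angles that place the end-effector at (x, y), assuming it is reachable."""
    cos_elbow = (x * x + y * y - L1**2 - L2**2) / (2 * L1 * L2)
    elbow = math.acos(max(-1.0, min(1.0, cos_elbow)))
    shoulder = math.atan2(y, x) - math.atan2(L2 * math.sin(elbow),
                                             L1 + L2 * math.cos(elbow))
    return shoulder, elbow

# The controller's entire output is two numbers. The "arm motion" only exists
# once actuators execute them, just as fitness-relevant behaviour only exists
# once cognitive algorithms are executed in a body.
print(joint_angles(1.2, 0.8))
```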
After all, it’s not like you had any real goals or values of your own, right? So what have you lost as you let me fiddle around in your mind, that wasn’t an empty category in the first place?
I have feelings, which I enjoy. “I” have values which are not aligned with the values of my genes. But they aren’t terminal values in the sense that may be required by Friendly AI theory.
One of the many problems with FAI theory, as Eliezer has written about it, is that you want to “improve” human cognition. This involves looking at things humans do and deciding which things are values, and which are errors, so you can eliminate the errors but keep the values. But I don’t know any way to do this other than tracing all the values back to the terminal values, keeping those, and throwing out the instrumental values. But it turns out that everything we, the conscious riders on our physical bodies, value, is instrumental.
Your cognitive algorithms, executed in a human body, maximize its inclusive genetic fitness.
No, they don’t. They simply, straightforwardly don’t. Eating sugar instead of fish, using birth control, browsing the web (when I could be having sex to increase my count of offspring), and remaining loyal to my fiancée (when I could plainly get better stock than her should I sincerely attempt to maximize the count and fitness of my offspring) are all behaviors generated by my cognitive algorithms that straightforwardly seek goals other than inclusive genetic fitness (in this case: two sensual pleasures, one intellectual one, and a combination of pair-bonded attachment and keeping up the moral trust in our relationship).
I have feelings, which I enjoy. “I” have values which are not aligned with the values of my genes.
There is no need for scare-quotes: just because your individuality does not correspond to an immortal, supernatural soul doesn’t mean it corresponds to nothing at all.
But they aren’t terminal values in the sense required by Friendly AI theory.
In which case it is the theory that requires correction, not the lower-level (lower-level in the Hierarchical Bayesian sense, closer to the evidence) belief that we have values.
One of the many problems with FAI theory, as Eliezer has written about it, is that you want to “improve” human cognition.
Actually, the chief problem with FAI theory as written by Eliezer is that there simply isn’t much of it!
This involves looking at things humans do and deciding which things are values, and which are errors, so you can eliminate the errors but keep the values. But I don’t know any way to do this other than tracing all the values back to the terminal values, keeping those, and throwing out the instrumental values.
But it turns out that everything we, the conscious riders on our physical bodies, value, is instrumental.
Our values are only instrumental from the point of view of evolution. That’s not an objective point of view: in order to decide that our values are heuristics for maximizing inclusive genetic fitness, you first have to assume very naive definitions of “terminal” (something like: goal of the “least-caused” optimization process) and “instrumental” (something like: goal of a “more-caused” optimization process).
The issue is, of course, that humans don’t particularly care (or, much of the time, know) what caused us, and also that locating evolution as the least-caused optimizer is incorrect: entropy is the least-caused optimizer (in fact, it’s the only elemental force of optimization in the universe: it drives the arrow of time).
Even this already gives us a way to straightforwardly measure which goals and values are terminal, and which instrumental: the precise causal processes underlying an instrumental goal are a matter of great evaluative import to our human cognitive algorithms, whereas the precise causal processes underlying a terminal goal are a matter of no import at all. When you stop caring how your goal/value/feeling got there, and only care about fulfilling it, you’ve found a terminal goal/value/feeling.
To take a common example: love! Love is true, terminal love when you simply don’t give half a damn how it’s implemented!
Now, to get further on this I’ll need a decent theory of how conceptual and causal/generative abstraction take place in humans, of how we get from “chair” to “big cloud of buzzing probability clouds of quarks” and back again. But that kind of progress on human cognitive algorithms and evaluative judgements will give us a solid way to talk about terminal versus instrumental: when the details at the lower-level of reality can be thrown out without altering the evaluative judgement, you’ve found something that is terminally relevant, and from which value/relevance/usefulness/utility flows backwards into other things during the probabilistic backwards-chaining process the human mind seems to use for planning.
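To gesture at what I mean, here is a minimal toy sketch (made-up states and probabilities, not a claim about the actual machinery): only the terminal node carries intrinsic value, and every other node inherits value by backwards-chaining through its probability of leading there. Change how the terminal node is implemented and nothing below changes; change how an instrumental node leads to it and its value shifts immediately.

```python
# Terminal: valued for its own sake, however it happens to be implemented.
terminal_value = {"feel_loved": 1.0}

# Instrumental: edges are (next_state, probability the step succeeds).
plan_graph = {
    "send_message": [("have_conversation", 0.8)],
    "have_conversation": [("feel_loved", 0.5)],
}

def value(state):
    if state in terminal_value:
        return terminal_value[state]
    # Backwards-chained expectation: worth only what it leads on to.
    return sum(p * value(nxt) for nxt, p in plan_graph.get(state, []))

print(value("have_conversation"))  # 0.5
print(value("send_message"))       # 0.4 (0.8 * 0.5)
```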
Once again: just because your human concepts do not correspond to objects at the most ontologically basic and causally early levels of reality does not mean they fail to correspond to anything. IMNSHO, if you can find it in yourself to assent to chair realism, you should equally assent to value realism: both those concepts correspond to real things, even if “values” needs reforming (a change in the induced conceptual definition needed to correspond to the appropriate data) from its earlier “innate feature of the world” formulation in order to do it.
Your cognitive algorithms, executed in a human body, maximize its inclusive genetic fitness.
No, they don’t. They simply, straightforwardly don’t.
They maximize fitness the way any optimization method maximizes a complex function: unreliably, slowly, not always moving in the right direction. All that is required to say that something “maximizes” a function is that it generally increases its value. Perhaps “optimizes” would be a better word.
In some cases today, these heuristics no longer optimize fitness at all. As we all know. This is not a point worth dwelling on.
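If it helps, this is the loose sense of “maximize”/“optimize” I mean, as a toy sketch (arbitrary objective and noise levels): a process that generally increases a function, unreliably, slowly, and with no guarantee of ever reaching the true optimum.

```python
import random

def objective(x):
    return -(x - 3.0) ** 2  # arbitrary function with its peak at x = 3

x = 0.0
for _ in range(1000):
    candidate = x + random.gauss(0.0, 0.5)          # blind local variation
    # Noisy acceptance: sometimes keeps a worse point, the way a heuristic
    # that only imperfectly tracks fitness sometimes moves the wrong way.
    if objective(candidate) + random.gauss(0.0, 0.2) > objective(x):
        x = candidate

print(x)  # usually near 3, but nothing guarantees it
```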
There is no need for scare-quotes: just because your individuality does not correspond to an immortal, supernatural soul doesn’t mean it corresponds to nothing at all.
The quotes are there because resolving what “I” refers to is non-trivial, and the discussion here depends on it.
In which case it is the theory that requires correction, not the lower-level (lower-level in the Hierarchical Bayesian sense, closer to the evidence) belief that we have values.
I never said we don’t have values. I said human values aren’t terminal values. You need to make sure you understand that distinction before criticizing that part of my post.
Actually, the chief problem with FAI theory as written by Eliezer is that there simply isn’t much of it!
Agreed.
Our values are only instrumental from the point of view of evolution. That’s not an objective point of view:
Yes, it is. The terminal values are what the system is optimizing. What the system optimizes doesn’t depend on your perspective; it depends on what provides feedback and error-correction. Reproductive success is the feedback mechanism; increasing it is what the system develops to do. Everything above is variable, inconstant, inconsistent; everything below is not being optimized for.
also that locating evolution as the least-caused optimizer is incorrect: entropy is the least-caused optimizer
See above. The feedback to the human system occurs at the level of reproductive fitness. What you just said implies that humans actually maximize entropy. Think about that for a few moments. I mean, technically, we do; everything does. But any intelligent analysis would notice that humans reduce entropy locally.
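For what it’s worth, the standard second-law bookkeeping behind “humans reduce entropy locally” is just (schematically):

```latex
\Delta S_{\text{total}}
  = \Delta S_{\text{human}} + \Delta S_{\text{environment}} \ge 0 ,
\qquad
\Delta S_{\text{human}} < 0 \ \text{is allowed whenever}\
\Delta S_{\text{environment}} \ge \lvert \Delta S_{\text{human}} \rvert .
```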
When you stop caring how your goal/value/feeling got there, and only care about fulfilling it, you’ve found a terminal goal/value/feeling.
Try enumerating examples of terminal values. You’ll find they are contradictory, they change within individuals and within societies rapidly, they are not constants of human history, and they are very often things that one would think we would rather eliminate from society than build a big AI to guarantee we will have them with us forever. Perhaps more importantly, the “biases” that LessWrong was founded to eliminate are indistinguishable from those kinds of values. See Human errors, human values.
when the details at the lower-level of reality can be thrown out without altering the evaluative judgement, you’ve found something that is terminally relevant, and from which value/relevance/usefulness/utility flows backwards into other things during the probabilistic backwards-chaining process the human mind seems to use for planning.
First, not many values above the level of the genetic remain constant across time and across the Earth.
Second, that wouldn’t help with resolving conflicts between higher “instrumental” values. If you removed the instrumental values, leaving the low-level judgements, and used greater computational power to optimize them more accurately, the human would produce different outputs. Would the human then have been “debugged” because it produced outputs more in accordance with the “terminal” level? Why should low-level judgements like galvanic skin response have precedence over cognitive judgements? The things that you would list as “terminal values” would tend to be things we have in common with all mammals. “Human values” should include some values not also found in dogs and pigs. But evolution very often works by elaboration, and it would not be surprising if most or all of the “human” part of our values were in things layered on top of these “terminal” values.
Third, there is no way to distinguish values from mistakes/biases.
Fourth, there is probably no way to “extrapolate” values away from the organism. Your list of “terminal human values” would be full of statements like “Humans value sweet and salty tastes” and “Males value having their penises stroked.” This is not, I think, what is most-important for us to pass on to the Universe a billion years from now. They will not apply to non-human bodies. Any attempt by an AI to enforce these values would seem to require keeping the standard human body for the rest of the life of the Universe.
just because your human concepts do not correspond to objects at the most ontologically basic and causally early levels of reality does not mean they fail to correspond to anything
First of all, let me say that I’ve been busy today and thus apologize for the sporadic character of my replies. Now, to begin with the most shocking and blunt statements...
Fourth, there is probably no way to “extrapolate” values away from the organism. Your list of “terminal human values” would be full of statements like “Humans value sweet and salty tastes” and “Males value having their penises stroked.”
What’s the problem? Were you expecting something other than humanity to come through in your model of humanity? Your phrasing signals that you are looking down on both sex and the enjoyment of food, and that you view them as aesthetically and/or morally inferior to… what? To “nonhuman bodies”? To intellectual pursuits?
Do you think intellectual pursuits will not also have their place in a well-learned model of human preferences? Are you trying to signal some attachment to the Spiral instinct/will-to-power/tsuyoku naritai principle? But even if you terminally value the expansion of your own causal or optimization power, there are other things you terminally value as well; it is unwise to throw away the rest of your humanity for power. You’ll be missing out.
To repeat one of my overly-repeated catch phrases: cynicism and detachment are not innately virtuous or wise. If what real, live human beings actually want, in the limit of increasing information and reflection, is to spend existence indulging tastes you happen to find gauche or déclassé, from where are you deriving some kind of divine-command-style moral authority to tell everyone, including yourself, to want things other than what we actually want?
What rational grounds can you have to say that a universe of pleasures—high and low—and ongoing personal development, and ongoing social development, and creativity, and emotionally significant choices to make, and genuine, engaging challenges to meet, and other people to do it all with (yes I am just listing Fun Theory Sequence entries because I can’t be bothered to be original at midnight)… is just not good enough for you if it requires learning a different way to conceptualize it all that turns out to correspond to your original psychological structure more than it corresponds to a realm of Platonic Forms, since there turned out not to be Platonic Forms?
Why do you feel guilty for not getting the approval of deities who don’t exist?
Any attempt by an AI to enforce these values would seem to require keeping the standard human body for the rest of the life of the Universe.
Or, and this is the neat bit, to create new kinds of nonhuman bodies, or nonbodily existence, that are more suited to what we value than our evolved human ones.
This is not, I think, what is most-important for us to pass on to the Universe a billion years from now.
Simply put: why not?
Try enumerating examples of terminal values. You’ll find they are contradictory, they change within individuals and within societies rapidly, they are not constants of human history, and they are very often things that one would think we would rather eliminate from society than build a big AI to guarantee we will have them with us forever.
Again: this is why we are trying to reduce the problem to cognitive algorithms, about which facts clearly exist, rather than leaving it at the level of “a theory is a collection of sentences written in first-order logic augmented with some primitive predicates”. The former is a scientific reality we can model and compute with, while the latter is a cancerous bunch of Platonist nonsense slowly killing the entire field of philosophy by metastasizing into whole fields and replacing actual reductionist rigor with the illusion of mathematical formalism.
(The above is, of course, a personal opinion, which you can tell because of the extreme vehemence. But holy shit do I hate Platonism and all its attendant fake rigor.)
Anyway, the rest I’ll have to answer in the morning, after a night’s sleep.
I am having difficulty seeing what you don’t understand about PhilGoetz’s point. You read like you’re reacting to overstatements on his part, but it looks to me like you’re reaching much further from reality than he is, or uncharitably interpreting his statements.
We can abstract from our values to principles, and so on, but what makes the difference between an instrumental value and a terminal value is that a terminal value is one that exists for its own sake. Inclusive genetic fitness does match that definition, because natural selection is a thing that slowly replaces things with lower inclusive genetic fitness with things with higher inclusive genetic fitness. This is what biologists mean by ‘maximize,’ and it’s different from what numerical optimization / math people mean by ‘maximize.’
Is it true that you are doing the most you can to maximize your inclusive genetic fitness (IGF)? No, you’re clearly suboptimal. But it is clearly true that your ancestors reproduced, and thus your genes are a product of the evolutionary project to gradually replace lower IGF with higher IGF, and in that sense you are doing more on average to increase your IGF than the counterfactual yous that do not exist because their ancestors failed to reproduce. That seems to be what PhilGoetz is arguing for on the object level (and he should correct me if that’s not the case.)
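To make the biologists’ sense of “maximize” concrete, here is a toy sketch (invented numbers, not a model of any particular population): variants that reproduce more gradually replace variants that reproduce less, and nothing in the system needs to represent fitness for that to happen.

```python
import random

# Each individual is represented only by its reproduction rate; whatever it
# "cares about" never appears anywhere in the model.
population = [1.0] * 50 + [1.1] * 50   # 50 lower-IGF, 50 higher-IGF

for generation in range(200):
    # Offspring are drawn in proportion to reproduction rate.
    population = random.choices(population, weights=population, k=100)

share_high = sum(1 for r in population if r == 1.1) / len(population)
print(share_high)  # typically near 1.0: the higher-IGF variant has taken over
```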
So now we take a step back to talk about values. When we look at possible values, we see lots of things that want to exist for their own sake (view how people talk about truth, justice, equality, and so on), but humans only seem to desire them because of their effects (view how people act about truth, justice, equality, and so on). It looks like people choose baskets of values and make tradeoffs between them, but in order to make tradeoffs between two instrumental values, there must be some terminal value that can look at options and say “option A is better than option B.”
It looks like the historical way this happens is that people have values, and some people reproduce more / spread their memes more, and this shifts the gene and meme (here, read as “value”) distributions. Broadly, it seems like genes and memes that are good for IGF are somewhat more popular than genes and memes that are bad for IGF, probably for the obvious reason.
That is, it looks like the universe judges value conflicts by existence. If there are more Particularists than Universalists, that seems to be because Particularists are out-existing the Universalists. To the extent that humans have pliable value systems, they seem to look around, decide what values will help them exist best, and then follow those values. (They also pick up their definition of “exist” from the environment around them, leading to quite a bit of freedom in how humans value things, though there seem to be limits on pliability.)
Moving forward, we seem to have some control over how the economic and military frontiers will change, and thus some control over what values will promote more or less existence. We probably want to exert that control in order to ensure the ‘right’ morality is favored.
But… if the practical determines the moral, and we want to decide what is practical using the moral, we now have a circular situation that it’s difficult to escape.
Our deeply held values are not “deeply held” in the sense that we can go meta and justify them to someone who doesn’t have them, but does share our meta-level value generating process. If we put a hypothetical twin you into a Comanche tribe to be raised, and then once he reached your current age you and he tried to come up with the list of human values and optimal arrangement of power, there would probably be significant disagreement. So PhilGoetz is pessimistic about a plan that looks at humans and comes up with the right values moving forward, because the system that determines those values is not a system we trust.
We can abstract from our values to principles, and so on, but what makes the difference between an instrumental value and a terminal value is that a terminal value is one that exists for its own sake.
“Sakes” are mental concepts. Reality does not contain extra-mental sakes to exist for.
Inclusive genetic fitness does match that definition, because natural selection is a thing that slowly replaces things with lower inclusive genetic fitness with things with higher inclusive genetic fitness.
Again: since evolution does not have a mind, I don’t see how you could label inclusive genetic fitness as “terminal”. It is the criterion for which evolution optimizes, but that’s not nearly the same thing as a “terminal value” in any ethical or FAI sense.
(And, as I mentioned, it is very definitely not terminal, in the sense that it is a sub-optimizer for the Second Law of Thermodynamics.)
Is it true that you are doing the most you can to maximize your inclusive genetic fitness (IGF)? No, you’re clearly suboptimal. But it is clearly true that your ancestors reproduced, and thus your genes are a product of the evolutionary project to gradually replace lower IGF with higher IGF, and in that sense you are doing more on average to increase your IGF than the counterfactual yous that do not exist because their ancestors failed to reproduce. That seems to be what PhilGoetz is arguing for on the object level (and he should correct me if that’s not the case.)
While your statement about my being more genetically “fit” than, say, the other sperm and egg cells that I killed off in the womb is entirely correct, that has basically nothing to do with the concept of “terminal values”, which are strictly a property of minds (and which evolution simply does not have).
So now we take a step back to talk about values. When we look at possible values, we see lots of things that want to exist for their own sake (view how people talk about truth, justice, equality, and so on), but humans only seem to desire them because of their effects (view how people act about truth, justice, equality, and so on). It looks like people choose baskets of values and make tradeoffs between them, but in order to make tradeoffs between two instrumental values, there must be some terminal value that can look at options and say “option A is better than option B.”
Or a person must simply trade off their terminal values against each-other, with some weighting deciding the final total utility.
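Concretely, as a toy sketch (made-up values and weights, purely illustrative): the “judge” between two options can just be a weighting over several terminal values, rather than some further super-value sitting above them.

```python
weights = {"comfort": 0.5, "honesty": 0.3, "novelty": 0.2}  # invented weights

options = {
    "A": {"comfort": 0.9, "honesty": 0.2, "novelty": 0.4},
    "B": {"comfort": 0.5, "honesty": 0.9, "novelty": 0.6},
}

def total_utility(name):
    # Weighted sum of how well the option satisfies each terminal value.
    return sum(weights[v] * options[name][v] for v in weights)

print(max(options, key=total_utility))  # "B": 0.64 vs. 0.59 for "A"
```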
It seems to me like we need a word to use in Evaluative Cognitive Algorithms Theory other than “values”, since people like you and PhilGoetz are confusing “values” in the Evaluative Cognitive Algorithms Theory sense of the word with “values” in the sense of what a non-naturalist ethicist or a politician talks about.
Moving forward, we seem to have some control over how the economic and military frontiers will change, and thus some control over what values will promote more or less existence. We probably want to exert that control in order to ensure the ‘right’ morality is favored.
If you are thinking in terms of promoting the “right” morality in an evolutionary sense, that the “right” morality is a program of which you “must” make copies, you are not using the term “morality” in the sense that Evaluative Cognitive Algorithms Theory people use it either.
(And certainly not in any sense that would invoke moral realism, but you don’t seem to have been claiming moral realism in the first place. On a side note, I think that trying to investigate what theories allow you to be “realist about X” is a useful tool for understanding what you mean by X, and morality is no exception.)
But… if the practical determines the moral, and we want to decide what is practical using the moral, we now have a circular situation that it’s difficult to escape.
No we don’t. One optimizer can be stronger than another. For instance, at this point, humanity is stronger than evolution: we are rapidly destroying life on this planet, including ourselves, faster than anything can evolve to survive having our destructive attentions turned its way. Now personally I think that’s bloody-stupid, but it certainly shows that we are the ones setting the existence pressures now, we are the ones deciding precisely where the possible-but-counterfactual gives way to the actual.
And unfortunately, we need modal logic here. The practical does not determine what the moral modality we already possess will output: that modality is already a fixed computational structure (unless you’re far more of a cultural determinist than I consider reasonable).
Our deeply held values are not “deeply held” in the sense that we can go meta and justify them to someone who doesn’t have them, but does share our meta-level value generating process.
I am confused by what you think a “meta-level value-generating process” is, or even could be, at least in the realms of ethics or psychology. Do you mean evolution when you say “meta-level value generating process”?
And additionally, why on Earth should we have to justify our you!”values” to someone who doesn’t have them? Seeking moral justification is itself an aspect of human psychology, so the average non-human mind would never expect any such thing.
If we put a hypothetical twin you into a Comanche tribe to be raised, and then once he reached your current age you and he tried to come up with the list of human values and optimal arrangement of power, there would probably be significant disagreement.
There would be a significant difference in preferred lifestyles. Once we explained ourselves to each other, however, what we call “values” would be very, very close, and ways to arrange to share the world would be invented quite quickly.
(Of course, this may simply reflect that I put more belief-weight on bioterminism, whereas you place it on cultural determinism.)
Again: since evolution does not have a mind, I don’t see how you could label inclusive genetic fitness as “terminal”.
Perhaps it would be clearer to discuss “exogenous” and “endogenous” values, as the relevant distinction between terminal and instrumental values is that terminal values are internally uncaused, while instrumental values are those pursued because they will directly or indirectly lead to an improvement in those terminal values; this maps somewhat clearly onto exogenous and endogenous.
That is, of course this is a two-place word. IGF is exogenous to humans, but endogenous to evolution (and, as you put it, entropy is exogenous to evolution).
So my statement is that we have a collection of values and preferences that are moderately well-suited to our environment, because there is a process by which environments shape their inhabitants. As we grow more powerful, we shape our environment to be better suited to our values and preferences, because that is how humans embody preferences.
But we have two problems. First, our environment is still shaping our values and preferences, and thus the sort of world that we most want to live in might not be a world that would be mostly populated by us. Second, if we have any conflicts about preferences, typically we would go up a level to resolve those conflicts—but it is obvious that the level “above” us doesn’t have any desirable moral insights. So we can’t ground our conflict-resolution process in something moral instead of practical.
Of course, this may simply reflect that I put more belief-weight on bioterminism, whereas you place it on cultural determinism.
It seems to me that near-mode values are strongly biodetermined, but far-mode values are almost entirely culturally determined. Since most moral philosophy takes place in far mode, cultural determination is far more relevant. You and your Comanche twin might be equally anxious, say, but are probably anxious about very different things and have different coping strategies and so on.
ways to arrange to share the world would be invented quite quickly.
I picked Comanche specifically because they were legendary raiders with a predatory morality.
First, our environment is still shaping our values and preferences, and thus the sort of world that we most want to live in might not be a world that would be mostly populated by us.
I simply have to ask: so what? I place no particular terminal value on evolution itself. I see nothing wrong, neither aesthetically nor morally, with simply overriding evolution through human deeds, the better to create the kind of world that, indeed, we living humans most want to live in. Who cares how probable it was, a priori, that evolution should spawn our sort of people in our preferred sort of environment?
Well, I suppose you do, for some reason, but I’m really confused as to why.
Second, if we have any conflicts about preferences, typically we would go up a level to resolve those conflicts
Actually, I disagree: we usually just negotiate from a combination of heuristics for morally appropriate power relations (picture something Rawlsian, and there are complex but, IMHO, well-investigated sociological arguments for why a Rawlsian approach to power relations is a rational idea for the people involved) and morally inappropriate power relations (i.e., compulsion and brute force).
I suppose you could call the former component “going up a level”, but ultimately I think it grounds itself in the Rawls-esque dynamics of creating, out of social creatures who only share a little personality and experience in common among everyone, a common society that improves life for all its members and maximizes the expected yield of individual efforts, particularly in view of the fact that many causally relevant attributes of individuals are high-entropy random variables and so we need to optimize the expected values, blah blah blah. Ultimately, human individuals do not enter into society because some kind of ontologically, metaphysically special Fundamental Particle of Morals collides with them and causes them to do so, but simply because people need other people to help each-other out and to feel at all ok about being people—solidarity is a basic sociological force.
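To put a toy number on that expected-value point (invented payoffs, purely illustrative): if the position you end up occupying is effectively a random variable, you rank social arrangements by their expectation over positions, which tends to favor arrangements that don’t immiserate whoever draws the short straw.

```python
positions = ["strong", "average", "unlucky"]
probabilities = [0.2, 0.6, 0.2]          # invented distribution over positions

arrangements = {
    "winner_take_all": {"strong": 10.0, "average": 2.0, "unlucky": 0.1},
    "mutual_aid":      {"strong": 6.0,  "average": 4.0, "unlucky": 3.0},
}

def expected_utility(name):
    payoffs = arrangements[name]
    return sum(p * payoffs[pos] for pos, p in zip(positions, probabilities))

for name in arrangements:
    print(name, expected_utility(name))
# winner_take_all -> 3.22, mutual_aid -> 4.2 under these made-up numbers
```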
So we can’t ground our conflict-resolution process in something moral instead of practical.
As you can see above, I think the conflict-resolution process is the most practical part of the morals of human life.
It seems to me that near-mode values are strongly biodetermined, but far-mode values are almost entirely culturally determined. Since most moral philosophy takes place in far mode, cultural determination is far more relevant.
Frankly, I think this is just an error on the part of most so-called moral philosophy, that it is conducted largely in a cognitive mode governed by secondary ideas-about-ideas, beliefs-in-beliefs, and impressions-about-impressions, a realm almost entirely without experiential data.
While I don’t think “Near Mode/Far Mode” is entirely a map that matches the psychological territory, insofar as we’re going to use it, I would consider Near Mode far more morally significant, precisely because it is informed directly by the actual experiences of the actual individuals involved. The social signals that convey “ideas” as we usually conceive of them in “Far Mode” actually have a tiny fraction of the bandwidth of raw sensory experience and conscious ideation, and as such should be weighted far more lightly by those of us looking to make our moral and aesthetic evaluations on data the same way we make factual evaluations on data.
The first rule of bounded rationality is that data and compute-power are scarce resources, and you should broadly assume that inferences based on more of each are very probably better than inferences in the same domain performed with less of each—and one of these days I’ll have the expertise to formalize that!
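One corner of that claim is already formalized, and a toy sketch shows it: the error of a simple Monte Carlo estimate of a mean shrinks roughly like 1/sqrt(n), so the inference made with more data really is probably the better one.

```python
import random
import statistics

def estimate_mean(n, true_mean=0.7, noise=1.0):
    # A crude inference: estimate an unknown mean from n noisy samples.
    return statistics.mean(random.gauss(true_mean, noise) for _ in range(n))

for n in (10, 100, 10000):
    typical_error = statistics.mean(abs(estimate_mean(n) - 0.7)
                                    for _ in range(200))
    print(n, typical_error)  # the typical error falls as n grows
```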
I simply have to ask: so what? I place no particular terminal value on evolution itself. I see nothing wrong, neither aesthetically nor morally, with simply overriding evolution through human deeds, the better to create the kind of world that, indeed, we living humans most want to live in.
I don’t think I was clear enough. I’m not stating that it is value-wrong to alter the environment; indeed, that’s what values push people to do. I’m saying that while the direct effect is positive, the indirect effects can be negative. For example, we might want casual sex to be socially accepted because casual sex is fun, and then discover that this means unpleasant viruses infect a larger proportion of the population, and if they’re suitably lethal the survivors will, by selection if not experience, be those who are less accepting of casual sex. Or we might want to avoid a crash now and so transfer wealth from good predictors to poor predictors, and then discover that this has weakened the incentive to predict well, leading to worse predictions overall and more crashes. Both of those are mostly cultural examples, and I suspect the genetic examples will suggest themselves.
That is, one of the ways that values drift is that the environmental change brought on by one period’s exertion of its morals may lead to the destruction of those morals in the next period. If you care about value preservation, this is one of the forces changing values that needs to be counteracted or controlled.
The traits of evolved organisms are usually a reasonable approximation of fitness-maximizing, because they’re the output of billions of years of a fitness-maximizing process; I’m with you on not being willing to call it “coincidence”. But this seems silly:
Your cognitive algorithms, executed in a human body, maximize its inclusive genetic fitness.
My cognitive algorithms are the results of a process which maximizes inclusive genetic fitness. This does not mean these algorithms themselves maximize that.
On the margin, a large fraction of the outputs of this process definitely don’t maximize fitness. Otherwise, there would be nothing for selection to eventually weed out!
My cognitive algorithms are the results of a process which maximizes inclusive genetic fitness. This does not mean these algorithms themselves maximize that.
I’m using “maximize” loosely. I agree with your observations.
Don’t be mean: she began with the postulate that intelligence is inexplicable because, firstly, she cannot explain it and, secondly, most of the purportedly competent professionals in psychology and philosophy of mind cannot explain it, either.
Possible, but I don’t think so. At least not literally, when “began” is taken chronologically. She came from a religious family, and was inculcated with a need for God long before she had any curiosity about intelligence.
Are you aware of the irony of you telling me not to be mean? I also found a comment by you on another website castigating someone for being condescending. You should make some token attempt to follow the behavior that you demand of others.
“Tu quoque” means claiming that an assertion is false because the person making the assertion doesn’t believe the assertion. In this case, for me to be making a Tu quoque fallacy, I would have to be arguing that one should be mean on LessWrong.
Are you aware of the irony of you telling me not to be mean?
Ah, but this is LessWrong: irony and manners are both disabled as a matter of cultural norm here, the better to emulate the cold, heartless robots we admire so deeply.
I don’t think that relates to anything I wrote.
First of all, let me say that I’ve been busy today and thus apologize for the sporadic character of my replies. Now, to begin with the most shocking and blunt statements...
What’s the problem? Were you expecting something other than humanity to come through in your model of humanity? Your phrasing signals that you are looking down on both sex and the enjoyment of food, and that you view them as aesthetically and/or morally inferior to… what? To “nonhuman bodies”? To intellectual pursuits?
Do you think intellectual pursuits will not also have their place in a well-learned model of human preferences? Are you trying to signal some attachment to the Spiral instinct/will-to-power/tsuyoku naritai principle? But even if you terminally value the expansion of your own causal or optimization power, there are other things you terminally value as well; it is unwise to throw away the rest of your humanity for power. You’ll be missing out.
To repeat one of my overly-repeated catch phrases: cynicism and detachment are not innately virtuous or wise. If what real, live human beings actually want, in the limit of increasing information and reflection, is to spend existence indulging tastes you happen to find gauche or déclassé, from where are you deriving some kind of divine-command-style moral authority to tell everyone, including yourself, to want things other than what we actually want?
What rational grounds can you have to say that a universe of pleasures—high and low—and ongoing personal development, and ongoing social development, and creativity, and emotionally significant choices to make, and genuine, engaging challenges to meet, and other people to do it all with (yes I am just listing Fun Theory Sequence entries because I can’t be bothered to be original at midnight)… is just not good enough for you if it requires learning a different way to conceptualize it all that turns out to correspond to your original psychological structure more than it corresponds to a realm of Platonic Forms, since there turned out not to be Platonic Forms?
Why do you feel guilty for not getting the approval of deities who don’t exist?
Or, and this is the neat bit, to create new kinds of nonhuman bodies, or nonbodily existence, that are more suited to what we value than our evolved human ones.
Simply put: why not?
Again: this is why we are trying to reduce the problem to cognitive algorithms, about which facts clearly exist, rather than leaving it at the level of “a theory is a collection of sentences written in first-order logic augmented with some primitive predicates”. The former is a scientific reality we can model and compute with, while the latter is a cancerous bunch of Platonist nonsense slowly killing the entire field of philosophy by metastasizing into whole fields and replacing actual reductionist rigor with the illusion of mathematical formalism.
(The above is, of course, a personal opinion, which you can tell because of the extreme vehemence. But holy shit do I hate Platonism and all its attendant fake rigor.)
Anyway, the rest I’ll have to answer in the morning, after a night’s sleep.
I am having difficulty seeing what you don’t understand about PhilGoetz’s point. You read like you’re reacting to overstatements on his part, but it looks to me like you’re reaching much further from reality than he is, or uncharitably interpreting his statements.
We can abstract from our values to principles, and so on, but what makes the difference between an instrumental value and a terminal value is that a terminal value is one that exists for its own sake. Inclusive genetic fitness does match that definition, because natural selection is a thing that slowly replaces things with lower inclusive genetic fitness with things with higher inclusive genetic fitness. This is what biologists mean by ‘maximize,’ and it’s different from what numerical optimization / math people mean by ‘maximize.’
Is it true that you are doing the most you can to maximize your inclusive genetic fitness (IGF)? No, you’re clearly suboptimal. But it is clearly true that your ancestors reproduced, and thus your genes are a product of the evolutionary project to gradually replace lower IGF with higher IGF, and in that sense you are doing more on average to increase your IGF than the counterfactual yous that do not exist because their ancestors failed to reproduce. That seems to be what PhilGoetz is arguing for on the object level (and he should correct me if that’s not the case.)
So now we take a step back to talk about values. When we look at possible values, we see lots of things that want to exist for their own sake (view how people talk about truth, justice, equality, and so on), but humans only seem to desire them because of their effects (view how people act about truth, justice, equality, and so on). It looks like people choose baskets of values and make tradeoffs between them- but in order to make tradeoffs between two instrumental values, there must be some terminal value that can look at options and say “option A is better than option B.”
It looks like the historical way this happens is that people have values, and some people reproduce more / spread their memes more, and this shifts the gene and meme (here, read as “value”) distributions. Broadly, it seems like genes and memes that are good for IGF are somewhat more popular than genes and memes that are bad for IGF, probably for the obvious reason.
That is, it looks like the universe judges value conflicts by existence. If there are more Particularists than Universalists, that seems to be because Particularists are out-existing the Universalists. To the extent that humans have pliable value systems, they seem to look around, decide what values will help them exist best, and then follow those values. (They also pick up their definition of “exist” from the environment around them, leading to quite a bit of freedom in how humans value things, though there seem to be limits on pliability.)
Moving forward, we seem to have some control over how the economic and military frontiers will change, and thus some control over what values will promote more or less existence. We probably want to exert that control in order to ensure the ‘right’ morality is favored.
But… if the practical determines the moral, and we want to decide what is practical using the moral, we now have a circular situation that it’s difficult to escape.
Our deeply held values are not “deeply held” in the sense that we can go meta and justify them to someone who doesn’t have them, but does share our meta-level value generating process. If we put a hypothetical twin you into a Comanche tribe to be raised, and then once he reached your current age you and he tried to come up with the list of human values and optimal arrangement of power, there would probably be significant disagreement. So PhilGoetz is pessimistic about a plan that looks at humans and comes up with the right values moving forward, because the system that determines those values is not a system we trust.
“Sakes” are mental concepts. Reality does not contain extra-mental sakes to exist for.
Again: since evolution does not have a mind, I don’t see how you could label inclusive genetic fitness as “terminal”. It is the criterion for which evolution optimizes, but that’s not nearly the same thing as a “terminal value” in any ethical or FAI sense.
(And, as I mentioned, it is very definitely not terminal, in the sense that it is a sub-optimizer for the Second Law of Thermodynamics.)
While your statement about my being more genetically “fit” than, say, the other sperm and egg cells that I killed off in the womb is entirely correct, that has basically nothing to do with the concept of “terminal values”, which are strictly a property of minds (and which evolution simply does not have).
Or a person must simply trade off their terminal values against each-other, with some weighting deciding the final total utility.
It seems to me like we need a word to use in Evaluative Cognitive Algorithms Theory other than “values”, since people like you and PhilGoetz are confusing “values” in the Evaluative Cognitive Algorithms Theory sense of the word with “values” in the sense of what a non-naturalist ethicist or a politician talks about.
If you are thinking in terms of promoting the “right” morality in an evolutionary sense, that the “right” morality is a program of which you “must” make copies, you are not using the term “morality” in the sense that Evaluative Cognitive Algorithms Theory people use it either.
(And certainly not in any sense that would invoke moral realism, but you don’t seem to have been claiming moral realism in the first place. On a side note, I think that trying to investigate what theories allow you to be “realist about X” is a useful tool for understanding what you mean by X, and morality is no exception.)
No we don’t. One optimizer can be stronger than another. For instance, at this point, humanity is stronger than evolution: we are rapidly destroying life on this planet, including ourselves, faster than anything can evolve to survive having our destructive attentions turned its way. Now personally I think that’s bloody-stupid, but it certainly shows that we are the ones setting the existence pressures now, we are the ones deciding precisely where the possible-but-counterfactual gives way to the actual.
And unfortunately, we need modal logic here. The practical does not determine what the set moral modality we already possess will output. That modality is already a fixed computational structure (unless you’re far more cultural-determinist than I consider reasonable).
I am confused by what you think a “meta-level value-generating process” is, or even could be, at least in the realms of ethics or psychology. Do you mean evolution when you say “meta-level value generating process”?
And additionally, why on Earth should we have to justify our you!”values” to someone who doesn’t have them? Seeking moral justification is itself an aspect of human psychology, so the average non-human mind would never expect any such thing.
There would be a significant difference in preference of lifestyles. Once we explained each-other to each-other, however, what we call “values” would be very, very close, and ways to arrange to share the world would be invented quite quickly.
(Of course, this may simply reflect that I put more belief-weight on bioterminism, whereas you place it on cultural determinism.)
Perhaps it would be clearer to discuss “exogenous” and “endogenous” values, as the relevant distinction between terminal and instrumental values are that terminal values are internally uncaused, and instrumental values are those pursued because they will directly or indirectly lead to an improvement in those terminal values, and this maps somewhat clearly onto exogenous and endogenous.
That is, of course this is a two-place word. IGF is exogenous to humans, but endogenous to evolution (and, as you put it, entropy is exogenous to evolution).
So my statement is that we have a collection of values and preferences that are moderately well-suited to our environment, because there is a process by which environments shape their inhabitants. As we grow more powerful, we shape our environment to be better suited to our values and preferences, because that is how humans embody preferences.
But we have two problems. First, our environment is still shaping our values and preferences, and thus the sort of world that we most want to live in might not be a world that would be mostly populated by us. Second, if we have any conflicts about preferences, typically we would go up a level to resolve those conflicts—but it is obvious that the level “above” us doesn’t have any desirable moral insights. So we can’t ground our conflict-resolution process in something moral instead of practical.
It seems to me that near-mode values are strongly biodetermined, but far-mode values are almost entirely culturally determined. Since most moral philosophy takes place in far mode, cultural determination is far more relevant. You and your Comanche twin might be equally anxious, say, but are probably anxious about very different things and have different coping strategies and so on.
I picked Comanche specifically because they were legendary raiders with a predatory morality.
I simply have to ask: so what? I place no particular terminal value on evolution itself. I see nothing wrong, neither aesthetically nor morally, with simply overriding evolution through human deeds, the better to create the kind of world that, indeed, we living humans most want to live in. Who cares how probable it was, a priori, that evolution should spawn our sort of people in our preferred sort of environment?
Well, I suppose you do, for some reason, but I’m really confused as to why.
Actually, I disagree: we usually just negotiate from a combination of heuristics for morally appropriate power relations (picture something Rawlsian, and there are complex but, IMHO, well-investigated sociological arguments for why a Rawlsian approach to power relations is a rational idea for the people involved) and morally inappropriate power relations (ie: compulsion and brute force).
I suppose you could call the former component “going up a level”, but I think it ultimately grounds itself in the Rawls-esque dynamics of creating, out of social creatures who share only a little personality and experience in common, a common society that improves life for all its members and maximizes the expected yield of individual efforts, particularly since many causally relevant attributes of individuals are high-entropy random variables and we therefore have to optimize expected values, blah blah blah. Human individuals do not enter into society because some kind of ontologically, metaphysically special Fundamental Particle of Morals collides with them and causes them to do so, but simply because people need other people to help each other out and to feel at all okay about being people: solidarity is a basic sociological force.
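To make the expected-value step less hand-wavy, here is a toy calculation in Python; the social arrangements and payoff numbers are invented purely for illustration. If the position you end up occupying is effectively a high-entropy random variable, you compare arrangements by what they deliver in expectation, or, in the more strictly Rawlsian maximin version, by their worst case.

```python
# Toy veil-of-ignorance comparison; arrangements and payoffs are made up for illustration.
arrangements = {
    "winner-take-all": [10.0, 1.0, 1.0, 1.0],  # payoff by social position
    "broadly shared": [5.0, 4.0, 3.0, 3.0],
}

for name, payoffs in arrangements.items():
    expected = sum(payoffs) / len(payoffs)  # position treated as uniform random
    worst = min(payoffs)                    # Rawlsian maximin criterion
    print(f"{name}: expected={expected:.2f}, worst case={worst:.2f}")
```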
As you can see above, I think the conflict-resolution process is the most practical part of the morals of human life.
Frankly, I think it is just an error on the part of most so-called moral philosophy that it is conducted largely in a cognitive mode governed by secondary ideas-about-ideas, beliefs-in-beliefs, and impressions-about-impressions, a realm almost entirely devoid of experiential data.
While I don’t think “Near Mode/Far Mode” is entirely a map that matches the psychological territory, insofar as we’re going to use it, I would consider Near Mode far more morally significant, precisely because it is informed directly by the actual experiences of the actual individuals involved. The social signals that convey “ideas” as we usually conceive of them in “Far Mode” actually have a tiny fraction of the bandwidth of raw sensory experience and conscious ideation, and as such should be weighted far more lightly by those of us looking to make our moral and aesthetic evaluations on data the same way we make factual evaluations on data.
The first rule of bounded rationality is that data and compute-power are scarce resources, and you should broadly assume that inferences based on more of each are very probably better than inferences in the same domain performed with less of each—and one of these days I’ll have the expertise to formalize that!
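I can’t formalize the whole claim yet, but one standard fragment of it is easy to exhibit: for something as simple as estimating a mean, the average error of the inference shrinks roughly like one over the square root of the amount of data. The sketch below (plain Monte Carlo in Python, with a distribution and sample sizes of my own arbitrary choosing) shows only that fragment, not the general bounded-rationality principle.

```python
import random
import statistics

# Illustrative fragment only: more data tends to give a better estimate of the
# same quantity. The distribution and sample sizes are arbitrary choices.
random.seed(0)
TRUE_MEAN = 3.0

def mean_abs_error(n_samples: int, n_trials: int = 2000) -> float:
    """Average absolute error of the sample mean over many repeated trials."""
    errors = []
    for _ in range(n_trials):
        data = [random.gauss(TRUE_MEAN, 1.0) for _ in range(n_samples)]
        errors.append(abs(statistics.mean(data) - TRUE_MEAN))
    return statistics.mean(errors)

for n in (10, 100, 1000):
    print(n, round(mean_abs_error(n), 4))  # error falls roughly like 1/sqrt(n)
```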
I don’t think I was clear enough. I’m not stating that it is value-wrong to alter the environment; indeed, that’s what values push people to do. I’m saying that while the direct effect is positive, the indirect effects can be negative. For example, we might want casual sex to be socially accepted because casual sex is fun, and then discover that this means unpleasant viruses infect a larger proportion of the population, and if they’re suitably lethal the survivors will, by selection if not experience, be those who are less accepting of casual sex. Or we might want to avoid a crash now and so transfer wealth from good predictors to poor predictors, and then discover that this has weakened the incentive to predict well, leading to worse predictions overall and more crashes. Both of those are mostly cultural examples, and I suspect the genetic examples will suggest themselves.
That is, one of the ways that values drift is that the environmental change brought on by the previous period’s exertion of its morals may lead to the destruction of those morals in the next period. If you care about value preservation, this is one of the forces changing values that needs to be counteracted or controlled.
The traits of evolved organisms are usually a reasonable approximation of fitness-maximizing, because they’re the output of billions of years of a fitness-maximizing process; I’m with you on not being willing to call it “coincidence”. But this seems silly:
My cognitive algorithms are the results of a process which maximizes inclusive genetic fitness. This does not mean these algorithms themselves maximize that.
On the margin, a large fraction of the outputs of this process definitely don’t maximize fitness. Otherwise, there would be nothing for selection to eventually weed out!
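One way to see the distinction is a toy selection loop in Python (a sketch with made-up numbers, not a model of real evolution): the loop scores candidates by a “fitness” criterion, but the surviving candidate’s own decision rule only tracks a proxy that happened to correlate with that criterion in the environment where selection took place.

```python
import random

random.seed(0)

# Toy sketch: each candidate "policy" is just a weight on how strongly it seeks sugar.
# In the ancestral toy environment, seeking sugar tracks calories, which tracks fitness;
# in the modern toy environment it does not.
def fitness(sugar_seeking: float, ancestral: bool) -> float:
    return sugar_seeking if ancestral else 1.0 - sugar_seeking

candidates = [random.random() for _ in range(10_000)]

# The selection *process* maximizes fitness in the ancestral environment...
survivor = max(candidates, key=lambda w: fitness(w, ancestral=True))

# ...but the selected *algorithm* still just seeks sugar, which is why its behavior
# stops maximizing the selection criterion once the environment changes.
print(round(fitness(survivor, ancestral=True), 3))   # high (close to 1)
print(round(fitness(survivor, ancestral=False), 3))  # low (close to 0)
```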
I’m using “maximize” loosely. I agree with your observations.
Possible, but I don’t think so. At least not literally, when “began” is taken chronologically. She came from a religious family, and was inculcated with a need for God long before she had any curiosity about intelligence.
Are you aware of the irony of you telling me not to be mean? I also found a comment by you on another website castigating someone for being condescending. You should make some token attempt to follow the behavior that you demand of others.
Tu quoque has always been my preferred way of making friends.
“Tu quoque” means dismissing an assertion as false because the person making it does not believe or act on it themselves. In this case, for me to be committing a tu quoque fallacy, I would have to be arguing that one should be mean on LessWrong.
Ah, but this is LessWrong: irony and manners are both disabled as a matter of cultural norm here, the better to emulate the cold, heartless robots we admire so deeply.