Your cognitive algorithms, executed in a human body, maximize its inclusive genetic fitness.
No, they don’t. They simply, straightforwardly don’t.
They maximize fitness the way any optimization method maximizes a complex function: unreliably, slowly, not always moving in the right direction. All that is required to say that something “maximizes” a function is that it generally increases its value. Perhaps “optimizes” would be a better word.
In some cases today, these heuristics no longer optimize fitness at all. As we all know. This is not a point worth dwelling on.
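To make that weak sense of “maximize” concrete: a noisy hill-climber over an arbitrary made-up landscape generally increases the function it is climbing, but slowly, unreliably, and sometimes moving in the wrong direction. A minimal sketch:

```python
import math
import random

def fitness(x):
    # An arbitrary bumpy landscape standing in for a "complex function".
    return -(x - 3.0) ** 2 + math.sin(10 * x)

def noisy_hill_climb(steps=10_000, step_size=0.2, error_rate=0.1):
    # Generally moves uphill, but with probability `error_rate` it accepts a
    # worse point, and it can stall on local optima. Over many steps it still
    # tends to increase fitness -- which is all "maximizes" means here.
    x = random.uniform(-10.0, 10.0)
    for _ in range(steps):
        candidate = x + random.gauss(0.0, step_size)
        if fitness(candidate) > fitness(x) or random.random() < error_rate:
            x = candidate
    return x, fitness(x)

print(noisy_hill_climb())
```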
There is no need for scare-quotes: just because your individuality does not correspond to an immortal, supernatural soul doesn’t mean it corresponds to nothing at all.
The quotes are there because resolving what “I” refers to is non-trivial, and the discussion here depends on it.
In which case it is the theory that requires correction, not the lower-level (lower-level in the Hierarchical Bayesian sense, closer to the evidence) belief that we have values.
I never said we don’t have values. I said human values aren’t terminal values. You need to make sure you understand that distinction before criticizing that part of my post.
Actually, the chief problem with FAI theory as written by Eliezer is that there simply isn’t much of it!
Agreed.
Our values are only instrumental from the point of view of evolution. That’s not an objective point of view:
Yes, it is. The terminal values are what the system is optimizing. What the system optimizes doesn’t depend on your perspective; it depends on what provides feedback and error-correction. Reproductive success is the feedback mechanism; increasing it is what the system develops to do. Everything above that level is variable, inconstant, and inconsistent; everything below it is not being optimized for.
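The same point holds in any toy selection loop: whatever quantity gates reproduction is the thing being optimized, and the heuristics upstream of it are shaped only through their effect on that quantity. A minimal sketch, with arbitrary numbers:

```python
import random

POP_SIZE = 100

def reproductive_success(heuristic):
    # The only feedback signal in the loop: expected offspring.
    # (Arbitrary mapping; any mapping from heuristic to offspring would do.)
    return max(0.01, 1.0 - abs(heuristic - 1.0))

def generation(population, mutation=0.05):
    # Selection never scores the heuristic directly; parents are drawn in
    # proportion to reproductive success, and that is the only error signal.
    weights = [reproductive_success(h) for h in population]
    parents = random.choices(population, weights=weights, k=POP_SIZE)
    return [p + random.gauss(0.0, mutation) for p in parents]

population = [random.uniform(-2.0, 2.0) for _ in range(POP_SIZE)]
for _ in range(200):
    population = generation(population)

# The population drifts toward whatever heuristic maximizes reproduction,
# regardless of what that heuristic "feels like" from the inside.
print(sum(population) / len(population))
```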
also that locating evolution as the least-caused optimizer is incorrect: entropy is the least-caused optimizer
See above. The feedback to the human system occurs at the level of reproductive fitness. What you just said implies that humans actually maximize entropy. Think about that for a few moments. I mean, technically, we do; everything does. But any intelligent analysis would notice that humans reduce entropy locally.
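The standard entropy bookkeeping makes the distinction explicit: an organism can lower its own entropy so long as it exports at least as much to its surroundings,

$$\Delta S_{\text{total}} = \Delta S_{\text{organism}} + \Delta S_{\text{environment}} \ge 0, \qquad \Delta S_{\text{organism}} < 0 \text{ allowed,}$$

so “maximizing entropy” in the global sense says nothing about what the organism does locally.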
When you stop caring how your goal/value/feeling got there, and only care about fulfilling it, you’ve found a terminal goal/value/feeling.
Try enumerating examples of terminal values. You’ll find they are contradictory; they change rapidly within individuals and within societies; they are not constants of human history; and they are very often things that one would think we would rather eliminate from society than build a big AI to guarantee we will have them with us forever. Perhaps more importantly, the “biases” that LessWrong was founded to eliminate are indistinguishable from those kinds of values. See “Human errors, human values”.
when the details at the lower-level of reality can be thrown out without altering the evaluative judgement, you’ve found something that is terminally relevant, and from which value/relevance/usefulness/utility flows backwards into other things during the probabilistic backwards-chaining process the human mind seems to use for planning.
First, not many values above the level of the genetic remain constant across time and across the Earth.
Second, that wouldn’t help with resolving conflicts between higher “instrumental” values. If you removed the instrumental values, leaving the low-level judgements, and used greater computational power to optimize them more accurately, the human would produce different outputs. Would the human then have been “debugged” because it produced outputs more in accordance with the “terminal” level? Why should low-level judgements like galvanic skin response have precedence over cognitive judgements? The things that you would list as “terminal values” would tend to be things we have in common with all mammals. “Human values” should include some values not also found in dogs and pigs. But evolution very often works by elaboration, and it would not be surprising if most or all of the “human” part of our values were in things layered on top of these “terminal” values.
Third, there is no way to distinguish values from mistakes/biases.
Fourth, there is probably no way to “extrapolate” values away from the organism. Your list of “terminal human values” would be full of statements like “Humans value sweet and salty tastes” and “Males value having their penises stroked.” This is not, I think, what is most important for us to pass on to the Universe a billion years from now. They will not apply to non-human bodies. Any attempt by an AI to enforce these values would seem to require keeping the standard human body for the rest of the life of the Universe.
just because your human concepts do not correspond to objects at the most ontologically basic and causally early levels of reality does not mean they fail to correspond to anything
I don’t think that relates to anything I wrote.
First of all, let me say that I’ve been busy today and thus apologize for the sporadic character of my replies. Now, to begin with the most shocking and blunt statements...
Fourth, there is probably no way to “extrapolate” values away from the organism. Your list of “terminal human values” would be full of statements like “Humans value sweet and salty tastes” and “Males value having their penises stroked.”
What’s the problem? Were you expecting something other than humanity to come through in your model of humanity? Your phrasing signals that you are looking down on both sex and the enjoyment of food, and that you view them as aesthetically and/or morally inferior to… what? To “nonhuman bodies”? To intellectual pursuits?
Do you think intellectual pursuits will not also have their place in a well-learned model of human preferences? Are you trying to signal some attachment to the Spiral instinct/will-to-power/tsuyoku naritai principle? But even if you terminally value the expansion of your own causal or optimization power, there are other things you terminally value as well; it is unwise to throw away the rest of your humanity for power. You’ll be missing out.
To repeat one of my overly-repeated catch phrases: cynicism and detachment are not innately virtuous or wise. If what real, live human beings actually want, in the limit of increasing information and reflection, is to spend existence indulging tastes you happen to find gauche or déclassé, from where are you deriving some kind of divine-command-style moral authority to tell everyone, including yourself, to want things other than what we actually want?
What rational grounds can you have to say that a universe of pleasures—high and low—and ongoing personal development, and ongoing social development, and creativity, and emotionally significant choices to make, and genuine, engaging challenges to meet, and other people to do it all with (yes I am just listing Fun Theory Sequence entries because I can’t be bothered to be original at midnight)… is just not good enough for you if it requires learning a different way to conceptualize it all that turns out to correspond to your original psychological structure more than it corresponds to a realm of Platonic Forms, since there turned out not to be Platonic Forms?
Why do you feel guilty for not getting the approval of deities who don’t exist?
Any attempt by an AI to enforce these values would seem to require keeping the standard human body for the rest of the life of the Universe.
Or, and this is the neat bit, to create new kinds of nonhuman bodies, or nonbodily existence, that are more suited to what we value than our evolved human ones.
This is not, I think, what is most important for us to pass on to the Universe a billion years from now.
Simply put: why not?
Try enumerating examples of terminal values. You’ll find they are contradictory; they change rapidly within individuals and within societies; they are not constants of human history; and they are very often things that one would think we would rather eliminate from society than build a big AI to guarantee we will have them with us forever.
Again: this is why we are trying to reduce the problem to cognitive algorithms, about which facts clearly exist, rather than leaving it at the level of “a theory is a collection of sentences written in first-order logic augmented with some primitive predicates”. The former is a scientific reality we can model and compute with, while the latter is a cancerous bunch of Platonist nonsense slowly killing the entire field of philosophy by metastasizing into whole fields and replacing actual reductionist rigor with the illusion of mathematical formalism.
(The above is, of course, a personal opinion, which you can tell because of the extreme vehemence. But holy shit do I hate Platonism and all its attendant fake rigor.)
Anyway, the rest I’ll have to answer in the morning, after a night’s sleep.