As you mention, so far every attempt by humans to construct a self-consistent value system (a process also known as decompartmentalization) has resulted in less-than-desirable outcomes. What if the end goal of having a thriving, long-lasting (super-)human(-like) society is self-contradictory, and there is no such thing as both “nice” and “self-referentially stable”? Maybe some effort should be put into figuring out how to live, and thrive, while managing the unstable self-reference, possibly avoiding convergence altogether.
A thought I’ve been thinking about lately, derived from a reinforcement-learning view of values and also somewhat inspired by Nate’s recent post on resting in motion: value convergence seems to suggest a static endpoint, some set of “ultimate values” that we’ll eventually reach and then hold ever after. But so far no society has ever reached such a point, and if our values are an adaptation to our environment (including the society and culture we live in), then it would suggest that as long as we keep evolving and developing, our values will keep changing and evolving with us, without there being any meaningful endpoint.
There will always (given our current understanding of physics) be only a finite amount of resources available, and unless we either all merge into one enormous hivemind or get turned into paperclips, there will likely be various agents with differing preferences on what exactly to do with those resources. As the population keeps changing and evolving, the various agents will keep acquiring new kinds of values, and society will keep rearranging itself to a new compromise between all those different values. (See: the whole history of the human species so far.)
Possibly we shouldn’t so much try to figure out what we’d prefer the final state to look like, but rather what we’d prefer the overall process to look like.
(The bias towards trying to figure out a convergent end result for morality might have come from LW’s historical tendency to talk and think in terms of utility functions, which implicitly assume a static, unchanging set of preferences, glossing over the fact that human preferences are constantly changing.)
Possibly we shouldn’t so much try to figure out what we’d prefer the final state to look like, but rather what we’d prefer the overall process to look like.
Well, the general Good Idea in that model is that events or actions shouldn’t be optimized to drift faster or more discontinuously than people’s valuations of those events do, so that the society existing at any given time is more or less getting what it wants while also evolving towards something else.
Of course, a compromise between the different “values” (scare-quotes because I don’t think the moral-philosophy usage of the word points at anything real) of society’s citizens is still a vast improvement on “a few people dominate everyone else and impose their own desires by force and indoctrination”, which is what we still have to a great extent.
This sounds like Robin Hanson’s idea of the future. Eliezer would probably agree that in theory this would happen, except that he expects a single superintelligent AI to take over everything and impose its values on the entire future. If Eliezer’s scenario is what actually happens, then even if there is no truly ideal set of values, we would still have to make sure that the values that get imposed on everything are at least somewhat acceptable.
This. Values evolve, like everything else. Evolution will continue in the posthuman era.
Evolution requires selection pressure. The failures have to die out. What will provide the selection pressure in the posthuman era?
“Evolve” has (at least) two meanings. One is the Darwinian one where heritable variation and selection lead to (typically) ever-better-adapted entities. But “evolve” can also just mean “vary gradually”. It could be that values aren’t (or wouldn’t be, in a posthuman era) subject to anything much like biological evolution; but they still might vary. (In biological terms, I suppose that would be neutral drift.)
Well, we are talking about the Darwinian meaning, aren’t we? “Vary gradually”, a.k.a. “drift”, is not contentious at all.
I’m not sure we are talking specifically about the Darwinian meaning, actually. Well, I guess you are, given your comment above! But I don’t think the rest of the discussion was so specific. Kaj_Sotala said:
if our values are an adaptation to our environment (including the society and culture we live in), then it would suggest that as long as we keep evolving and developing, our values will keep changing and evolving with us, without there being any meaningful endpoint.
which seems to me to describe a situation of gradual change in our values that doesn’t need to be driven by anything much like biological evolution. (E.g., it could happen because each generation’s people constantly make small more-or-less-deliberate adjustments in their values to suit the environment they find themselves in.)
(Kaj’s comment does actually describe a resource-constrained situation, but the resource constraints aren’t directly driving the evolution of values he describes.)
We’re descending into nit-pickery. The question of whether values will change in the future is a silly one, as the answer “Yes” is obvious. The question of whether values will evolve in the Darwinian sense in the posthuman era (with its presumed lack of scarcity, etc.) is considerably more interesting.
I agree that it’s more interesting. But I’m not sure it was the question actually under discussion.
I’m not sure that’s true. Imagine some glorious postbiological future in which people (or animals or ideas or whatever) can reproduce without limit. There are two competing replicators A and B, and the only difference is that A replicates slightly faster than B. After a while there will be vastly more of A around than of B, even if nothing dies. For many purposes, that might be enough.
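For concreteness, here is a minimal sketch of that scenario (the growth rates are my own illustrative assumptions, not anything from the comment): two replicators that never die, with A replicating only slightly faster than B, still end up with A utterly dominating the population’s composition.

    # Toy model: nothing ever dies; A grows 5% per step, B grows 4% per step
    # (illustrative numbers only).
    a, b = 1.0, 1.0
    for _ in range(1000):
        a *= 1.05  # A replicates slightly faster
        b *= 1.04  # B replicates slightly slower
    print(a / (a + b))  # fraction of the population that is A; roughly 0.9999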
So, in this scenario, what evolved?
The distribution of A and B in the population.
I don’t think this is an appropriate use of the word “evolution”.
Why not? It’s a standard one in the biological context. E.g.,
In fact, evolution can be precisely defined as any change in the frequency of alleles within a gene pool from one generation to the next.
which according to a talk.origins FAQ is from this textbook: Helena Curtis and N. Sue Barnes, Biology, 5th ed., 1989, Worth Publishers, p. 974.
Economics. Posthumans still require mass/energy to store/compute their thoughts.
If there are mistakes made or the environment requires adaptation, a sufficiently flexible intelligence can mediate the selection pressure.
The end result still has to be for the failures to die or be castrated.
There is no problem with saying that values in the future will “change” or “drift”, but “evolve” is more specific and I’m not sure how it will work.
Memetic evolution, not genetic.
I understand that. Memes can die or be castrated, too :-/
In your earlier comment you said “evolution requires selection pressure”. There is of course selection pressure in memetic evolution. The idea of completely eliminating memetic selection pressure is not even wrong, because memetic selection is closely connected to learning and knowledge creation. You can’t get rid of it.
Godsdammit, people, “thrive” is the whole problem.
Yes, yes it is. Even once you can order all the central examples of thriving, the “mere addition” operation will tip them toward the noncentral, repugnant ones. Hence one might have to live with the lack of self-consistency.
You could just not be a utilitarian; in particular, you could decline to maximize a metaphysical quantity like “happy experience”. That leaves you with no moral obligations to counterfactual (i.e. nonexistent) people, which eliminates the Mere Addition Paradox.
OK, I know that, given the chemistry involved in “happy”, it’s not exactly a metaphysical or non-natural quantity, but it bugs me that utilitarianism says to “maximize Happy” even when, precisely as in the Mere Addition Paradox, no individual consciousness will actually experience the magnitude of Happy attained via utilitarian policies. How can a numerical measure of a subjective state of consciousness be valuable if nobody experiences the total numerical measure? It seems more sensible to restrict your moralizing to people who already exist, thus winding up closer to post-hoc consequentialism than to traditional utilitarianism.
How can a numerical measure of a subjective state of consciousness be valuable if nobody experiences the total numerical measure?
The mere addition paradox also manifests for a single person. Imagine the state you are in. Now imagine that it can be (subjectively) improved by some means (e.g. fame, company, drugs, …). Keep going. Odds are, you would not find a maximum, not even a local one. After a while you might notice that, despite the incremental improvements, the state you are in is actually inferior to the original if you compare them directly. Mathematically, one might model this as the improvement drive being non-conservative, so that no scalar utility function over states exists. Whether it is worth pushing this analogy any further, I am not sure.
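One way to make the “non-conservative” intuition precise (my formalization, not anything stated in the comment): if the chain of subjective improvements ever loops back on itself, no real-valued utility can represent it. In LaTeX:

    s_1 \prec s_2 \prec \dots \prec s_n \prec s_1
    \;\Longrightarrow\;
    u(s_1) < u(s_2) < \dots < u(s_n) < u(s_1),

a contradiction, so no function u satisfying s \prec t \Rightarrow u(s) < u(t) can exist. This is the discrete analogue of a non-conservative force field admitting no potential.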
The mere addition paradox also manifests for a single person. Imagine the state you are in. Now imagine that it can be (subjectively) improved by some means (e.g. fame, company, drugs, …). Keep going. Odds are, you would not find a maximum, not even a local one.
Hill climbing always finds a local maximum, but that maximum might well look disappointing, wasteful of effort, and downright stupid compared to spending the same effort on some smarter way of finding a better life.
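As a toy illustration of both points (the landscape and numbers are my own assumptions, nothing from the thread): greedy hill climbing does halt at a local maximum, but that maximum can sit far below the global one.

    def hill_climb(f, x, neighbors):
        """Greedy ascent: move to the best improving neighbor until none exists."""
        while True:
            better = [n for n in neighbors(x) if f(n) > f(x)]
            if not better:
                return x  # local maximum: no neighbor improves f
            x = max(better, key=f)

    # Toy landscape over the integers: a small bump peaking at x = 3 (height 5)
    # and a much taller peak at x = 50 (height 100).
    f = lambda x: 5 - (x - 3) ** 2 if x < 10 else 100 - (x - 50) ** 2
    neighbors = lambda x: [x - 1, x + 1]

    print(hill_climb(f, 0, neighbors))  # prints 3: stuck on the small bump, never reaches 50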