The term ‘consciousness’ captures the fact that while we still don’t know exactly what the Magic Token of Moral Worth is, we know it’s a mental feature possessed by humans. This distinguishes it from, say, a Euthyphro-type moral theory in which the Magic Token is a bit set by god, is epiphenomenal, and is only detectable because god gives us a table of which things he set the bit on.
I am suspicious of this normative sense of ‘consciousness’. I think it’s basically a mistake of false reduction to suppose that moral worth is monotonically increasing in descriptive-sense-of-the-word consciousness. This monotonicity seems to be a premise on which the normative sense of the word ‘consciousness’ is based. In fact, even the metapremise that ‘moral worth’ is a thing seems like a fake reduction. At a high level, the idea of consciousness as a measure of moral worth looks really, really strongly like a fake utility function.
A specific example: A superintelligent (super?)conscious paperclip maximizer is five light-minutes away from Earth. Omega has given you a button that you can press which will instantly destroy the paperclip maximizer. If you do not press it within five minutes, then the paperclip maximizer shall paperclip Earth.
I would destroy the paperclip maximizer without any remorse. Just like I would destroy Skynet without remorse. (The Terminator: Salvation Skynet, at least, seems not only to be smart but also to have developed feelings, so it is probably conscious.)
I could go on about why consciousness as moral worth (or even the idea of moral worth in the first place) seems massively confused, but I intend to do that eventually as a post or Sequence (Why I Am Not An Ethical Vegetarian), so I shall hold off for now on the assumption that you get my general point.
Blatant because-I-felt-like-it speculation: “ethics” is really game theory for agents who share some of their values.
That’s about the size of it. I’m starting to think I should just pay you to write this sequence for me. :P
Pretty much. Start with the prior that everyone is a potential future ally, and has just enough information about your plans to cause serious trouble if you give them a reason to (such as those plans being bad for their own interests), and a set of behaviors known colloquially as “not being a dick” follows as the logical result.
I like this description.
I prefer the descriptions from your previous speculations on the subject.
The “agents with shared values” angle is interesting, and likely worth isolating as a distinct concept. But agents with shared values don’t seem either sufficient or necessary for much of what we refer to as ethics.
Now if only we had actual maths for it.
This description bothers me, because it pattern-matches to bad reductionisms, which tend to have the form:
X (which is hard to understand) is really just Y (which we already understand).
A stock criticism of things reduced in this way is this:
If we understand Y so well, why are we still in the dark about X?
So, if ethics is just game theory between agents who share values (which reads to me as ‘ethics is game theory’), then why doesn’t game theory produce really good answers to otherwise really hard ethical questions? Or does it, and I just haven’t noticed? Or am I overestimating how much we understand game theory?
http://pnas.org/content/early/2013/08/28/1306246110
Game theory has been applied to some problems related to morality. In a strict sense we cannot prove such conclusions, because universal laws are uncertain.
Well as I said: we don’t have maths for this so-called reduction, so its trustworthiness is questionable. We know about game theory, but I don’t know of a game-theoretic formalism allowing for agents to win something other than generic “dollars” or “points”, such that we can encode in the formalism that agents share some values but not others, and have tradeoffs among their different values.
I suspect this isn’t the main obstacle to reducing ethics to game theory. Once I’m willing to represent agents’ preferences with utility functions in the first place, I can operationalize “agents share some values” as some features of the world contributing positively to the utility functions of multiple agents, while an agent having “tradeoffs among their different values” is encoded in the same way as any other tradeoff they face between two things — as a ratio of marginal utilities arising from a marginal change in either of the two things.
Well yes, of course. It’s the “share some values but not others” part that’s currently not formalized: in current game theory, agents are (to my knowledge) only paid in “money”, a single scalar dimension measuring utility as a function of the agent’s experiences of game outcomes (rather than as a function of states of the game, construed as an external world the agent cares about).
So yeah.
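To make the contrast concrete, here is a minimal sketch of the kind of formalization I have in mind (the notation is my own illustration, not an established formalism): let outcomes be states of the world $x$ described by features $f_k(x)$, and give each agent $i$ a utility that weighs those features directly, rather than a scalar payout of generic points:

$$U_i(x) = \sum_k w_{ik}\, f_k(x).$$

Agents $i$ and $j$ share the value $f_k$ when $w_{ik}$ and $w_{jk}$ are both nonzero (and of the same sign), and fail to share it when only one of them is; agent $i$’s tradeoff between two of its own values $f_a$ and $f_b$ is the marginal ratio $w_{ia}/w_{ib}$, which is just the “ratio of marginal utilities” from the comment above, written out for the linear case.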
A useful concept here (which I picked up from a pro player of Magic: The Gathering, but exists in many other environments) is “board state.” A lot of the research I’ve seen in game theory deals with very simple games, only a handful of decision-points followed by a payout. How much research has there been about games where there are variables (like capital investments, or troop positions, or land which can be sown with different plants or left fallow), which can be manipulated by the players and whose values affect the relative payoffs of different strategies?
Altruism can be more than just directly aiding someone you personally like; there’s also manipulating the environment to favor your preferred strategy in the long term, which costs you resources in the short term but benefits everyone who uses the same strategy as you, including your natural allies.
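To illustrate what I mean by a board state, here is a toy example of my own (the numbers and the harvest/invest rules are made up for illustration, not drawn from any standard model): a repeated game in which each player either harvests from a shared stock or invests in it, and the stock carries over between rounds and scales everyone’s harvests. The state variable, not any single round’s payoff table, is what the strategies are really competing over.

```python
# Toy "board state" game: a shared stock that players can harvest from or grow.
# All numbers here are illustrative assumptions, not a calibrated model.

def play(strategies, rounds=10, stock=10.0):
    """Each round every player either harvests (payoff now, scaled by the
    current stock) or invests (no payoff now, but the stock grows for everyone)."""
    payoffs = [0.0] * len(strategies)
    for t in range(rounds):
        moves = [strategy(t, stock) for strategy in strategies]
        for i, move in enumerate(moves):
            if move == "harvest":
                payoffs[i] += 0.1 * stock   # payoff depends on the board state
        harvesters = sum(move == "harvest" for move in moves)
        investors = len(moves) - harvesters
        stock = max(stock - 0.1 * stock * harvesters + 1.0 * investors, 0.0)
    return payoffs, stock

def always_harvest(t, stock):
    return "harvest"

def invest_early(t, stock):
    return "invest" if t < 5 else "harvest"

print(play([always_harvest, always_harvest]))  # roughly 4.5 each, stock run down
print(play([invest_early, invest_early]))      # roughly 6.7 each, stock still healthy
```

The investment rounds are exactly the short-term cost that reshapes the environment to favor a strategy, and they benefit anyone who later plays that strategy, allies included.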
If ethics is game-theoretic, it is not so to an extent where we could calculate exact outcomes.
It may still be game-theoretic in some fuzzy or intractable way.
The claim that ethics is game-theoretic could therefore be a philosophy-grade truth even if it is not a science-grade truth.
Honestly, it would just be much better to open up “shared-value game theory” as a formal subject and then see how well that elaborated field actually matches our normal conceptions of ethics.
Why assume some values have to be shared? If decision-theoretic ethics can be made to work without shared values, that would be interesting.
And decision-theoretic ethics is already extant.
Largely because, in my opinion, it explains the real world much, much better than a “selfish” game theory.
Using selfish game theories, “generous” or “altruistic” strategies can evolve to dominate in iterated games and evolved populations (there’s a link somewhere upthread to the paper). You’re then still left with the question: if they do, why did evolution build us to place fundamental emotional and normative value on conforming to what any rational selfish agent will figure out?
Using theories in which agents share some of their values, “generous” or “altruistic” strategies become the natural, obvious result: shared values are nonrivalrous in the first place. Evolution builds us to feel Good and Moral about creatures who share our values because that’s a sign they probably have similar genes (though I just made that up now, so it’s probably totally wrong) (also, because nothing had time to evolve to fake human moral behavior, so the kin-signal remained reasonably strong).
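As a minimal sketch of that contrast (the payoff matrix and the sharing weights below are my own illustrative assumptions, not taken from the linked paper): take a one-shot Prisoner’s Dilemma and compare a purely selfish agent with one whose effective utility also counts a fraction of the other player’s payoff, standing in for a partially shared value. With enough sharing, cooperation stops being a dilemma at all, with no iteration or reputation machinery required.

```python
# Illustrative sketch: partially shared values in a one-shot Prisoner's Dilemma.
# The payoffs and sharing weights are assumptions chosen for illustration only.

PD = {  # (my_move, their_move) -> (my_payoff, their_payoff)
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def effective_utility(my_move, their_move, sharing):
    """Own payoff plus a weighted share of the other agent's payoff."""
    mine, theirs = PD[(my_move, their_move)]
    return mine + sharing * theirs

def best_reply(their_move, sharing):
    """The move that maximizes effective utility against a fixed opponent move."""
    return max(("C", "D"), key=lambda m: effective_utility(m, their_move, sharing))

for sharing in (0.0, 0.5, 1.0):
    print(sharing, best_reply("C", sharing), best_reply("D", sharing))
# sharing=0.0: defect against everything -- the classic dilemma.
# sharing=1.0: cooperate against everything -- the "generosity" needs no
# iteration, reputation, or punishment once the values overlap enough.
```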
Because we’re adaptation executors, not fitness maximizers. Evolution gets us to do useful things by having us derive emotional value directly from doing those things, not by introducing the extra indirect step of moulding us into rational calculators who first have to consciously compute what’s most useful.
why did evolution build us to place fundamental emotional and normative value on conforming to what any rational selfish agent will figure out?
If you’re running some calculation involving a lot of logarithms, and portable electronics haven’t been invented yet, would you rather take a week to derive the exact answer with an abacus, and another three weeks hunting down a boneheaded sign error, or ten seconds for the first two or three decimal places on a slide rule?
Rational selfishness is expensive to set up, expensive to run, and can break down catastrophically at the worst possible times. Evolution tends to prefer error-tolerant systems.
Isn’t that what is usually known as “trade”?
Could agents who share no values recognize each other as agents? I may just be unimaginative, but it occurs to me that my imagining an agent just is my imagining it as having (at least some of) the same values as me. I’m not sure how to move forward on this question.
I don’t follow your example.
Are you taking the Clippie to be conscious?
Are you taking the Clippie’s consciousness to imply a deontological rule not to destroy it?
Are you taking the Clippie’s level of consciousness to be so huge that it implies a utilitarian weighting in its favour?
The comment to which you’re replying can be seen as providing a counterexample to the principle that goodness or utility is monotonically increasing in consciousness or in conscious beings. It is also a refutation of, as you mention, any deontological rule that might forbid destroying it.
The counterexample I’m proposing is that one should destroy a paperclip maximiser, even if it’s conscious, even though doing so will reduce the sum total of consciousness; goodness is outright increased by destroying it. (This holds even if we don’t suppose that the paperclipper is more conscious than a human; we need only for it to be at all conscious.)
(I suspect that some people who worry about utility monsters might just claim they really would lie down and die. Such a response feels like it would be circular, but I couldn’t immediately pin down rigorously why it would be.)
I am asking HOW it is a counterexample. As far as I can see, you would have to make an assumption about how consciousness relates to morality specifically, as in my second and third questions.
For instance, suppose “conscious beings are morally relevant” just means “don’t kill conscious beings without good reason”.
I think I get what you’re saying, but I’m not sure I agree. If the paperclip maximizer worked by simulating trillions of human-like agents doing fulfilling intellectual tasks, I’d be very sad to press the button. If I were convinced that pressing the button would result in less agent-eudaimonia-time over the universe’s course, I wouldn’t press it at all.
...so I’m probably a pretty ideal target audience for your post/sequence. Looking forward to it!
This is nuking the hypothetical. For any action that someone claims to be a good idea, one can specify a world where taking that action causes some terrible outcome.
If the paperclip maximizer worked by simulating trillions of human-like agents doing fulfilling intellectual tasks, I’d be very sad to press the button.
If you would be sad because and only because it were simulating humans (rather than because the paperclipper were conscious), my point goes through.
Ta!