I think this is the clearest case where our moral theories differ. If the paperclipper suffers, I don’t see any reason not to care about that experience. Or, rather, I don’t fully understand why you lack care for the paperclipper.
If the paper-clipper even can “suffer” … I suspect a more useful word to describe the state of the paperclipper is “unclippy”. Or maybe not...let’s not think about these labels for now. The question is, regardless of the label, what is the underlying morally relevant feature?
I would hazard a guess that many of the supercomputers running our Google searches, calculating best-fit molecular models, etc… have enough processing power to simulate a fish that behaves exactly like a real fish. If one wished, one could model these as agents with preference functions. But it doesn’t mean anything to “torture” a Google search algorithm, whereas it does mean something to torture a fish, or to torture a simulation of a fish.
You could model something as simple as a light switch as an agent with a preference function, but it would be a waste of time. In the case of an algorithm which finds solutions in a search space, it is actually useful to model it as an agent who prefers to maximize some elements of a solution, as this allows you to predict its behavior without knowing the details of how it works. But, just as with the light switch, the fact that you are modeling it as an agent doesn’t mean you have to respect its preferences.
A “rational agent” explores the search space of possible actions it can take, and chooses the actions which maximize its preferences—the “correct solution” is the one in which all preferences are maximized. An agent is fully rational if it makes the best possible choice given the data at hand. There are no fully rational agents, but it’s useful to model things which act approximately in this way as agents.
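To make the “model anything as an agent” move concrete, here is a minimal sketch in Python (the function names and the light-switch example are invented for illustration, not anything proposed in this thread): you supply a preference function over outcomes, and then predict that the system takes whichever available action scores highest.

    # Toy sketch: treat an arbitrary system as a preference-maximizing agent.
    # The names and the light-switch example are invented for illustration.

    def choose(actions, outcome_of, preference):
        """Predict the action whose outcome the modeled 'agent' prefers most."""
        return max(actions, key=lambda action: preference(outcome_of(action)))

    # A light switch "prefers" that current flows exactly when it is set to "on".
    def switch_outcome(action):
        return action  # the outcome is simply whether current is conducted or blocked

    def switch_preference(outcome, position="on"):
        wants_current = (position == "on")
        return 1 if (outcome == "conduct") == wants_current else 0

    print(choose(["conduct", "block"], switch_outcome, switch_preference))  # -> conduct

The same argmax skeleton fits a search engine (preference = relevance score) or a paperclipper (preference = number of paperclips); only the content of the preference function changes.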
Paperclippers, molecular modelers, and search engines seek to maximize a simple set of preferences (number of paperclips, best-fit model, best search results). They have “preferences”, but not morally relevant ones.
A human (or, hopefully one day a friendly AI) seeks to fulfill an extremely complex set of preferences...as does a fish. They have preferences which carry moral weight.
It’s not specific receptors or any particular algorithm that captures what is morally relevant to me about other agents’ preferences. If you took a human and replaced its brain with a search algorithm which found the motor output solutions which maximized the original human’s preferences, I’d consider this search algorithm to fit the definition of a person (though not necessarily the same person). I’d respect the search algorithm’s preferences the same way I respected the preferences of the human it replaced. This new sort of person might instrumentally prefer not having its arms chopped off, or terminally prefer that you not read its diary, but it might not show any signs of pain when you did these things unless showing signs of pain was instrumentally valuable. Violation of this being’s preferences may or may not be called “suffering” depending on how you define “suffering”...but either way, I think this being’s preferences are just as morally relevant as a human’s.
So the question I would turn back to you is...under what conditions could a paperclipper suffer? Do all paperclippers suffer? What does this mean for other sorts of solution-maximizing algorithms, like search engines and molecular modelers?
My case is essentially that the morally relevant component, the thing that determines whether or not we should respect an agent’s preferences, lies somewhere in the composition of its preference function. The specific nature of the algorithm it uses to carry this preference function out—like whether it involves pain receptors or something—is not morally relevant.
Just as a data point about intuition frequency, I found your intuitions about “a search algorithm which found the motor output solutions which maximized the original human’s preferences” to be very surprising.
Do you mean that the idea itself is weird and surprising to consider?
Or do you mean that my intuition that this search algorithm fits the definition of a “person” and is imbued with moral weight is surprising and does not match your moral intuition?
Thanks for the well-thought-out comment. It helps me think through the issue of suffering a lot more.
~
If you took a human and replaced its brain with a search algorithm which found the motor output solutions which maximized the original human’s preferences, I’d consider this search algorithm to fit the definition of a person (though not necessarily the same person). [...] Violation of this being’s preferences may or may not be called “suffering” depending on how you define “suffering”...but either way, I think this being’s preferences are just as morally relevant as a human’s. [...]
The question is, regardless of the label, what is the underlying morally relevant feature?
I think this is a good thought experiment and it does push me more toward preference satisfaction theories of well-being, which I have long been sympathetic to. I still don’t know much myself about what I view as suffering. I’d like to read and think more on the issue—I have bookmarked some of Brian Tomasik’s essays to read (he’s become more preference-focused recently) as well as an interview with Peter Singer where he explains why he’s abandoned preference utilitarianism for something else. So I’m not sure I can answer your question yet.
There are interesting problems with desires, such as formalizing them (what is a desire, and what makes a desire stronger or weaker, etc.), population ethics (do we care about creating new beings with preferences, etc.), and others that we would have to deal with as well.
~
Paperclippers, molecular modelers, and search engines seek to maximize a simple set of preferences (number of paperclips, best-fit model, best search results). They have “preferences”, but not morally relevant ones. A human (or, hopefully one day a friendly AI) seeks to fulfill an extremely complex set of preferences...as does a fish. They have preferences which carry moral weight.
So it seems like, to you, an entity’s welfare matters when it has preferences, weighted based on the complexity of those preferences, with a certain zero threshold somewhere (so thermostat preferences don’t count).
I don’t think complexity is the key driver for me, but I can’t tell you what is.
~
I haven’t seen any behavioral evidence of fish doing problem solving, being empathetic towards each other, exhibiting cognitive capacities beyond very basic associative learning & memory, or that sort of thing.
Likewise, I don’t think this is much of a concern for me, and it seems inconsistent with the rest of what you’ve been saying.
Why are problem solving and empathy important? Surely I could imagine a non-empathetic program without the ability to solve most problems, that still has the kind of robust preferences you’ve been talking about.
And what level of empathy and problem solving are you looking for? Notably, fish engage in cleaning symbiosis (which seems to be in the lower-tier of the empathy skill tree) and Wikipedia seems to indicate (though perhaps unreliably) that fish have pretty good learning capabilities.
~
I don’t think so, but I might be wrong...Is risk aversion in the face of uncertainty actually rational in this scenario? Seems to me that there are certain scenarios where risk aversion makes sense (personal finance, for example) and scenarios where it doesn’t (effective altruism, for example) and this decision seems to fall in the latter camp.
an entity’s welfare matters when it has preferences, weighted based on the complexity of those preferences
No, it’s not complexity but the content of the preferences that makes the difference. Sorry for mentioning the complexity—I didn’t mean to imply that it was the morally relevant feature.
I’m not yet sure what sort of preferences give an agent morally weighty status...the only thing I’m pretty sure about is that the morally relevant component is contained somewhere within the preferences, with intelligence as a possible mediating or enabling factor.
Here’s one pattern I think I’ve identified:
I belong within Reference Class X.
All beings in Reference Class X care about other beings in Reference Class X, when you extrapolate their volition.
When I hear about altruistic mice, it is evidence that the mouse’s extrapolated volition would cause it to care about Class X beings’ preferences to the extent that it can comprehend them. The cross-species altruism of dogs and dolphins and elephants is an especially strong indicator of Class X membership.
On the other hand, the within-colony altruism of bees (basically identical to Reference Class X except it only applies to members of the colony and I do not belong in it), or the swarms and symbiosis of fishes or bacterial gut flora, wouldn’t count...being in Reference Class X is clearly not the factor behind the altruism in those cases.
...which sounds awfully like reciprocal altruism in practice, doesn’t it? Except that, rather than looking at the actual act of reciprocation of altruism, I’d be extrapolating the agent’s preferences for altruism. Perhaps Class X would be better named “Friendly”, in the “Friendly AI” sense—all beings within the class are to some extent Friendly towards each other.
This is at the rough edge of my thinking though—the ideas as just stated are experimental and I don’t have well defined notions about which preferences matter yet.
Edit: Another (very poorly thought out) trend which seems to emerge is that agents which have a certain sort of awareness are entitled to a sort of bodily autonomy … because it seems immoral to sit around torturing insects if one has no instrumental reason to do so. (But is it immoral in the sense that there are a certain number of insects which morally outweigh a human? Or is it immoral in a virtue-ethics-y, “this behavior signals sadism” sort of way?)
My main point is that I’m mildly guessing that it’s probably safe to narrow down the problem to some combination of preference functions and level of awareness. In any case, I’m almost certain that there exist preference functions that are sufficient (but maybe not necessary?) to confer moral weight onto an agent...and though there may be other factors unrelated to preference or intelligence that play a role, the preference function is the only thing with a concrete definition that I’ve identified so far.
...which sounds awfully like reciprocal altruism in practice, doesn’t it? Except that, rather than looking at the actual act of reciprocation of altruism, I’d be extrapolating the agent’s preferences for altruism. Perhaps Class X would be better named “Friendly”, in the “Friendly AI” sense—all beings within the class are to some extent Friendly towards each other.
Just so I understand you better, how would you compare and contrast this kind of pro-X “kin” altruism with utilitarianism?
Utilitarianism has never made much sense to me except as a handy way to talk about things abstractly when precision isn’t important
...but I suppose X would be a class of agents who consider each other’s preferences when they make utilitarian calculations? I pretty much came up with the pro-X idea less than a month ago, and haven’t thought it through very carefully.
Oh, here’s a good example of where preference utilitarianism fails which illustrates it:
10^100 intelligent people terminally prefer that 1 person is tortured. Preference utilitarianism says “do the torture”. My moral instinct says “no, it’s still wrong, no matter how many people prefer it”.
Perhaps under the pro-X system, the reason we can ignore the preferences of 10^100 people is that the preference which they have expressed lies strictly outside category X and therefore that preference can be ignored?
Whereas, if you have a Friendly Paperclipper (cares about X-agents and paperclips with some weight on each), the Friendly moral values put it within X...which means that we should now be willing to cater to its morally neutral paper-clip preferences as well.
(If this reads sloppy, it’s because my thoughts on the matter currently are sloppy)
So...I guess there’s sort of a taxonomy of moral-good, neutral-selfish, and evil preferences...and part of being good means caring about other people’s selfish preferences? And part of being evil means valuing the violation of others’ preferences? And good agents can simply ignore evil preferences.
And (under the pro-X system), good agents can also ignore the preferences of agents that aren’t in any way good...which seems like it might not be correct, which is why I say that there might be other factors in addition to pro-X that make an agent worth caring about for my moral instincts, but if they exist I don’t know what they are.
Are you perhaps confusing ‘morally wrong’ with ‘a sucky tradeoff that I would prefer not to be bound by’?
Just because torturing one person sucks, just because we find it abhorrent, does not mean that it isn’t the best outcome in various situations. If your definition of ‘moral’ is “best outcome when all things are considered, even though aspects of it suck a lot and are far from ideal”, then yes, torturing someone can in fact be moral. If your definition of ‘moral’ is “those things which I find reprehensible”, then quite probably you can never find torturing someone to be moral. However, there are scenarios where it may still be necessary, or the best option.
Are you perhaps confusing ‘morally wrong’ with ‘a sucky tradeoff that I would prefer not to be bound by’?
Nope...because...
quite probably you can never find torturing someone to be moral. However, there are scenarios where it may still be necessary, or the best option.
...because I believe that torturing someone could still instrumentally be the right thing to do on consequentialist grounds.
In this scenario, 10^100 people terminally value torturing one person, but I do not care about their preferences, because it is an evil preference.
However, in an alternate scenario, if I had to choose between 10^100 people getting mildly hurt or 1 person getting tortured, I’d choose the one person getting tortured.
In these two scenarios, the preference weights are identical, but in the first scenario the preference of the 10^100 people is evil and therefore irrelevant in my calculations, whereas in the second scenario the needs of the 10^100 outweigh the needs of the one.
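To put numbers on those two scenarios, here is a toy calculation in Python. The per-person weights and the is_evil flag are invented for illustration; the flag just marks a preference whose content is the violation of someone else’s preferences. The totals are identical in both scenarios, and only the filter changes the verdict in the first one.

    # Toy comparison: plain preference summation vs. summation that drops "evil"
    # preferences (all weights are made up, purely for illustration).
    N = 10**100

    def total(prefs):
        # prefs: list of (count, per_person_weight, is_evil) tuples supporting an action
        return sum(count * weight for count, weight, is_evil in prefs)

    def total_ignoring_evil(prefs):
        return sum(count * weight for count, weight, is_evil in prefs if not is_evil)

    one_person_spared = [(1, 1.0, False)]  # the one person's preference not to be tortured

    # Scenario 1: 10^100 people terminally prefer that the one person is tortured.
    want_torture = [(N, 1e-6, True)]
    print(total(want_torture) > total(one_person_spared))                 # True: plain summation says torture
    print(total_ignoring_evil(want_torture) > total(one_person_spared))   # False: the evil preference is dropped

    # Scenario 2: torturing the one person spares 10^100 people a mild hurt.
    avoid_mild_hurt = [(N, 1e-6, False)]
    print(total_ignoring_evil(avoid_mild_hurt) > total(one_person_spared))  # True: the many outweigh the one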
This is less a discussion about torture, and more a discussion about whose/which preferences matter. Sadistic preferences (involving real harm, not the consensual kink), for example, don’t matter morally—there’s no moral imperative to fulfill those preferences, no “good” done when those preferences are fulfilled and no “evil” resulting from thwarting those preferences.
I think you should temporarily taboo ‘moral’, ‘morality’, and ‘evil’, and simply look at the utility calculations. 10^100 people terminally value something that you ascribe zero or negative value to; therefore, their preferences either do not matter to you or will actively make your universe worse from the standpoint of your utility function.
Which preferences matter? Yours matter to you, and theirs matter to them. There’s no ‘good’ or ‘evil’ in any absolute sense, merely different utility functions that happen to conflict. There’s no utility function which is ‘correct’, except by some arbitrary metric, of which there are many.
Consider another hypothetical utility function: The needs of the 10^100 don’t outweigh the needs of the one, so we let the entire 10^100 suffer when we could eliminate it by inconveniencing one single entity. Neither you nor the 10^100 are happy with this one, but the person about to be tortured may think it’s just fine and dandy...
...I don’t denotatively disagree with anything you’ve said, but I also think you’re sort of missing the point and forgetting the context of the conversation as it was in the preceding comments.
We all have preferences, but we do not always know what our own preferences are. A subset of our preferences (generally those which do not directly reference ourselves) is termed “moral preferences”. The preceding discussion between me and Peter Hurford is an attempt to figure out what our preferences are.
In the above conversation, words like “matter”, “should”, and “moral” are understood to mean “the shared preferences of Ishaan, Dentin, and Peter_Hurford which they agree to define as moral”. Since we are all human (and similar in many other ways beyond that), we probably have very similar moral preferences...so any disagreement that arises between us is usually due to one or both of us inaccurately understanding our own preferences.
There’s no ‘good’ or ‘evil’ in any absolute sense
This is technically true, but it’s also often a semantic stopsign which derails discussions of morality. The fact is that the three of us humans have a very similar notion of “good”, and can speak meaningfully about what it is...the implicitly understood background truths of moral nihilism notwithstanding.
It doesn’t do to exclaim “but wait! good and evil are relative!” during every moral discussion...because here, between us three humans, our moral preferences are pretty much in agreement and we’d all be well served by figuring out exactly what those preferences are. It’s not like we’re negotiating morality with aliens.
Which preferences matter? Yours matter to you
Precisely...my preferences are all that matter to me, and our preferences are all that matter to us. So if 10^100 sadistic aliens want to torture...so what? We don’t care if they like torture, because we dislike torture and our preferences are all that matter. Who cares about overall utility? “Morality”, for all practical purposes, means shared human morality...or, at least, the shared morality of the humans who are having the discussion.
“Utility” is kind of like “paperclips”...yes, I understand that in the best case scenario it might be possible to create some sort of construct which measures how much “utility” various agent-like objects get from various real world outcomes, but maximizing utility for all agents within this framework is not necessarily my goal...just like maximizing paperclips is not my goal.
So, I’m curious… can you unpack what you mean by “temporarily” in this comment?
For the purposes of this conversation at least. I’ve largely got them taboo’d in general because I find them confusing and full of political connotations; I suspect at least some of that is the problem here as well.
10^100 intelligent people terminally prefer that 1 person is tortured. Preference utilitarianism says “do the torture”. My moral instinct says “no, it’s still wrong, no matter how many people prefer it”.
Yet your moral instinct is perfectly fine with having a justice system that puts innocent people in jail with a greater than 1 in 10^100 error rate.
Sure, on instrumental grounds for consequentialist reasons. Not a terminal preference.
Paperclippers, molecular modelers, and search engines seek to maximize a simple set of preferences (number of paperclips, best-fit model, best search results). They have “preferences”, but not morally relevant ones.
Usually people speak of preferences when there is a possibility of choice—the agent can meaningfully choose between doing A and doing B.
This is not the case with respect to molecular models, search engines, and light switches.
At least for search engines, I would say there exists a meaningful level of description where it can be said that the search engine chooses which results to display in response to a query, approximately maximizing some kind of scoring function.
there exists a meaningful level of description where it can be said that the search engine chooses which results to display in response to a query
I don’t think it is meaningful in the current context. The search engine is not an autonomous agent and doesn’t choose anything any more than, say, the following bit of pseudocode: if (rnd() > 0.5) { print “Ha!” } else { print “Ooops!” }
“If you search for “potatoes” the engine could choose to return results for “tomatoes” instead...but will choose to return results for potatoes because it (roughly speaking) wants to maximize the usefulness of the search results.”
“If I give you a dollar you could choose to tear it to shreds, but you instead will choose to put it in your wallet because (roughly speaking) you want to xyz...”
When you flip the light switch “on” it could choose to not allow current through the system, but it will allow current to flow through the system because it wants current to flow through the system when it is in the “on” position.
Except for degree of complexity, what’s the difference? “Choice” can be applied to anything modeled as an Agent.
When you flip the light switch “on” it could choose to not allow current through the system, but it will allow current to flow through the system because it wants current to flow through the system when it is in the “on” position.
Sorry, I read this as nonsense. What does it mean for a light switch to “want”?
To determine the “preferences” of objects which you are modeling as agents, see what occurs, and construct a preference function that explains those occurrences.
Example: This amoeba appears to be engaging in a diverse array of activities which I do not understand at all, but they all end up resulting in the maintenance of its physical body. I will therefore model it as “preferring not to die”, and use that model to make predictions about how the amoeba will respond to various situations.
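As a concrete toy version of that procedure (the observations and candidate preference functions below are invented for illustration): record what the system actually does across situations, score candidate preference functions by how many observed choices they explain, and keep the best-scoring one as your predictive model.

    # Toy sketch of inferring a preference function from observed behavior.
    # The observations and candidate preferences are invented for illustration.
    observations = [
        ("toxin gradient", "move away"),
        ("food nearby", "engulf food"),
        ("membrane damage", "repair"),
    ]

    candidates = {
        "prefers not to die": lambda choice: choice in {"move away", "engulf food", "repair"},
        "prefers to do nothing": lambda choice: choice in {"stay put", "ignore it", "do nothing"},
    }

    def score(preference):
        # count how many observed choices this preference function predicts
        return sum(preference(choice) for _situation, choice in observations)

    best = max(candidates, key=lambda name: score(candidates[name]))
    print(best)  # -> "prefers not to die"; use it to predict responses to new situations

None of this says anything about the amoeba’s internals; it just gives you a compressed, predictive summary, which is all the agent model is meant to do.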
I think the light switch example is far-fetched, but the search engine isn’t. The point is whether there exists a meaningful level of description where framing the system’s behavior in terms of making choices to satisfy certain preferences is informative.
Don’t forget that the original context was morality. You don’t think it is far-fetched to speak of morality of search engines?
Yes, it is.
The distinction you are making between the input-output function of a human as a “choice” vs. the input-output of a machine as “not-a-choice” sounds very reminiscent of the traditional naive / confused model of free will that people commonly have before dissolving the question...but you’re a frequent poster here, so perhaps I’ve misunderstood your meaning. Are you using a specialized definition of the word “choice”?
I have no wish for this to develop into a debate about free will. Let me point out that just because I know the appropriate part of the Sequences does not necessarily mean I agree with it.
As a practical matter, speaking about choices of light switches seems silly. Given this, I don’t see why speaking about choices of search engines is not silly. It might be useful conversational shorthand in some contexts, but I don’t think it is useful in the context of talking about morality.
Let me point out that just because I know the appropriate part of the Sequences does not necessarily mean I agree with it.
Ah, ok—sorry. The materialist, dissolved view of free-will-related questions has been a strongly held view of mine since a very young age, so my prior for a person who is aware of these ideas yet subscribes to what I’ll call the “naive view”, for lack of a better word, is very low.
It’s not really the particulars of the Sequences which are in question here—the people who say free will doesn’t exist, the people who say it does but redefine free will in funny ways, the panpsychists, the compatibilists and non-compatibilists, all share a non-dualist view which does not allow them to label the search engine’s processes and the human’s processes as fundamentally, qualitatively different processes. This is a deep philosophical divide that has been debated for, as far as I am aware, at least two thousand years.
As a practical matter, speaking about choices of light switches seems silly. Given this, I don’t see why speaking about choices of search engines is not silly.
By analogy, speaking of choices of humans seems silly, since humans are made of the same basic laws.
The fundamental disagreement here runs rather deeply—it’s not going to be possible to talk about this without diving into free will.
Philosophical disagreements aside, that doesn’t seem to be a good way to construct priors for other people’s views.
If I understood the causal mechanisms underlying the actions of humans as well as I do those underlying lightswitches, talking about the former as “choices” would seem as silly to me as talking that way about the latter does.
But I don’t, so it doesn’t.
I assume you don’t understand the causal mechanisms underlying the actions of humans either. So why does talking about them as “choices” seem silly to you?
I agree with you. Whether we model something as an agent or an object is a feature of our map, not the territory. It’s not useful to model light switches as agents because they are too simple, and looking at them through the lens of preferences is not simple or informative. Meanwhile, it is useful to model humans partially as preference maximizing agents to make approximations.
However, in the context of the larger discussion, I interpret Lumifer as treating the distinction between “choice” and “event” as a feature of the territory itself, and positing a fundamental qualitative difference between a “choice” and other sorts of events. My reply should be seen as an assertion that such qualitative differences are features of the map, not the territory—if it’s impossible to model a light switch as having choices, then it’s also impossible to model a human as having choices. (My actual belief is that it’s possible to model both as having choices or not having them.)
Is your actual belief that there are equivalent grounds for modeling both either way?
If so, I disagree… from my own perspective, modeling people as preference-maximizing agents is significantly more justified (due to differences in the territory) than modeling a light switch that way.
If not, to what do you attribute the differential?
Is your actual belief that there are equivalent grounds for modeling both either way?
...it is possible to model things either way, but it is more useful for some objects than others.
It’s not useful to model light switches as agents because they are too simple, and looking at them through the lens of preferences is not simple or informative. Meanwhile, it is useful to model humans partially as preference maximizing agents to make approximations.
Modeling an object as an agent is useful when the object exhibits a pattern of behavior which is roughly consistent with preference maximizing. A search engine is well modeled as an agent. A human is very well modeled as an agent.
A light switch is very poorly modeled as an agent. Thinking of it in terms of a preference pattern doesn’t make it any easier to predict its behavior. But you can model it as an agent, if you’d like.
By “justified” do you mean “useful”?
I am willing to adopt “useful” in place of “justified” if it makes this conversation easier. In which case my question could be rephrased “Is it equally useful to model both either way?”
To which your answer seems to be no… it’s more useful to model a human as an agent than it is a light-switch. (I’m inferring, because despite introducing the “useful” language, what you actually say instead introduces the language of something being “well-modeled.” But I’m assuming that by “well-modeled” you mean “useful.”)
And your answer to the followup question is that the pattern of behavior of a light switch is different from that of a search engine or a human, such that adopting an intentional stance towards the former doesn’t make it easier to predict.
Have I understood you correctly?
Yup. Modeling something as a preference-maximizing agent is generally useful for things which systematically behave in ways that maximize certain outcomes in a diverse array of situations. It allows you to make accurate predictions even when you don’t fully understand the mechanics of the processes generating the events you are predicting.
(I distinguished useful and justified because I wasn’t sure if “justified” had moral connotations in your usage)
Edit: On reading the wiki, I tend to agree with the views that the wiki attributes to Dennett. Thanks for the reference and the word “intentional stance”.
OK. So, having clarified that, I return to your initial comment:
The distinction you are making between the input-output function of a human as a “choice” vs. the input-output of a machine as “not-a-choice” sounds very reminiscent of the traditional naive / confused model of free will that people commonly have before dissolving the question
...and am as puzzled by it as I was in the first place.
You agree that the input-output function of a human differs from the input-output of a machine like a light switch in ways that make it more useful to model the former but not the latter as maximizing preferences. (To adopt the intentional stance towards the former and the design stance towards the latter, in Dennett’s terminology.)
So, given that, what is your objection to Lumifer’s distinction? “Choice” seems like a perfectly reasonable word to use when taking an intentional stance, and to not use when taking a design stance.
When I asked earlier, you explained that your objection had to do with attributing “territory-level” differences to humans and machines, when it’s really a “map-level” objection… that it’s possible to talk about a light-switch’s choices, or not talk about a human’s choices, so it’s not really a difference in the system at all, just a difference in the speaker.
But given that you agree that there’s a salient “territory-level” difference between the two systems (specifically, the differences which make the intentional stance more useful than the design stance wrt humans, but not wrt light-switches), I don’t quite get the objection. Sure, it’s possible to take either stance towards either system, but it’s more useful to take the intentional stance towards humans, and that’s a “fact about the territory.”
Because in the preceding comment, I was demonstrating that we should not morally care about light switches, search engines, and paperclippers...whereas we should morally care about fishes, dogs, and humans… because of differences in the preference profiles of these beings when they are modeled as agents.
Peter Hurford disagreed with me on the non-moral status of the paper-clipper. I was demonstrating the non-moral status of a being which cared only for paper clips by analogy to a search engine (a being which only cares about bringing up the best search result).
Whereas what Lumifer was saying is that the very premise that a search engine could have choices was fundamentally flawed (which, if true, would cause the whole analogy to break down).
The thing is, it’s not fundamentally flawed to think of a search engine as having choices. Sure, search engines are a little less usefully modeled as agent-like when compared to humans, but it’s just a matter of degree.
the input-output function of a human as a “choice” vs. the input-output of a machine as “not-a-choice”
I was objecting to his hard, qualitative binary, not your and Dennett’s soft/quantitative spectrum.
Just for the sake of completeness, I’ll wait for you to follow-up on this before continuing our discussion here.