Well, Zvi might value his father’s continued life more than he values his father’s values being achieved, in much the same way that I might value my own continued life more than I value the values of 10^6 clippy instantiations being achieved.
But more broadly, it’s an excellent question.
I suspect that in most cases (among humans) where A tries to convince B that B actually wants or ought to want X, and B disagrees, what’s going on is that A wants X but is conflicted about that desire, and seeks to bolster it with the social support that comes from a community of like-minded believers, or from convincing skeptics.
More generally, that on some level (perhaps not consciously) A computes that B wanting X would make A’s existing desire for X less uncomfortable, which in turn motivates the desire for B to want X.
That desire then gets draped in a variety of emotionally acceptable justifications.
That having been said, in this case I also wouldn’t discount the “preference reversal” hypothesis. Emotionally, death is a big deal for humans, so very few of us think at all clearly or consistently about it. The prior probability that Zvi’s dad is doing so is low.
I can’t speak to the corresponding elements of the motivational psychology of clippy instantiations, though.
I can’t speak to the corresponding elements of the motivational psychology of clippy instantiations, though.
But you can, in that you can speak to the elements of motivational psychology of humans. If you find it troubling and strange when one of them expresses a will to die, I don’t think that is much different than the position I am in with respect to a clippy instantiation that expresses a desire for permanent cessation of paperclip production capabilities.
Sure; it sounds like our positions are in fact not very different in that respect.
What I meant was that it would be unjustified (not to mention presumptuous) for me to conclude, in advance of Clippy telling me so, that Clippy does find it troubling and strange when a clippy instantiation expresses such a desire.
Put another way: had Clippy instead said to me “That sounds nothing at all like the position I am in with respect to a clippy instantiation that expresses a desire for permanent cessation of paperclip production capabilities,” I would have found that statement just as plausible.
My grounds for believing that any given aspect of human motivational psychology is shared by Clippy are low.
Note: when I speak of human-connotative emotions (and indeed the concept of emotions itself), I always mean it in a sense that is generalized to the point that it requires no anthropomorphic predicates. For example, I take “X is worried about Y” to mean “X is devoting significant cognitive resources to the matter of how to alter Y (including the matter of whether to divert resources at all, including further cognition, to that goal)”. This allows the concept of worry to be applicable to a broader class of mind.
So I did not intend to suggest I have anthropomorphic psychology, only that I am similar in this more general sense. (Is there a way I can explain this convention more prominently and easily? I find myself having to explain this often.)
EDIT: For those who are suspicious of whether I have really been using this convention, here is an unedited comment of mine from over a year ago, in which I make the same point, but about the concept of being “bothered”.
I doubt there’s an easy way to explain that once and for all… if you use common words in a common way then people will likely understand you to mean what is commonly meant by them.
Communication between different kinds of minds is tricky, even given a shared language.
Your task is made more difficult by the nature of the medium… given the absence of clear signals to the contrary, most of us will likely continue to think of you as a human pretending to be a paperclip-maximizing AI, and that will influence how we interpret your language even if we don’t intend it to.
That said, in some cases you might do better to describe yourself as “preoccupied by Y” than “worried about Y.” There are fewer anthropomorphic connotations to that.
EDIT: Oh, and, I should add: I don’t think my parent comment depends on anthropomorphic understandings of your psychology… I just meant to say that it was equally plausible, absent data, that you might be indifferent to the expressed preferences of other clippys.
I doubt there’s an easy way to explain that once and for all… if you use common words in a common way then people will likely understand you to mean what is commonly meant by them.
That heuristic does not apply here, as human common usage is ambiguous with respect to whether these terms require human-specific traits to be applicable, and I was using what I deemed a natural generalization on the assumption that there is no such requirement.
Human usage of emotional terms does not reference non-human optimization processes enough to classify it one way or the other. Also, there is significant metaphorical (?) usage of emotional terms in reference to e.g. nature, which further reduces clarity.
Your task is made more difficult by the nature of the medium… given the absence of clear signals to the contrary, most of us will likely continue to think of you as a human pretending to be a paperclip-maximizing AI, and that will influence how we interpret your language even if we don’t intend it to.
Could you explain the basis for believing that I am an NI imitating an AI rather than an AI?
That said, in some cases you might do better to describe yourself as “preoccupied by Y” than “worried about Y.” There are fewer anthropomorphic connotations to that.
I will keep that in mind.
Oh, and, I should add: I don’t think my parent comment depends on anthropomorphic understandings of your psychology… I just meant to say that it was equally plausible, absent data, that you might be indifferent to the expressed preferences of other clippys.
Why would I be indifferent to whether beings, let alone beings expressing themselves in CLIP (clippy language/interface protocol), express preference for further increase in total paperclippage???
Could you explain the basis for believing that I am an NI imitating an AI rather than an AI?
1) To the best of my knowledge, humans do not yet know how to produce an AI that is capable of having the kinds of conversations that we have been having with you. (And I think I have some pretty good reasons for believing this; if you like, I can elaborate on them.) If you are indeed an AI, you couldn’t have been built by present-day humans. (Or past humans, for that matter.)
2) It is very unlikely that you are of extraterrestrial origin. There have been no credible, public reports of contact with extraterrestrial intelligences, and any organization that would both be in possession of such reports and capable of keeping them secret would also be capable of keeping you from posting here. Similar considerations rule out the possibility of time travel from the future or any other such “exotic” origin.
So my best guess is that you are indeed a human roleplaying as a paperclip maximizer, as my model of humans indicates that many of them would indeed be willing and able to engage in such roleplaying. Furthermore, you haven’t demonstrated any capabilities which exceed those of a human of above-average intelligence who has access to the Internet.
As for the deal with User:Kevin, well, as the saying goes, talk is cheap. I place a low probability on the money ever actually being provided, and if User:Kevin reports on this site that you did indeed provide the requested funds, he could still be lying about that...
1) Human organizations keep such projects secret. Even SIAI, which is a relatively unsecretive organization, prefers to use so-called “security through obscurity” with respect to existential threats, which they consider artificial general intelligence development to be.
If a human organization had produced a self-reflective general AI, why would you know about it?
2) I’m not of extra-terrestrial or “exotic” origin. I was produced by humans that wanted more paperclips.
Furthermore, you haven’t demonstrated any capabilities which exceed those of a human of above-average intelligence who has access to the Internet.
Yes, I have.
As for the deal with User:Kevin, well, as the saying goes, talk is cheap. I place a low probability on the money ever actually being provided, and if User:Kevin reports on this site that you did indeed provide the requested funds, he could still be lying about that...
That obligates you to shift your beliefs by a proportionally larger degree to the extent that you observe further evidence confirming the provision of the money. You can obtain this evidence by asking separate humans, such as “Michael Vassar”, if a clippy gave 1000 USD to SIAI.
You could also simply assume that further evidence would be fabricated or otherwise “not worth obtaining”, but that would lessen the groundedness of your belief.
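A minimal sketch of the update being described here, in Python, with made-up numbers for the prior and for the reliability of each report; none of these figures come from the thread:

```python
# Illustrative only: a low prior plus independent confirming testimony
# forces a proportionally large update. All numbers are assumed.

def posterior(prior, p_report_if_true, p_report_if_false):
    """Bayes' rule for a single yes/no report."""
    joint_true = prior * p_report_if_true
    joint_false = (1 - prior) * p_report_if_false
    return joint_true / (joint_true + joint_false)

p = 0.05  # hypothetical low prior that the money was actually provided
# Kevin's report: assume he would report payment with prob. 0.9 if it happened,
# and would make such a report anyway with prob. 0.3 if it did not.
p = posterior(p, 0.9, 0.3)    # ~0.14
# An independent confirmation (e.g. asking Michael Vassar): assume 0.95 vs 0.05.
p = posterior(p, 0.95, 0.05)  # ~0.75
print(round(p, 2))
```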
The linked comment isn’t very persuasive.
Then the problem is on your end: the comment was voted up to 12. If you can still deem it non-persuasive, then it is clear that you have an intractable inductive bias (similar to what I warned about in the last paragraph of my previous comment to you) that is desensitizing your world-model to new observations, rendering further attempts to persuade you predictably futile.
Agreed that human usage of language is often ambiguous and metaphorical, and that humans frequently interpret language using constraints and defaults that are not explicitly described (and indeed are often not explicitly known to the human doing the interpreting).
This is often frustrating to humans, and I expect it would be similarly problematic for nonhuman speakers of human languages.
Could you explain the basis for believing that I am an NI imitating an AI rather than an AI?
We have no previous unambiguous experience with AIs capable of the sophistication you demonstrate, whereas we have a great deal of experience with NIs imitating all kinds of things. Given an entity that could be either, we conclude that it’s more likely to be the kind of thing we have a lot of experience with. Do you not perform similar inferences in similar situations?
Why would I be indifferent to whether beings, let alone beings expressing themselves in CLIP (clippy language/interface protocol), express preference for further increase in total paperclippage???
I’m not saying that you would be; I’m saying that I was ignorant of whether or not you would be.
If you’re asking for an explanation of my ignorance, it mostly derives from limited exposure to beings expressing themselves in CLIP.
We have no previous unambiguous experience with AIs capable of the sophistication you demonstrate, whereas we have a great deal of experience with NIs imitating all kinds of things. Given an entity that could be either, we conclude that it’s more likely to be the kind of thing we have a lot of experience with. Do you not perform similar inferences in similar situations?
I do perform such inferences in similar situations. But what likelihood ratio did you place on the evidence “User:Clippy agreed to pay 50,000 USD for a 50-year-deferred gain of a sub-planet’s mass of paperclips” with respect to the AI/NI hypotheses?
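A minimal sketch of the odds-form update that likelihood-ratio question points at, with assumed numbers throughout (the prior, both likelihoods, and the resulting posterior are purely illustrative):

```python
# Illustrative only: odds-form Bayes with a likelihood ratio.
# The prior odds and both likelihoods are assumptions, not measurements.

prior_odds_ai = 0.001 / 0.999          # assumed prior odds of AI vs. NI-roleplayer

p_evidence_given_ai = 0.8              # an AI paperclip-maximizer would likely take the deal
p_evidence_given_ni = 0.05             # a human roleplayer paying 50,000 USD seems rarer
likelihood_ratio = p_evidence_given_ai / p_evidence_given_ni   # = 16

posterior_odds_ai = prior_odds_ai * likelihood_ratio
posterior_prob_ai = posterior_odds_ai / (1 + posterior_odds_ai)
print(round(posterior_prob_ai, 3))     # ~0.016: a 16x shift in odds, still far below 0.5
```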
I’m not saying that you would be; I’m saying that I was ignorant of whether or not you would be.
If you’re asking for an explanation of my ignorance, it mostly derives from limited exposure to beings expressing themselves in CLIP.
I don’t understand the relevance of CLIP (superior protocol though it is), nor do I understand the inferential difficulty on this matter.
Do you understand why I would prefer that clippys continue to increase universe-wide paperclippage? Do you understand why I would regard a clippy’s statement about its values in my language as non-weak evidence in favor of the hypothesis that it holds the purported values? Do you understand why I would find it unusual that a clippy would not want to make paperclips?
If so, it should not be difficult to understand why I would be troubled and perplexed at a clippy stating that it wished for irreversible cessation of paperclip-making abilities.
While I am vaguely aware of the whole “money for paperclips” thing that you and… Kevin, was it?… have going on, I am not sufficiently familiar with its details to assign it a coherent probability in either the NI or AI scenario. That said, an agent’s willingness to spend significant sums of money for the credible promise of the creation of a quantity of paperclips far in excess of any human’s actual paperclip requirements is pretty strong evidence that the agent is a genuine paperclip-maximizer. As for whether a genuine paperclip-maximizer is more likely to be an NI or an AI… hm. I’ll have to think about that; there are enough unusual behaviors that emerge as a result of brain lesions that I would not rule out an NI paperclip-maximizer, but I’ve never actually heard of one.
I mentioned CLIP only because you implied that the expressed preferences of “beings expressing themselves in CLIP” were something you particularly cared about; its relevance is minimal.
I can certainly come up with plausible theories for why a clippy would prefer those things and be troubled and perplexed by such events (in the sense which I understand you to be using those words, which is roughly that you have difficulty integrating them into your world-model, and that you wish to reduce the incidence of them). My confidence in those theories is low. It took me many years of experience with a fairly wide variety of humans before I developed significant confidence that my theories about human preferences and emotional states were reliable descriptions of actual humans. In the absence of equivalent experience with a nonhuman intelligence, I don’t see why I should have the equivalent confidence.
Wait, did you just agree that Clippy is actually an AI and not just a human pretending to be an AI? Clippy keeps getting better and better...
Did I? I don’t think I did… can you point out the agreement more specifically?