People are bothered by some words and phrases.
Recently, I learned that the original meaning of “tl;dr” has stuck in people’s minds such that they don’t consider it a polite response. That’s good to know.
Some things that bother me are:
Referring to life extension as “immortality”.
Referring to AIs that don’t want to kill humans as “friendly”.
Referring to AIs that want to kill humans as simply “unfriendly”.
Expressing disagreement as false lack of understanding, e.g. “I don’t know how you could possibly think that.”
Referring to an “individual’s CEV”.
Referring to “the singularity” instead of “a singularity”.
I’m not going to pretend that referring to women as “girls” inherently bothers me, but it bothers other people, so by extension it bothers me, and I wouldn’t want it excluded from this discussion.
Some say to say not “complexity” or “emergence”.
Expressing disagreement as false lack of understanding, e.g. “I don’t know how you could possibly think that.”
This, more than the others, is a sign of a pernicious pattern of thought. By affirming that someone’s point of view is alien, we fail to use our curiosity, we set up barriers to communication, and we can make any opposing viewpoint seem less reasonable.
Referring to AIs that don’t want to kill humans as “friendly”.
Referring to AIs that want to kill humans as simply “unfriendly”.
“Friendly” as I’ve seen it used on here means “an AI that creates a world we won’t regret having created,” or something like that. It might be good to link to an explanation every time the term is used for the benefit of new readers, but I don’t think it’s necessary. “Unfriendly” means “any AI that doesn’t meet the definition of Friendly,” or “an AI that we would regret creating (usually because it destroys the world).” I think these are good, consistent ways of using the term.
Most possible AIs have no particular desire either to kill humans or to avoid doing so. They are generally called “Unfriendly” because creating one would be A Bad Thing. Many possible AIs that want to avoid killing humans are also Unfriendly because they have no problem doing other things we don’t want. The important thing, when classifying potential AIs, is whether it would be a very good idea or a very bad idea to create one. That’s what the Friendly/Unfriendly distinction should mean.
Expressing disagreement as false lack of understanding, e.g. “I don’t know how you could possibly think that.”
I’ve found that saying, “I don’t think I understand what you mean by that” or “I don’t see why you’re saying so” is a useful tactic when somebody says something apparently nonsensical. The other person usually clarifies their position without being much offended, and one of two things happens. Either they were saying something true which I misinterpreted, or they really did mean something I disagree with, at which point I can say so.
Referring to an “individual’s CEV”.
I think this is a good idea, because humans aren’t expected utility maximizers. We have different desires at different times, we don’t always want what we like, etc. An individual’s CEV would be the coherent combination of all that person’s inconsistent drives: what that person is like at reflective equilibrium.
Referring to “the singularity” instead of “a singularity”.
Referring to women as “girls”.
These ones bother me too, and I support not doing them.
Expressing disagreement as a false lack of understanding
I’ve found that saying, “I don’t think I understand what you mean by that” or “I don’t see why you’re saying so” is a useful tactic when somebody says something apparently nonsensical.
Yes, when you actually don’t understand, saying that you don’t understand is rarely a bad idea. It’s when you understand but disagree that proclaiming an inability to comprehend the other’s viewpoint is ill-advised.
Referring to an “individual’s CEV”.
I think this is a good idea, because humans aren’t expected utility maximizers.
I could be wrong, but this may be a terminology issue.
Coherence: Strong agreement between many extrapolated individual volitions …
It would indeed appear that EY originally defined coherence that way. I think it’s legitimate to extend the meaning of the term to “strong agreement among the different utility functions an individual maximizes in different situations.” You don’t necessarily agree, and that’s fine, because this is partly a subjective issue. What, if anything, would you suggest instead of “CEV” to refer to a person’s utility function at reflective equilibrium? Just “eir EV” could work, and I think I’ve seen that around here before.
I think it’s legitimate to extend the meaning of the term to “strong agreement among the different utility functions an individual maximizes in different situations.”
Me too. I consider the difference in coherency issues between CEV(humanity) and CEV(pedanterrific) to be one of degree, not kind. I just thought that might be what lessdazed was objecting to, that’s all.
Okay, so I’m not the only one. Lessdazed: is that your objection to “individual CEV” or were you talking about something else?
Referring to AIs that don’t want to kill humans as “friendly”.
Necessary, within an infinitesimal subset of mindspace around friendliness, but not quite sufficient. Examples of cases where this is a problem include when people go around saying “a friendly AI may torture …”. Because that is by definition not friendly. Any other example of “what if a friendly AI did …” is also a misuse of the idea.
Referring to AIs that want to kill humans as simply “unfriendly”.
That seems entirely legitimate. uFAI is a rather useful and well-established name for an overwhelmingly important concept. I honestly think you just need to learn more about how the concept of Unfriendly AI is used, because this is not a problem term. AIs that want to kill humans (i.e. most of them) are unfriendly.
Referring to life extension as “immortality”.
Do people even do that? I haven’t seen it. People attempting immortality (living indefinitely) will obviously use whatever life extension practices they think will help achieve that end. Yet if anyone ever said, for example, “I’m having daily resveratrol for immortality” then I suggest they were being tongue-in-cheek.
AIs that want to kill humans (i.e. most of them) are unfriendly.
Just as “want” does not unambiguously exclude instrumental values in English, “unfriendly” does not unambiguously include instrumental values in English. As for the composite technical term “Unfriendly Artificial Intelligence”...
If you write “Unfriendly Artificial Intelligence” alone, regardless of other context, you are technically correct. If you want to be correct again, type it again, in Wingdings if the mood strikes you; you will still be technically correct, though with even less of a chance at communicating. In the context of entire papers, there is other supporting context, so it’s not a problem. In the context of secondary discussions, either account for those liable to be confused, or you can consider them confused.
We might disagree about the extent of confusion around here, we might disagree as to how important that is, we might disagree as to how much of that is caused by unclear forum discussions, and we might disagree about the cost of various solutions.
Regarding the first point, even those confident enough to post their thoughts on the issue make mistakes. Regarding the fourth point, assume I’m not advocating an inane extreme solution such as requiring you to define words in every comment you make, but rather thoughtfulness.
Examples of cases where this is a problem include when people go around saying “a friendly AI may torture …”. Because that is by definition not friendly. Any other example of “what if a friendly AI did …” is also a misuse of the idea.
No torture? You’re guessing as to what you want, what people want, what you value, what there is to know, etc. Guessing reasonably, but it’s still just conjecture and not a necessary ingredient in the definition (as I gather it’s usually used).
Or, you’re using “friendly” in the colloquial rather than strictly technical sense, which is the opposite of what you criticized me for when I said not to speak about unfriendly AI that way! My main point is that care should be taken to explain what is meant when navigating among differing conceptions within and between colloquial and technical senses.
Or, you’re using “friendly” in the colloquial rather than strictly technical sense
No, you’re wrong about the dichotomy there. The words were used legitimately with respect to a subjectively objective concept. But never mind that.
Of all the terms in “Unfriendly Artificial Intelligence”, I’d say ‘unfriendly’ is the most straightforward. I encourage folks to go ahead and use it, and to elaborate further on what specifically they are referring to as the context makes necessary.
I encourage folks to go ahead and use it, and to elaborate further on what specifically they are referring to as the context makes necessary.
This implies I’m discouraging use of the term, which I’m not; I raised the issue to point out that for this subject, specificity is often not supplied by context alone and needs to be made explicit.
What is confusing is when people describe a scenario in which it is central that an AI has human suffering as a positive terminal value, and they use “unfriendly” alone as a label to discuss it. The vast majority of possible minds are the ones most overlooked: the indifferent ones. If something applies to malicious minds but not indifferent or benevolent ones, one can do better than describing the malicious minds as “either indifferent or malicious”, i.e. “unfriendly”.
I would also discourage calling blenders “not-apples” when specifically referring to machines that make apple sauce. Obviously, calling a blender a “not-apple” will never be wrong. There’s nothing wrong with talking about non-apples in general, nor talking about distinguishing them from apples, nor with saying that a blender is an example of a non-apple, nor with saying that a blender is a special kind of non-apple that, unlike other non-apples, is an anti-apple.
But when someone describes a blender and just calls it a “non-apple”, and someone else starts talking about how almost nothing is a non-apple because most things don’t pulverize apples, and every few times the subject is raised someone assumes a “non-apple” is something that pulverizes apples, it’s time for the first person to implement low-cost clarifications to his or her communication in certain contexts.
What is confusing is when people describe a scenario in which it is central that an AI has human suffering as a positive terminal value, and they use “unfriendly” alone as a label to discuss it. The vast majority of possible minds are the ones most overlooked: the indifferent ones. If something applies to malicious minds but not indifferent or benevolent ones, one can do better than describing the malicious minds as “either indifferent or malicious”, i.e. “unfriendly”.
I would use “malicious” in that context. A specific kind of uFAI requires a more specific word if you expect people to distinguish it from all other uFAIs.
Referring to AIs that want to kill humans as simply “unfriendly”.
That seems entirely legitimate. … AIs that want to kill humans (i.e. most of them) are unfriendly.
I interpreted lessdazed as objecting to the lack of emotional impact—calling it “unfriendly artificial intelligence” has a certain blasé quality, particularly when compared to “all-powerful abomination bent on our destruction”, say.
Edit: Apparently I was incorrect.
People get away with this and more disingenuous forms of this far too often.