There just isn’t any reason that the former implies the latter. Either kind of caring is possible but they are not the same thing (and the second is likely more complex than the first).
(Very hastily written:) The former doesn’t imply the latter, it’s just that both interpreting denotation and interpreting connotation are within an order of magnitude as difficult as each other and they aren’t going to be represented by a djinn or an AGI as two distinct classes of interpretation, there’s no natural boundary between them. I mean I guess the fables can make the djinns weirdly stunted in that way, but then the analogy to AGIs breaks down, because interpreting denotation but not connotation is unnatural and you’d have to go out of your way to make an AGI that does that. By hypothesis the AGI is already interpreting natural speech, not compiling code. I mean you can argue that denotation and connotation actually are totally different beasts and we should expect minds-in-general to treat them that way, but my impression is that what we know of linguistics suggests that isn’t the case. (ETA: And I mean even just interpreting the “denotation” requires a lot of context already, obviously; why are we taking that subset of context for granted while leaving out only the most important context? Makes sense for a moralistic djinn fable, doesn’t make sense by analogy to AGI.) (ETA2: Annoyed that this purely epistemic question is going to get bogged down in and interpreted in the light of political boo- / yay-AI-risk-prevention stances, arguments-as-soldiers style.)
The former doesn’t imply the latter, it’s just that both interpreting denotation and interpreting connotation are within an order of magnitude as difficult as each other
This much is true. It is somewhat more difficult to implement a connotation-honoring genie (because that requires more advanced referencing and interpretation), but both tasks fall under already defined areas of narrow AI. The difference in difficulty is small enough that I more or less ignore it as a trivial ‘implementation detail’. People could create (either as fiction or as AI) either of these things, and each has different problems.
Annoyed that this purely epistemic question is going to get bogged down in and interpreted in the light of political boo- / yay-AI-risk-prevention stances, arguments-as-soldiers style.
Your mind reading is in error. To be honest this seems fairly orthogonal to AI-risk-prevention stances. From what I can tell someone with a particular AI stance hasn’t got an incentive either way because both these types of genie are freaking dangerous in their own way. The only difference acknowledging the possibility of connotation honouring genies makes is perhaps to determine which particular failure mode you potentially end up in. Having a connotation honouring genie may be an order of magnitude safer than a literal genie but unless there is almost-FAI-complete code in there in the background as a safeguard it’s still something I’d only use if I was absolutely desperate. I round off the safety difference between the two to negligible in approximately the same way I round off the implementation difficulty difference.
As a ‘purely epistemic question’ your original claim is just plain false. However, there is another valid point that we have both skirted around the edges of without explaining adequately. I (think that I) more or less agree with what you are saying in this follow-up comment. I suggest that the main way that AI interest influences this conversation is that it promotes (and is also caused by) interest in being accurate about precisely what the expected outcomes of goal systems are and just what the problems of a given system happen to be.
Sorry, didn’t mean to imply you’d be the one mind-killed, just the general audience. From previous interactions I know you’re too rational for that kind of perversion.
Having a connotation honouring genie may be an order of magnitude safer than a literal genie
I actually think it’s many, many orders of magnitude safer, but that’s only because a denotation honoring genie is just egregiously stupid. A connotation honoring genie still isn’t safe unless “connotation-honoring” implies something at least as extensive and philosophically justifiable as causal validity semantics. I honestly expect the average connotation-honoring genie will lie in-between a denotation-honoring genie and a bona fide justifiable AGI—i.e., it will respect human wishes about as much as humans respect, say, alligator wishes, or the wishes of their long-deceased ancestors. On average I expect an Antichrist, not a Clippy. But even if such an AGI doesn’t kill all of us and maybe even helps us on average, the opportunity cost of such an AGI is extreme, and so I nigh-wholeheartedly support the moralistic intuitions that traditionally lead people to use djinn analogies. Still, I worry that the underlying political question really is poisoning the epistemic question in a way that might bleed over into poor policy decisions re AGI. (Drunk again, apologies for typos et cetera.)
Sorry, didn’t mean to imply you’d be the one mind-killed, just the general audience. From previous interactions I know you’re too rational for that kind of perversion.
Thank you for your generosity but in all honesty I have to deny that. I at times notice in myself the influence of social political incentives. I infer from what I do notice (and, where appropriate, resist) that there are other influences that I do not detect.
I honestly expect the average connotation-honoring genie will lie in-between a denotation-honoring genie and a bona fide justifiable AGI—i.e., it will respect human wishes about as much as humans respect, say, alligator wishes, or the wishes of their long-deceased ancestors.
That seems reasonable.
But even if such an AGI doesn’t kill all of us and maybe even helps us on average, the opportunity cost of such an AGI is extreme, and so I nigh-wholeheartedly support the moralistic intuitions that traditionally lead people to use djinn analogies.
I agree that there is potentially significant opportunity cost but perhaps if anything it sounds like I may be more willing to accept this kind of less-than-ideal outcome. For example, if right now I was forced to make a choice whether to accept this failed utopia based on a fully connotation-honoring artificial djinn or to leave things exactly as they are, I suspect I would accept it. It fails as a utopia but it may still be better than the (expected) future we have right now.
I think you have a point, Will (an AI that interprets speech like a squish djinn would require deliberate effort and is proposed by no one), but I think that it is possible to construct a valid squish djinn/AI analogy (a squish djinn interpreting a command would be roughly analogous to an AI that is hard coded to execute that command).
Sorry to everyone for the repetitive statements and the resulting wall of text (that unexpectedly needed to be posted as multiple comments since it was too long). Predicting how people will interpret something is nontrivial, and explaining concepts redundantly is sometimes a useful way of making people hear what you want them to hear.
Squish djinn is here used to denote a mind that honestly believes that it was actually instructed to squish the speaker (in order to remove regret for example), not a djinn that wants to hurt the speaker and is looking for a loophole. The squish djinn only cares about doing what it is requested to do, and does not care at all about the well-being of the requester, so it could certainly be referred to as hostile to the speaker (since it will not hesitate to hurt the speaker in order to achieve its goal (of fulfilling the request)). A cartoonish internal monologue of the squish djinn would be: “the speaker clearly does not want to be squished, but I don’t care what the speaker wants, and I see no relation between what the speaker wants and what it is likely to request, so I determine that the speaker requested to be squished, so I will squish” (which sounds very hostile, but contains no will to hurt the speaker). The typical story djinn is unlikely to be a squish djinn (they usually have a motive to hurt or help the speaker, but are restricted by rules (a clever djinn that wants to hurt the speaker might still squish, but not for the same reasons as a squish djinn (such a djinn would be a valid analogy when opposing a proposal of the type “let’s build some unsafe mind with selfish goals and impose rules on it” (such a project can never succeed, and the proposer is probably fundamentally confused, but a simple and correct and sufficient counter argument is: “if the project did succeed, the result would be very bad”)))).
To expand on you having a point. I have obviously not seen every AI proposal on the internet, but as far as I know, no one is proposing to build a wish granting AI that parses speech like a squish djinn (and ending up with such an AI would require a deliberate effort). So I don’t think the squish djinn is a valid argument against proposed wish granting AIs. Any proposed or realistic speech interpreting AI would (as you say) parse English speech as English speech. An AI that makes arbitrary distinctions between different types of meaning would need serious deliberate effort, and as far as I know, no one is proposing to do this. This makes the squish djinn analogy invalid as an argument against proposals to build a wish granting AI. It is a basic fact that statements do not have specified “meanings” attached to them, and AI proposals take this into account. An extreme example that makes this very clear would be Bill saying: “Steve is an idiot” to two listeners where one listener will predictably think of one Steve and the other listener will predictably think of some other Steve (or a politician making a speech that different demographics will interpret differently and to their own liking). Bill (or the politician) does not have a specific meaning of which Steve (or which message) they are referring to. This speaker is deliberately making a statement in order to have different effects on different audiences. Another standard example is responding to a question about the location of an object with: “look behind you” (anyone that is able to understand English and has no serious mental deficiencies would be able to guess that the meaning is that the object is/might be behind them (as opposed to following the order, being surprised to see the object lying there, and thinking “what a strange coincidence”)).
Building an AI that would parse “look behind you” without understanding that the person is actually saying “it is/might be behind you” would require deliberate effort as it would be necessary to painstakingly avoid using most information while trying to understand speech. Tone of voice, body language, eye gaze, context, prior knowledge of the speaker, models of people in general, etc, etc all provide valuable information when parsing speech. And needing to prevent an AI from using this information (even indirectly, for example through models of “what sentences usually mean”) would put enormous additional burdens on an AI project. An example in the current context would be writing: “It is possible to communicate in a way so that one class of people will infer one meaning and take the speaker seriously and another class of people will infer another meaning and dismiss it as nonsense. This could be done by relying on the fact that people differ in their prior knowledge of the speaker and in their ability to understand certain concepts. One can use non standard vocabulary, take non standard strong positions, describe non common concepts, or otherwise give signals indicating that the speaker is a person that should not be taken seriously so that the speaker is dismissed by most people as talking nonsense. But people that know the speaker would see a discrepancy and look closer (and if they are familiar with the non standard concepts behind all the “don’t listen to me” signs they might infer a completely different message).”.
To expand on the valid AI squish djinn analogy. I think that hard coding an AI that executes a command is practically impossible. But if it did succeed, it would act sort of like a squish djinn given that command. And this argument/analogy is a valid and sufficient argument against trying to hard code such a command, making it relevant as long as there exist people that propose to hardcode such commands. If someone tried to hardcode an AI to execute such a command, and they succeeded in creating something that had a real world impact, I predict this represents a failure to implement the command (it would result in an AI that does something other than the squish djinn and something other than what the builders expect it to do). So the squish djinn is not a realistic outcome. But it is what would happen if they succeeded, and thus the squish djinn analogy is a valid argument against “command hard coding” projects. I can’t predict what such an AI would actually do since that depends on how the project failed. Intuitively the situation where confused researchers fail to build a squish djinn does not feel very optimal, but making an argument on this basis is more vague, and requires that the proposing researchers accept their own limited technical ability (saying “doing x is clearly technically possible, but you are not clever enough to succeed” to the typical enthusiastic project proposer (that considers themselves to be clever enough to maybe be the first in the world to create a real AI) might not be the most likely argument to succeed (here I assume that the intent is to be understood, and not to lay the groundwork for later smugly saying “I pointed that out a long time ago” (if one later wants to be smug, then one should optimize for being loud, taking clear and strong positions, and not being understood))). The squish djinn analogy is simply a simpler argument.
“Either you fail or you get a squish djinn” is true and simple and sufficient to argue against a project. When presenting this argument, you do spend most of the time arguing about what would happen in a situation that will never actually happen (project success). This might sound very strange to an outside observer, but the strangeness is introduced by the project proposer’s (invalid) assumption that the project can succeed (analogous to some atheist saying: “if god exists, and is omnipotent, then he is not nice, cuz there is suffering”).
(I’m arrogantly/wisely staying neutral on the question of whether or not it is at all useful to in any way engage with the sort of people whose project proposals can be validly argued against using squish djinn analogies)
(jokes often work by deliberately being understood in different ways at different times by the same listener (the end of the joke deliberately changes the interpretation of the beginning of the joke (in a way that makes fun of someone)). In this case the meaning of the beginning of the joke is not one thing or the other thing. The listener is not first failing to understand what was said and then, after hearing the end, succeeding in understanding it. The speaker is intending the listener to understand the first meaning until reaching the end, so the listener is not “first failing to encode the transmission”. There is no inherently true meaning of the beginning of the joke, no inherently true person that this speaker is actually truly referring to. Just a speaker that intends to achieve certain effects on an audience by saying things (and if the speaker is successful, then at the beginning of the joke the listener infers a different meaning from what it infers after hearing the end of the joke). One way to illuminate the concepts discussed above would be to write: “on a somewhat related note, I once considered creating the username “New_Willsome” and to start posting things that sounded like you (for the purpose of demonstrating that if you counter a ban by using sock puppets, you lose your ability to stop people from speaking in your name (I was considering the options of actually acting like I think you would have acted, and the option of including subtle distortions to what I think you would have said, and the option of doing my best to give better explanations of the concepts that you talk about)). But then a bunch of usernames similar to yours showed up and were met with hostility, and I was in a hurry, and drunk, and bat shit crazy, and God told me not to do it, and I was busy writing fanfic, so I decided not to do it (the last sentence is jokingly false. I was not actually in a hurry … :) … )”)