I’m afraid you have lost me: when you say “This seems obviously impossible...” I am not clear which aspect strikes you as obviously impossible.
Before you answer that, though: remember that I am describing someone ELSE’S suggestion about how the AI will behave ….. I am not advocating this as a believable scenario! In fact I am describing that other person’s suggestion in such a way that the impossibility is made transparent. So I, too, believe that this hypothetical AI is fraught with contradictions.
The Dopamine Drip scenario is that the AI knows that it has a set of goals designed to achieve a certain set of results, and since it has an extreme level of intelligence it is capable of understanding that very often a “target set of results” can be described, but not enumerated as a closed set. It knows that very often in its behavior it (or someone else) will design some goal code that is supposed to achieve that “target set of results”, but because of the limitations of goal code writing, the goal code can malfunction. The Dopamine Drip scenario is only one example of how a discrepancy can arise—in that case, the “target set of results” is the promotion of human happiness, and then the rest of the scenario follows straightforwardly. Nobody I have talked to so far misunderstands what the DD scenario implies, and how it fits that pattern. So could you clarify how you think it does not?
I’m afraid you have lost me: when you say “This seems obviously impossible...” I am not clear which aspect strikes you as obviously impossible.
AI: Yes, this is in complete contradiction of my programmed goals. Ha ha, I’m gonna do it anyway.
Before you answer that, though: remember that I am describing someone ELSE’S suggestion about how the AI will behave ….. I am not advocating this as a believable scenario! In fact I am describing that other person’s suggestion in such a way that the impossibility is made transparent. So I, too, believe that this hypothetical AI is fraught with contradictions.
Of course, yeah. I’m basically accusing you of failure to steelman/misinterpreting someone; I, for one, have never heard this suggested (beyond the one example I gave, which I don’t think is what you had in mind.)
The Dopamine Drip scenario is that the AI knows that it has a set of goals designed to achieve a certain set of results, and since it has an extreme level of intelligence it is capable of understanding that very often a “target set of results” can be described, but not enumerated as a closed set.
uhuh. So, any AI smart enough to understand it’s creators, right?
It knows that very often in its behavior it (or someone else) will design some goal code that is supposed to achieve that “target set of results”, but because of the limitations of goal code writing, the goal code can malfunction.
waaait I think I know where this is going. Are you saying an AI would somehow want to do what it’s programmers intended rather than what they actually programmed it to do?
The Dopamine Drip scenario is only one example of how a discrepancy can arise—in that case, the “target set of results” is the promotion of human happiness, and then the rest of the scenario follows straightforwardly. Nobody I have talked to so far misunderstands what the DD scenario implies, and how it fits that pattern. So could you clarify how you think it does not?
Yeah, sorry, I can see how programmers might accidentally write code that creates dopamine world and not eutopia. I just don’t see how this is supposed to connect to the idea of an AI spontaneously violating it’s programmed goals. In this case, surely that would look like “hey guys, you know your programming said to maximise happiness? You guys should be more careful, that actually means “drug everybody”. Anyway, I’m off to torture some people.”
I’m afraid you have lost me: when you say “This seems obviously impossible...” I am not clear which aspect strikes you as obviously impossible.
Before you answer that, though: remember that I am describing someone ELSE’S suggestion about how the AI will behave ….. I am not advocating this as a believable scenario! In fact I am describing that other person’s suggestion in such a way that the impossibility is made transparent. So I, too, believe that this hypothetical AI is fraught with contradictions.
The Dopamine Drip scenario is that the AI knows that it has a set of goals designed to achieve a certain set of results, and since it has an extreme level of intelligence it is capable of understanding that very often a “target set of results” can be described, but not enumerated as a closed set. It knows that very often in its behavior it (or someone else) will design some goal code that is supposed to achieve that “target set of results”, but because of the limitations of goal code writing, the goal code can malfunction. The Dopamine Drip scenario is only one example of how a discrepancy can arise—in that case, the “target set of results” is the promotion of human happiness, and then the rest of the scenario follows straightforwardly. Nobody I have talked to so far misunderstands what the DD scenario implies, and how it fits that pattern. So could you clarify how you think it does not?
AI: Yes, this is in complete contradiction of my programmed goals. Ha ha, I’m gonna do it anyway.
Of course, yeah. I’m basically accusing you of failure to steelman/misinterpreting someone; I, for one, have never heard this suggested (beyond the one example I gave, which I don’t think is what you had in mind.)
uhuh. So, any AI smart enough to understand it’s creators, right?
waaait I think I know where this is going. Are you saying an AI would somehow want to do what it’s programmers intended rather than what they actually programmed it to do?
Yeah, sorry, I can see how programmers might accidentally write code that creates dopamine world and not eutopia. I just don’t see how this is supposed to connect to the idea of an AI spontaneously violating it’s programmed goals. In this case, surely that would look like “hey guys, you know your programming said to maximise happiness? You guys should be more careful, that actually means “drug everybody”. Anyway, I’m off to torture some people.”