Personally, I feel the question itself is misleading because it anthropomorphizes a non-human system. Asking if an AI is nice is like asking whether the Fundamental Theorem of Algebra is blue. Is Stockfish nice? Is an AK-47 nice? The adjective isn’t the right category for the noun. Except it’s even worse than that because there are many different kinds of AIs. Are birds blue? Some of them are. Some of them aren’t.
I feel like I understand Eliezer’s arguments well enough that I can pass an Ideological Turing Test, but I also feel there are a few loopholes.
I’ve considered throwing my hat into this ring, but the memetic terrain is against nuance. “AI will kill us all” fits into five words. “Half the things you believe about how minds work, including your own, are wrong. Let’s start over from the beginning with how the planet’s major competing optimizers interact. After that, we can go through the fundamentals of behaviorist psychology,” is not a winning thesis in a Hegelian debate (though it can be viable in a Socratic context).
In real life, my conversations usually go like this.
AI doomer: “I believe AI will kill us all. It’s stressing me out. What do you believe?”
Me (as politely as I can): “I operate from a theory of mind so different from yours that the question ‘what do you believe’ is not applicable to this situation.”
AI doomer: “Wut.”
Usually the person loses interest there. For those who don’t, it just turns into an introductory lesson on my own idiosyncratic theory of rationality.
AI doomer: “I never thought about things that way before. I’m not sure I understand you yet, but I feel better about all of this for some reason.”
In practice, I’m finding it more efficient to write stories that teach how competing optimizers, adversarial equilibria, and other things work. This approach is indirect. My hope is that it improves the quality of thinking and discourse.
I may eventually write about this topic if the right person shows up who wants to know my opinion well enough to pass an Ideological Turing Test. Until then, I’ll be trying to become a better writer and YouTuber.
I realize this isn’t your main point here, but I do want to flag that I put ‘nice’ in quotes because I don’t mean the colloquial definition. The question here is ‘would a superintelligent system with control over the solar system spend a billionth or trillionth of its resources helping beings too weak to usefully trade with it, if it didn’t benefit directly from it?’
As I see it the question is agnostic to what sort of mind the AI is.
Noted. The problem remains—it’s just less obvious. This phrasing still conflates “intelligent system” with “optimizer”, a mistake that goes all the way back to Eliezer Yudkowsky’s 2004 paper on Coherent Extrapolated Volition.
For example, consider a computer system that, given a number N, can (usually) produce the shortest computer program that will output N. Such a computer system is undeniably superintelligent, but it’s not a world optimizer at all.
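To make that concrete, here’s a minimal toy sketch of what such a system’s interface looks like. Everything in it is my own invention for illustration (the instruction set, the function names, the length bound); it’s a brute-force searcher over a made-up toy language, not a description of any real system. The length bound stands in for the “(usually)” caveat, since unbounded shortest-program search is uncomputable in general.

```python
from itertools import product

# Toy instruction set (invented for illustration): each program manipulates a
# single accumulator, which starts at 0 and is the program's output.
#   '0'..'9' : acc = acc * 10 + digit
#   'd'      : acc = acc * 2      (double)
#   's'      : acc = acc * acc    (square)
#   'i'      : acc = acc + 1      (increment)
ALPHABET = "0123456789dsi"


def run(program: str) -> int:
    """Interpret a toy program and return the number it outputs."""
    acc = 0
    for op in program:
        if op.isdigit():
            acc = acc * 10 + int(op)
        elif op == "d":
            acc *= 2
        elif op == "s":
            acc *= acc
        elif op == "i":
            acc += 1
    return acc


def shortest_program_for(n: int, max_len: int = 6) -> str | None:
    """Brute-force search, in length order, for a shortest toy program that outputs n.

    Returns None if no program of length <= max_len works; the length bound
    plays the role of the "(usually)" caveat, since unbounded shortest-program
    search is uncomputable.
    """
    for length in range(1, max_len + 1):
        for chars in product(ALPHABET, repeat=length):
            program = "".join(chars)
            if run(program) == n:
                return program
    return None


if __name__ == "__main__":
    # 65536 takes five characters to write out as digits, but "16ss" reaches it
    # in four instructions: 16 -> 256 -> 65536.
    print(shortest_program_for(65536))
```

Scaling this idea up to arbitrary programs and enormous N would take superintelligent-grade search, but the interface stays the same: number in, program out. There is no world model, no preferences over outcomes, and nothing to optimize except the answer to the question it was just asked.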
“Far away, in the Levant, there are yogis who sit on lotus thrones. They do nothing, for which they are revered as gods,” said Socrates.
Strong upvote based on the first sentence. I often wonder why people think an ASI/AGI will want anything that humans do or even see the same things that biological life sees as resources. But it seems like under the covers of many arguments here that is largely assumed true.