This is a good example of two idioms: First, what Bruce Schneier called “fence-post security”. That’s where you build a very tall fencepost in the middle of the desert. People don’t climb the fencepost; they just walk around it.
Second, the idiom that would-be FAI solvers fall into when they see a problem and try to fix it by brute force, which manifests as “hard-wiring X into the very circuitry” or “giving X as its ultimate priority”. X varies, but the idiom is fairly constant: getting an emotional charge, a sense of having delivered a very strong command or created something very powerful, by talking about how strongly the goal is to be enforced.
First, what Bruce Schneier called “fence-post security”. That’s where you build a very tall fencepost in the middle of the desert. People don’t climb the fencepost; they just walk around it.
I don’t see the analogue. Assuming we could get all the physical stuff right, so that the AI had no real hope or desire of affecting its environment substantially aside from changing the letters on a terminal screen, I think this would be a considerable barrier to the destruction of the human race. At the very least it makes sense to have this safeguard in addition to trying to make the AI friendly. It’s not like you can only do one or the other.
Second, the idiom that would-be FAI solvers fall into when they see a problem and try to fix it by brute force, which manifests as “hard-wiring X into the very circuitry” or “giving X as its ultimate priority”. X varies, but the idiom is fairly constant: getting an emotional charge, a sense of having delivered a very strong command or created something very powerful, by talking about how strongly the goal is to be enforced.
Isn’t that what you do with FAI in general by saying that the AI will have implementing CEV as its ultimate priority? Anyway, you are accusing me of having the same aesthetic as people you disagree with without actually attacking my claims, which is a fallacious argument.
Isn’t that what you do with FAI in general by saying that the AI will have implementing CEV as its ultimate priority?
Nope, I don’t go around talking about it being the super top ultimate undefeatable priority. I just say, “Here’s a proposed decision criterion if I can (a) manage to translate it out of English and (b) code it stably.” It’s not the “ultimate priority”. It’s just a proposed what-it-does of the AI. The problem is not getting the AI to listen. The problem is translating something like CEV into nonmagical language.
Anyway, you are accusing me of having the same aesthetic as people you disagree with without actually attacking my claims, which is a fallacious argument.
Okay. Read the Dreams of Friendliness series; it’s not sequenced up yet, but you should be able to start there and click back on the dependency chain.