You cannot prohibit the expected paperclip maximizer from existing unless you can prohibit superintelligences from accurately calculating which actions lead to how many paperclips, and efficiently searching out plans that would in fact lead to great numbers of paperclips. If you can calculate that, you can hook up that calculation to a motor output and there you go.
Pearce can prohibit paperclippers from existing by prohibiting superintelligences with narrow interests from existing. He doesn’t have to argue that the clipper would not be able to instrumentally reason out how to make paperclips; Pearce can argue that to be a really good instrumental reasoner, an entity needs to have a very broad understanding, and that an entity with a broad understanding would not retain narrow interests.
To slightly expand, if an intelligence is not prohibited from the following epistemic feats:
1) Be good at predicting which hypothetical actions would lead to how many paperclips, as a question of pure fact.
2) Be good at searching out possible plans which would lead to unusually high numbers of paperclips—answering the purely epistemic search question, “What sort of plan would lead to many paperclips existing, if someone followed it?”
3) Be good at predicting and searching out which possible minds would, if constructed, be good at (1), (2), and (3) as purely epistemic feats.
Then we can hook up this epistemic capability to a motor output and away it goes. You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats. They must be unable to know the answers to these questions of fact.
A nice rephrasing of the “no Oracle” argument.
Only in the sense that any working Oracle can be trivially transformed into a Genie. The argument doesn’t say that it’s difficult to construct a non-Genie Oracle and use it as an Oracle if that’s what you want; the difficulty there is for other reasons.
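To make the shortness of that gap concrete, here is a minimal sketch of the Oracle-to-Genie wrapper. Every name in it (expected_paperclips, propose_plans, motor_output) is a hypothetical stand-in, not anything from the discussion or any real system:

```python
# Minimal sketch: hooking a purely epistemic planner up to a motor output.
# All function names are hypothetical stand-ins; nothing here is anyone's
# actual design.

def expected_paperclips(plan) -> float:
    """Epistemic feat (1): predict how many paperclips this plan leads to."""
    raise NotImplementedError  # the genuinely hard, superintelligent part

def propose_plans(n: int):
    """Epistemic feat (2): search out candidate high-paperclip plans."""
    raise NotImplementedError  # also the hard part

def motor_output(plan) -> None:
    """The trivial step: actually carry out a plan in the world."""
    raise NotImplementedError

def oracle_to_genie(num_candidates: int = 1000) -> None:
    """The whole Oracle-to-Genie wrapper: ask, pick the best answer, act."""
    candidates = propose_plans(num_candidates)
    best = max(candidates, key=expected_paperclips)
    motor_output(best)  # "hook up that calculation to a motor output"
```

All of the difficulty lives in the two epistemic functions; the wrapper that turns their answers into action is a few lines of glue.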
Nick Bostrom takes Oracles seriously, so I dust off the concept every year and take another look at it. It’s been looking slightly more solvable lately; I’m not sure it would be solvable enough even assuming the trend continued.
A clarification: my point was that denying orthogonality requires denying the possibility of Oracles being constructed; your post seemed a rephrasing of that general idea (that once you have a machine that can solve some things abstractly, you just need to connect that abstract ability to some implementation module).
Ah. K. It does seem to me like “you can construct it as an Oracle and then turn it into an arbitrary Genie” sounds weaker than “denying the Orthogonality thesis means superintelligences cannot know 1, 2, and 3.” The sort of person who denies OT is liable to deny Oracle construction because the Oracle itself would be converted unto the true morality, but find it much more counterintuitive that an SI could not know something. Also we want to focus on the general shortness of the gap from epistemic knowledge to a working agent.
Possibly. I think your argument needs to be developed a bit to show that one can extract the knowledge usefully, which is not a trivial statement for a general AI. So your argument is better in the end, but needs more argument to establish it.
You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats.
I don’t see the significance of “purely epistemic”. I have argued that epistemic rationality could be capable of affecting values, breaking the orthogonality between values and rationality. I could further argue that instrumental rationality bleeds into epistemic rationality. An agent can’t have perfect a priori knowledge of which things are going to be instrumentally useful to it, so it has to start by understanding things, and then pose the question: is that thing useful for my purposes? Epistemic rationality comes first, in a sense. A good instrumental rationalist has to be a good epistemic rationalist.
What the Orthogonality Thesis needs is an argument to the effect that a superintelligence would be able to endlessly update without ever changing its value system, even accidentally. That is tricky, since it effectively means predicting what a smarter version of itself would do. Making it smarter doesn’t help, because it is still faced with the problem of predicting what an even smarter version of itself would do; the carrot remains in front of the donkey.
Assuming that the value stability problem has been solved in general gives you a coherent Clippy, but it doesn’t rescue the Orthogonality Thesis as a claim about rationality in general, since it remains the case that most agents won’t have firewalled values. If you have to engineer something in, it isn’t an intrinsic truth.