You cannot defeat the Orthogonality Thesis without prohibiting superintelligences from accomplishing 1-3 as purely epistemic feats.
I don’t see the significance of “purely epistemic”. I have argued that epistemic rationality could be capable of affecting values, breaking the orthogonality between values and rationality. I could further argue that instrumental rationality bleeds into epistemic rationality. An agent can’t have perfect a priori knowledge of which things are going to be instrumentally useful to it, so it has to start by understanding things, and then pose the question: is that thing useful for my purposes? Epistemic rationality comes first, in a sense. A good instrumental rationalist has to be a good epistemic rationalist.
What the Orthogonality Thesis needs is an argument to the effect that a superintelligence would be able to endlessly update without ever changing its value system, even accidentally. That is tricky, since it effectively means predicting what a smarter version of itself would do. Making it smarter doesn’t help, because it is still faced with the problem of predicting what an even smarter version of itself would do: the carrot remains in front of the donkey.
Assuming that the value stability problem has been solved in general gives you a coherent Clippy, but it doesn’t rescue the Orthogonality Thesis as a claim about rationality in general, since it remains the case that most agents won’t have firewalled values. If you have to engineer something in, it isn’t an intrinsic truth.