Yes and no. As for superintelligence, there's a game of motte-and-bailey between the explicit definition (superintelligence is a reasonable notion because it just means « at least as good as a team of the best humans, plus speed ») and LW's actual usage of the term in practice (superintelligence as something that can solve all physical problems in negligible time).
My points of disagreement with LWers on AI existential risk are mostly invariant to how capable AIs and superintelligences turn out to be in reality, though how they acquire those capabilities can matter, so I'm trying to avoid resting my case on capability limitations when arguing about AI extinction/existential risk.
For orthogonality, LW's use of the term in practice is « an intelligence [would likely] jump to arbitrary values, whatever values it started from ». Am I right that this is, in disguise, your own reason for saying we shouldn't update based on OT?
Not really. The issue is that accepting the orthogonality thesis is still compatible with a wide range of observations, and in particular with a view that treats the AI safety problem as mostly a non-problem in practice, à la Yann LeCun: even if it's possible to get an AI that values inhuman goals while still being very intelligent, we could optimize it fairly easily so that in practice we never have to deal with rogue values in very smart systems. In essence, the thesis isn't narrow enough, which is why we shouldn't update much on it without other assumptions.
In essence, it only claims that this is a possible outcome, but under that standard logical omniscience is possible too, and even infinite computation is possible, yet we correctly don't devote many resources to either. It makes no claim about likelihood, and that's worth keeping very clearly in mind.
Could you say more? I happen to count adversarial examples as a (weak) argument against OT, because they aren't random but look like an objective property of the dataset. What's your own reasoning here?
I'm willing to concede this point, but from my perspective the Orthogonality thesis was talking about all possible intelligences, and I suspected it was very difficult to ensure that an AI's values couldn't be, say, paper-clip maximization.
Keep in mind that the Orthogonality thesis is a really weak claim in terms of evidence, at least as I interpreted it, so it's not very surprising that it's probably true. That also means it isn't enough to change our priors. That's the problem I have with the orthogonality thesis and the instrumental convergence assumptions: they don't provide enough evidence to justify AI risk from a skeptical prior, even assuming they're true.
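To make the evidential point concrete, here's a minimal Bayesian sketch (the framing and numbers are my own illustrative assumptions, not anyone's actual estimates). In odds form, the posterior on AI risk after observing that the orthogonality thesis (OT) holds is

$$
\frac{P(\text{risk} \mid \text{OT})}{P(\neg\text{risk} \mid \text{OT})}
= \underbrace{\frac{P(\text{OT} \mid \text{risk})}{P(\text{OT} \mid \neg\text{risk})}}_{\text{likelihood ratio}}
\times
\frac{P(\text{risk})}{P(\neg\text{risk})}.
$$

A bare possibility claim like OT is roughly as expected whether or not AI risk is high, so the likelihood ratio sits near 1; starting from a skeptical prior of, say, 1:99 odds, the posterior stays essentially at 1:99. That's what I mean by « not enough evidence to change our priors ».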