To be clear, I am making the claim that, of the people who have made useful advances on Oracle AI safety research (Armstrong counts here; I don’t think Yampolskiy does), all of them believe that the goal of having a safe Oracle AI is to achieve a decisive strategic advantage quickly and get to an aligned future. I recognize that this is a hard claim to evaluate (e.g. because this isn’t a statement one could put in a Serious Academic Journal Article in the 2010s; it would have to be discussed on their blog or in private correspondence), but if anyone has a clear counterexample, I’d be interested in seeing it.
My only evidence for this being a neglected consideration was what I wrote above: that the only place where I recall having seen this discussed in any detail is in my own papers. (I do believe that Eliezer has briefly mentioned something similar too, but even he has mostly just used the “well you can’t contain a superintelligence” line in response to Oracle AI arguments in general.)
You’re certainly in a position to know the actual thoughts of researchers working on this better than I do, and the thing about confinement being insufficient on its own is rather obvious if you think about it at all. So if you say that “everyone worth mentioning already thinks this”, then that sounds plausible to me and I don’t see a point in trying to go look for counterexamples. But in that case I feel even more frustrated that the “obvious” thing hasn’t really filtered into public discussion, and that e.g. popular takes on the subject still seem to treat the “can’t box a superintelligence” thing as the main argument against OAI, when you could instead give arguments that were much more compelling.
That’s a legit thing to be frustrated by, but I think you know the reason why AI safety researchers don’t want “we don’t see a way to get to a good outcome except for an aligned project to grab a decisive strategic advantage” to filter into public discourse: it pattern-matches too well to “trust us, you need to let us run the universe”.