Your framing, though, requires acceptance of a consequentialist optimization view of decision making (so that “good enough” is not considered good enough if it’s possible to do better), and of there being a significant difference between the better outcomes and the “default” outcomes.
Yes, agreed. On the other hand it may especially appeal to some AI researchers who seem really taken with the notion of optimality. :)
So while it’s true that this particular argument doesn’t have to hold for the problem to remain serious, it looks like one of the best available arguments for the seriousness of the problem.
But it seems too early to conclude “orthogonality”. So if you say “AI is dangerous because values are orthogonal to optimization power”, that may just invite people to dismiss you.
If it’s not immediately obvious to someone that the default outcome is not likely to be close to optimal, I think maybe we should emphasize the Malthusian scenario first. Or maybe use a weaker version of the orthogonality thesis (and not call it “orthogonality” which sounds like claiming full independence). And also emphasize that there are multiple lines of argument and not get stuck debating a particular one.
It’s harder to make this argument convincing, as it seems to depend on acceptance of some decision-theoretic/epistemic/metaethical background.
I’d be curious to know if that’s actually the case.
Or maybe use a weaker version of the orthogonality thesis (and not call it “orthogonality” which sounds like claiming full independence). And also emphasize that there are multiple lines of argument and not get stuck debating a particular one.
Right, at least mentioning that there is a more abstract argument that doesn’t depend on particular scenarios could be useful (for example, in Luke’s Facing the Singularity).
The robust part of “orthogonality” seems to be the idea that with most approaches to AGI (including neuromorphic or evolved designs, with very few exceptions such as WBE, which I wouldn’t call AGI, just faster humans with more dangerous tools for creating an AGI), it’s improbable that we end up with something close to human values, even if we try; and that greater optimization power in a design doesn’t address this issue, while aggravating the consequences, potentially all the way to a fatal intelligence explosion. I don’t think it’s too early to draw this weaker conclusion (and stronger statements seem mostly irrelevant to the argument).
This version is essentially Eliezer’s “complexity and fragility of values”, right? I suggest we keep calling it that, instead of “orthogonality”, which again sounds like too strong a claim and makes it less likely that people will consider it seriously.
This version is essentially Eliezer’s “complexity and fragility of values”, right?
Basically, but there is a separate point here that greater optimization power doesn’t help with the problem and instead makes it worse. I agree that the word “orthogonality” is somewhat misleading.
David Dalrymple was nice enough to illustrate my concern with “orthogonality” just as we’re talking about it. :)
...which also presented an opportunity to make a consequentialist argument for FAI under the assumption that all AGIs are good.