I think discussions about capabilities raise the question “why create AI that is highly capable at deception etc.? It seems like it would be safer not to.”
The problem here is that some ways of creating capabilities are quite open-ended and risk accidentally producing capabilities for deception via instrumental convergence. But at that point it feels like we are getting into territory that is better thought of as “intelligence” rather than “capabilities”.
Nevertheless, I still think we should go with “capabilities” instead of “intelligence.” If someone says to me “why create AI that is highly capable at deception etc.?” I plan to say basically “Good question! Are you aware that multiple tech companies are explicitly trying to create AI that is highly capable at EVERYTHING, a.k.a. AGI, or even superintelligence, and that they have exceeded almost everyone else’s expectations in the past few years and seem to be getting close to succeeding?”
One thing that I think is also worth stressing:
- the companies are trying to do it
- they think they are close to succeeding
It is not clear whether they really believe it or merely say so, but the effort they appear to be putting in suggests genuine confidence; and while they may be wrong, they are also the ones best positioned to know, since they work with these systems directly.
So the real question is not “do you think it is absolutely guaranteed that AGI will be created within the next 10 years?” but rather “do you think it is absolutely impossible that it will be?”. Even a small probability deserves some thought! I get that lots of people are somewhat skeptical of these claims, which makes sense, but you have to at least consider the possibility that they’re right.
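To make that concrete, here is a toy expected-value framing; the symbols and the example probability are mine, invented for illustration, not figures anyone in this discussion has claimed:

```latex
% Toy expected-value framing; p and V are illustrative stand-ins.
% p : your probability that AGI is created within the next 10 years
% V : the magnitude of what is at stake if it is
\[
  \mathbb{E}[\text{impact}] = p \cdot V
\]
% "Absolutely impossible" means p = 0, which makes E[impact] = 0.
% Any p > 0 -- even a skeptical p = 0.05 -- leaves E[impact] large
% when V is as large as "AI highly capable at everything".
```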
I agree that a possible downside of talking about capabilities is that people might assume the capabilities are uncorrelated, so that we can simply choose not to create the dangerous ones. But it seems relatively easy to argue that deception capabilities arise as a side effect of building language models that are useful to humans and good at modeling the world, as we are already seeing with examples of deception and manipulation by Bing etc.
I think the people who believe we can avoid building systems that are good at deception often don’t buy the idea of instrumental convergence either (e.g. Yann LeCun), so I’m not sure that framing correlated capabilities in terms of “intelligence” would have any advantage.
I think that’s exactly what “general capabilities” means, though. If you think about an AI that is good at playing chess, it’s not weird to imagine it learning to use feints to deceive its opponent simply as part of its chess-goodness. The same principle applies more broadly; in fact, I think game analogies might be a very powerful tool when discussing this (see the sketch below)!
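Here is a minimal sketch of the point; the game, the opponent model, and all the payoff numbers are invented for the example. The agent maximizes nothing but win probability, and the feint falls out anyway:

```python
# Toy illustration: an agent that only maximizes win probability
# "discovers" a feint. All payoffs below are invented for the example.

# Hypothetical opponent model: probability the opponent commits
# defenders to the square we appear to threaten.
OPPONENT_FALLS_FOR_THREAT = 0.8

# Win probability for each plan, depending on whether the opponent
# defends the apparent threat (made-up numbers).
PLANS = {
    # plan: (win prob if opponent defends, win prob if opponent ignores)
    "direct_attack": (0.30, 0.60),      # works only if the threat is ignored
    "feint_then_switch": (0.70, 0.40),  # works *because* the opponent reacts
}

def expected_win(plan: str) -> float:
    """Plain expected value -- no notion of honesty or deception here."""
    win_if_defends, win_if_ignores = PLANS[plan]
    return (OPPONENT_FALLS_FOR_THREAT * win_if_defends
            + (1 - OPPONENT_FALLS_FOR_THREAT) * win_if_ignores)

for plan in PLANS:
    print(f"{plan}: expected win prob = {expected_win(plan):.2f}")

# The pure win-maximizer picks the feint: deception falls out of
# goal-directed play against a predictable opponent.
print("chosen plan:", max(PLANS, key=expected_win))
```

Nothing in the objective mentions deception; the “deceptive” plan is selected by ordinary expected-value maximization against a predictable opponent, which is the whole point of the chess analogy.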