Look: humans can learn what a ‘tank’ is and can direct their detection efforts at tanks specifically, not at whether the scene is light or dark, nor at any other incidental regularity that might be present in the test materials. We can identify the regularities, compare them with the properties of tanks, and determine that they’re not what we’re looking for.
If we can do it, computers can do it as well. We merely need to figure out how to bring it about; it’s an engineering challenge, nothing more. That isn’t to dismiss or minimize the difficulty of achieving it, but there is no profound ‘philosophical’ challenge involved.
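To make the engineering point concrete, here is a minimal, purely illustrative sketch (in Python, not from the original discussion) of the kind of check involved: train a toy detector on data where ‘tank present’ happens to coincide with brightness, then test it on data where that coincidence is broken. The features, data, and numbers are invented for the example.

```python
# Hypothetical sketch: diagnose whether a toy detector has latched onto a
# spurious regularity (overall brightness) instead of the intended target.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_scenes(n, tank_bright_correlated=True):
    """Generate toy 'scenes': a brightness feature plus a noisier tank-shape feature."""
    has_tank = rng.integers(0, 2, size=n)
    if tank_bright_correlated:
        # Confounded setting: tanks were photographed on bright days.
        brightness = has_tank + rng.normal(0, 0.1, size=n)
    else:
        # Confound broken: brightness no longer tracks the presence of a tank.
        brightness = rng.normal(0.5, 0.5, size=n)
    tank_signal = has_tank + rng.normal(0, 1.0, size=n)  # the genuine but noisy cue
    X = np.column_stack([brightness, tank_signal])
    return X, has_tank

# Train on data where brightness and tanks coincide (the classic failure mode).
X_train, y_train = make_scenes(2000, tank_bright_correlated=True)
clf = LogisticRegression().fit(X_train, y_train)

# Evaluate on scenes where the correlation is broken.
X_test, y_test = make_scenes(2000, tank_bright_correlated=False)
print("accuracy with confound broken:", clf.score(X_test, y_test))
```

If accuracy collapses once the confound is removed, the detector was keying on brightness, exactly the sort of regularity a human would recognize as irrelevant to tanks and deliberately discard.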
The problem with making powerful AIs that attempt to make the universe ‘right’ is that most of us have no explicit idea of what we mean by ‘right’: we either find it difficult to articulate our intuitive understanding or have no interest in doing so.
We can’t solve this problem by linguistically redefining it away. There is no quick and easy solution, no magic method that will get us out of doing the hard work. The best way around the problem is to go straight through—there are no substitutions.
Eliezer’s last post may have touched upon a partial resolution, in that his statements about what he wants ‘rightness’ to be may implicitly refer to a guiding principle that would actually constrain a rational AI. I may try to highlight that point once I figure out how to explain it properly.