If I remember right, this has already been considered, and the argument against it is that any AI powerful enough to be interesting will also have a chance of correctly guessing that it's in a box, for more or less the same reason that you or I can come up with the Simulation Hypothesis.
[Edit: take that with a grain of salt; I read the discussions about it after the fact, and I may be misremembering.]
Well, yes, it will probably come up with the hypothesis, but it has no evidence for it, and even if it did, it doesn't have enough information about how we work to be able to manipulate us.
Well, actually, I think it could. Given that we want the AI to function as a problem solver for the real world, it would necessarily have to learn about aspects of the real world, including human behavior, in order to produce solutions that account for all the features of the real world that would throw off a less detailed model.
A comment above had an interesting idea: put it in Conway's Game of Life, a simple universe that gives it absolutely no information about what the real world is like. Even knowing it's in a box, the AI would have nothing to go on if it tried to escape.
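For concreteness, the entire rule set of that universe is tiny and mentions nothing about our physics, chemistry, or psychology. Here is a minimal Python sketch of one update step (representing the board as a set of live-cell coordinates is just one convenient encoding, not anything specific to the proposal above):

```python
from collections import Counter

def life_step(live_cells):
    """Advance Conway's Game of Life by one tick.

    live_cells: a set of (x, y) coordinates of live cells.
    Returns the set of live cells on the next tick.
    """
    # Count live neighbours for every cell adjacent to a live cell.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 live neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }

# A "blinker" oscillates between a horizontal and a vertical line.
blinker = {(0, 0), (1, 0), (2, 0)}
print(sorted(life_step(blinker)))  # [(1, -1), (1, 0), (1, 1)]
```

The point of the example is that an agent inside this universe could study these rules indefinitely and still learn nothing about humans.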
What use is such an AI? You couldn't even use the behavior of its utility function to predict a real-world agent, because it would have such a different ontology. Not to mention that GoL boards of the complexity needed for anything interesting would be computationally intractable.
I would have assumed that we would let it learn about the real world, but I guess it’s correct that if enough information about the real world is hardcoded, my idea wouldn’t work.
… which means my idea is an argument for minimizing how much is hardcoded into the AI, assuming the rest of the idea works.
Hardcoding has nothing to do with it.