How would you find out something that a three-year-old is trying to hide from you?
It is complicated, and it very much depends on the kinds of clues that the three-year-old or the environment is giving in the particular situation. Still, adults do that kind of thing all the time, in ways that frankly bewilder three-year-olds, because they’re way smarter, they can draw on more stored knowledge than a three-year-old can imagine, and they see causal connections that three-year-olds don’t. How they deduce the truth will frequently be something the three-year-old could understand after the fact if it were explained patiently, but not something the three-year-old had any way of anticipating.
In the AI box scenario, we’re the three-year-olds. We don’t have any way of knowing whether a deception that would fool the likes of us would fool somebody way smarter.
How would you find out something that a three-year-old is trying to hide from you?
A three-year-old that designed the universe I live in in such a way that it stays hidden as well as possible? I have absolutely no idea. I would probably hope for it to get bored and tell me.
We don’t have any way of knowing whether a deception that would fool the likes of us would fool somebody way smarter.
We do have at least some clues:
We know the mathematics of how optimal belief updating works.
We have some rough estimates of the complexity of the theory the AI must figure out.
We have some rough estimates of the amount of information we have given the AI.
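The first clue, the mathematics of optimal belief updating, is just Bayes' rule. A minimal sketch (all the probabilities below are made-up numbers for illustration, not estimates from the discussion):

```python
def bayes_update(prior: float, p_evidence_given_h: float,
                 p_evidence_given_not_h: float) -> float:
    """Posterior probability of hypothesis H after observing evidence E,
    via Bayes' rule: P(H|E) = P(E|H)P(H) / P(E)."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator

# Hypothetical example: prior 0.5 that the AI has figured out some
# hidden fact, and the observed evidence is twice as likely if it has
# than if it hasn't.
posterior = bayes_update(prior=0.5, p_evidence_given_h=0.8,
                         p_evidence_given_not_h=0.4)
print(round(posterior, 3))  # 0.667
```

The other two clues, an estimate of the theory's complexity and of the information supplied, bound how much any updater, however smart, can conclude from the data it has actually seen.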
I know there is such a thing as underestimating AI, but I think you are severely overestimating it.