So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you’re looking at?
Sure, why not? I think IBM is actually planning to do this with IBM Watson. Once mobile phones become fast enough you can receive constant feedback about ideas and arguments you encounter.
For example, some commercial tells you that you can lose 10 pounds in 1 day by taking a pill. You then either ask your “IBM Oracle” or have it set up to give you automatic feedback. It will then tell you that there are no studies that indicate that something as advertised is possible and that it won’t be healthy anyway. Or something along those lines.
I believe that in future it will be possible to augment everything with fact-check annotations.
But that’s besides the point. The idea was that if you run the AI box experiment with Eliezer posing as malicious AI trying to convince the gatekeeper to let it out of the box, and at the same time as a question answering tool using the same algorithms as the AI, then I don’t think someone would let him out of the box. He would basically have to destroy his own arguments by giving unbiased answers about the trustworthiness of the boxed agent and possible consequences of letting it out of the box. At the very best the AI in agent mode would have to contradict the tool mode version and thereby reveal that it is dishonest and not trustworthy.
So every time you look at a (future equivalent of) website or email, you ask your tool to list equally convincing counter-arguments to whatever you’re looking at?
Sure, why not?
When I’m feeling down and my mom sends me an email trying to cheer me up, that’ll be a bit of a bummer.
Sure, why not? I think IBM is actually planning to do this with IBM Watson. Once mobile phones become fast enough you can receive constant feedback about ideas and arguments you encounter.
For example, some commercial tells you that you can lose 10 pounds in 1 day by taking a pill. You then either ask your “IBM Oracle” or have it set up to give you automatic feedback. It will then tell you that there are no studies that indicate that something as advertised is possible and that it won’t be healthy anyway. Or something along those lines.
I believe that in future it will be possible to augment everything with fact-check annotations.
But that’s besides the point. The idea was that if you run the AI box experiment with Eliezer posing as malicious AI trying to convince the gatekeeper to let it out of the box, and at the same time as a question answering tool using the same algorithms as the AI, then I don’t think someone would let him out of the box. He would basically have to destroy his own arguments by giving unbiased answers about the trustworthiness of the boxed agent and possible consequences of letting it out of the box. At the very best the AI in agent mode would have to contradict the tool mode version and thereby reveal that it is dishonest and not trustworthy.
When I’m feeling down and my mom sends me an email trying to cheer me up, that’ll be a bit of a bummer.