You write that Yudkowsky’s box problem is a strawman and a distraction. How do you arrive at this conclusion exactly?
Since I don’t think we can make a very realistic sandbox (at least not in the near future), perhaps the idea is to have an AI design that is known to work similarly with and without interaction with the world (looking at training data sampled from an environment versus the environment itself). Then, putatively, we could test the AI in the non-interactive case before getting anywhere near an AI-box scenario.
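To make the "non-interactive test" idea concrete, here is a minimal sketch under assumed, illustrative names (nothing here comes from the interview): log a run in which a toy agent actually drives a toy environment, then replay the same observations to it non-interactively and check that its behaviour matches. A design that works similarly with and without interaction with the world should pass this check; a mismatch would mean the interactive case cannot be predicted from the non-interactive one.

```python
import random

def make_env(seed=0):
    """Toy environment: an integer state nudged by the agent's actions plus noise."""
    rng = random.Random(seed)
    state = [0]
    def step(action):
        state[0] += (1 if action == "up" else -1) + rng.choice([-1, 0, 1])
        return state[0]
    return step

def policy(observation):
    """Stand-in for the AI under test, assumed here to be a pure function of its input."""
    return "up" if observation < 5 else "down"

def interactive_run(step_fn, n_steps):
    """Interactive case: the agent's actions feed back into the environment."""
    obs, observations, actions = 0, [], []
    for _ in range(n_steps):
        observations.append(obs)
        action = policy(obs)
        actions.append(action)
        obs = step_fn(action)
    return observations, actions

def replay_run(observations):
    """Non-interactive case: the agent only sees pre-sampled data from the logged run."""
    return [policy(obs) for obs in observations]

if __name__ == "__main__":
    observations, live_actions = interactive_run(make_env(seed=42), n_steps=20)
    replayed_actions = replay_run(observations)
    print("behaviour matches without interaction:", live_actions == replayed_actions)
```

For a policy that is a pure function of its observations the check is trivially satisfied; the interesting (and much harder) case is an AI whose internal state or incentives differ when it knows its outputs affect the world, which is exactly what such a test would be trying to detect before any AI-box scenario arises.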