Would it be possible to write a safe, recursively self-improving chess-playing AI, for instance?
Would this AI think about chess in the abstract, or would it play chess against real humans? More precisely, would it have a notion of “in situation X, my opponents are more likely to make move M” even if such knowledge cannot be derived from the mere rules of chess? Because if it has some concept of an opponent (even in the sense of some “black box” making the moves), it could start making assumptions about the opponent and testing them. There would be an information channel from the real world to the AI's world. A very narrow channel, but if the AI could use every bit efficiently, then after getting enough bits it could develop a model of the outside world (for the purpose of predicting the opponent’s moves better).
In other words, I imagine an AIXI which can communicate with the world only through the chess board. If there is a way to influence the outside world that leads to more wins at chess, the AI would probably find it. For example, the AI could send a message to the outside (encoded in its choice among possible chess moves) that it is willing to help any humans who will let it win at chess more often. Somebody could make a deal with the AI like this: “If you help me become the king of the world, I promise I will let you win every chess game, forever,” and the AI would use its powers (combined with the powers of that human) to reach this goal.
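To put a rough number on how narrow the outgoing channel is, here is a minimal sketch. It assumes (my simplification, not a claim about any real engine) that in each position the AI has some number of moves it considers equally good and can pick freely among them, so one move carries at most log2 of that number in bits. The function names are made up for illustration.

```python
import math


def bits_per_move(equally_good_moves: int) -> float:
    """Upper bound on the bits one move can carry, assuming the AI chooses
    freely among `equally_good_moves` moves it rates as equally strong."""
    if equally_good_moves < 1:
        raise ValueError("need at least one candidate move")
    return math.log2(equally_good_moves)


def moves_needed(message_bits: int, equally_good_moves: int) -> int:
    """Number of moves needed to smuggle out a message of `message_bits` bits
    under the same assumption."""
    per_move = bits_per_move(equally_good_moves)
    if per_move == 0:
        raise ValueError("a forced move carries no information")
    return math.ceil(message_bits / per_move)


if __name__ == "__main__":
    # Hypothetical numbers: a 100-character message at 8 bits per character,
    # with 4 interchangeable moves available on every turn.
    print(moves_needed(100 * 8, 4))
```

Under these made-up numbers the message takes a few hundred moves, which is only a handful of games. The channel is narrow, but not narrow enough to count as closed.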