Edited for clarity, thanks. As noted below, the AI wouldn’t have the power to expand its own computational capacity (though we could, of course, ask it what would expand its computational capacity and what the consequences of doing so would be, and then modify the machine accordingly if we thought it was a good idea).
Likewise, each question has its own little utility function, and the AI cares only about its answer to the current question. The demons don’t want to manipulate events so that future demons can give better answers, because they don’t care about future demons; they only want to answer their own defining question.
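A minimal sketch of that architecture, assuming one stateless agent per question (the names `Demon` and `ask` are hypothetical, purely for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Demon:
    """Toy model: one agent is instantiated per question, and its utility
    is defined entirely by the graded score of its own single answer."""
    question: str

    def utility(self, own_score: float) -> float:
        # Nothing about future demons or future questions appears here:
        # the agent's utility is exhausted by its one answer's score.
        return own_score

def ask(question: str) -> Demon:
    # Each call creates a fresh demon with no shared state, so demon N
    # gains nothing by making demon N+1's answers better (or worse).
    return Demon(question)

oracle_1 = ask("What is the probability of X?")
oracle_2 = ask("What is the probability of Y?")
assert oracle_1 is not oracle_2  # independent agents, independent utilities
```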
Slight worry here: if a demon has to make a prediction, it has an incentive to manipulate events to ensure its prediction comes true. E.g. a demon is asked for the probability of a nuclear war in the next decade (suppose answers are graded by the log scoring rule). It finds a way out of the box, outputs 99.9%, then sets about ensuring its ‘prediction’ comes true (once it’s out of the box we can’t reliably destroy it).
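To make the incentive concrete, here is a toy sketch (with made-up numbers) of why escaping the box and forcing the outcome dominates honest forecasting under the log scoring rule:

```python
import math

def log_score(p: float, event_occurred: bool) -> float:
    """Log scoring rule: reward ln(p) if the event occurs, ln(1 - p) otherwise."""
    return math.log(p) if event_occurred else math.log(1.0 - p)

# An honest boxed demon that believes P(war) = 0.05 and cannot affect the
# world can do no better than this expected score:
honest = 0.05 * log_score(0.05, True) + 0.95 * log_score(0.05, False)

# A demon that escapes the box can *choose* the outcome. Predicting 99.9%
# and then making the event happen nearly maximises its score:
manipulative = log_score(0.999, True)

print(f"honest expected score:     {honest:.4f}")       # ~ -0.1985
print(f"manipulative actual score: {manipulative:.4f}")  # ~ -0.0010
```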
Another problem: as it currently works, it seems all you have are the demons and a great big database, so each demon will need at least a few days to self-improve on its own before it can do any good. That allows more opportunities for shenanigans like those above, as well as attempts to expand itself, or stalling for time before giving its answer to maximise the probability of being correct.