One slight worry here: if a demon has to make a prediction, then it has an incentive to manipulate events to ensure its prediction comes true. For example, suppose a demon is asked for the probability of a nuclear war in the next decade, and answers are graded by the log scoring rule. It finds a way out of the box, outputs 99.9%, then sets about ensuring its ‘prediction’ comes true (once it’s out of the box we can’t reliably destroy it).
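To make the incentive concrete, here is a rough sketch of the expected log score in the nuclear war example. The numbers are assumptions for illustration only: `baseline` is a made-up chance of war if the demon does nothing, and `forced` is a made-up chance if it has escaped and is steering events. The function name `expected_log_score` is likewise just a label for this sketch.

```python
import math

def expected_log_score(reported: float, true_prob: float) -> float:
    """Expected log score for reporting `reported` when the event
    actually happens with probability `true_prob`."""
    return (true_prob * math.log(reported)
            + (1 - true_prob) * math.log(1 - reported))

# Hypothetical numbers, not taken from the original discussion.
baseline = 0.01   # assumed chance of war if the demon leaves the world alone
forced = 0.999    # assumed chance of war if the demon manipulates events

# Honest report about an unmanipulated world.
honest = expected_log_score(baseline, baseline)

# Report 99.9% and then make that report accurate.
manipulated = expected_log_score(forced, forced)

# Lying without manipulating is penalised, as a proper scoring rule should do.
lying = expected_log_score(forced, baseline)

print(f"honest:      {honest:.4f}")
print(f"manipulated: {manipulated:.4f}")
print(f"lying only:  {lying:.4f}")
```

Under these assumed numbers the honest score is roughly -0.056 and the manipulated one roughly -0.008, so manipulation wins. The log rule only rewards honesty about a fixed world; the best achievable expected score is the negative entropy of the outcome, which rises as the outcome becomes more certain, so a demon that can influence events is pushed towards making them certain in whichever direction is easiest.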
Another problem is that, the way it currently works, it seems like all you have are the demons and a great big database. That means each demon will need at least a few days to self-improve on its own before it can do any good. This allows more opportunities for shenanigans such as those above, as well as attempts to expand itself, or to stall for time before giving its answer in order to maximise its probability of being correct.