I was thinking more of its algorithm admitting an interpretation where it’s asking “Say I make prediction X. How accurate would that be?” and then maximizing accuracy over the relevant possible values of X. Knowledge about its prediction connects the prediction to its origins and consequences; it establishes the prediction as part of the structure of the environment. It’s not necessary (and maybe not possible, and more importantly not useful) for the prediction itself to be inferable before it’s made.
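As a rough sketch of that interpretation (the helper names here, `world_model` and `accuracy`, are stand-ins I'm inventing for illustration, not anything proposed in the discussion):

```python
def choose_prediction(candidate_predictions, world_model, accuracy):
    """Pick the prediction X whose announcement makes X itself most accurate."""
    best_x, best_score = None, float("-inf")
    for x in candidate_predictions:
        # Simulate a world in which the system has announced prediction x,
        # so x is treated as part of the environment being predicted.
        simulated_outcome = world_model(announced_prediction=x)
        score = accuracy(prediction=x, outcome=simulated_outcome)
        if score > best_score:
            best_x, best_score = x, score
    return best_x
```

The point of the sketch is just that X never has to be inferable in advance; it's selected by comparing how well each candidate would do once announced.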
Agreed that just outputting a single number is unlikely to be a big deal (this is an Oracle AI with extremely low bandwidth and a peculiar intended interpretation of its output data), but if we’re getting lots and lots of numbers it’s not as clear.
I’m thinking that type of architecture is less probable, because it would end up being more complicated than alternatives: it would have a powerful predictor as a sub-component of the utility-maximizing system, so an engineer could have just used the predictor in the first place.
But that’s a speculative argument, and I shouldn’t push it too far.
It seems like powerful AI prediction technology, if successful, would gain an important place in society. A prediction machine whose predictions were consumed by a large portion of society would certainly run into situations in which its predictions affect the future it’s trying to predict; there is little doubt about that in my mind. So the question is what its behavior would be in these cases.
One type of solution would do as you say, maximizing a utility over the predictions. The utility could be “correctness of this prediction”, but that would be worse for humanity than a Friendly goal.
Another type of solution would instead report such predictive instability as accurately as possible. This doesn’t really dodge the issue; by doing this, the system is choosing a particular output, which may not lead to the best future. However, that’s markedly less concerning (it seems).
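To make the contrast between the two solution types concrete, here is a toy sketch (again with invented helper names, not a real design): the first searches for the announcement whose resulting future scores best under some utility, while the second just reports, as accurately as it can, how the outcome depends on what gets announced.

```python
def maximize_prediction_utility(candidates, world_model, utility):
    # Solution 1: choose the announcement whose resulting future scores best.
    # With utility = "correctness of this prediction", this reduces to the
    # self-fulfilling-prediction search sketched earlier.
    return max(candidates, key=lambda x: utility(world_model(announced_prediction=x)))

def report_instability(candidates, world_model):
    # Solution 2: report the dependence of the outcome on the announcement,
    # leaving the choice of what to do with that information to the consumers.
    return {x: world_model(announced_prediction=x) for x in candidates}
```

Even the second function still emits a particular output, which is why it doesn't fully dodge the issue; it just declines to optimize over which future that output brings about.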
It would pass the Turing test—e.g. see here.