Now, if the AI is implementing DRL, the uncertainty between Earth and Mu leads it to delegate to the advisor precisely at the moment this difference is important.
It seems like this is giving up on allowing the AI to make long-term predictions. It can make short-term, testable predictions (since if different advisors disagree, it is possible to see who is right). But long-term predictions can’t be cheaply tested.
In the absence of long-term predictions, it still might be possible to do something along the lines of what Paul is thinking of (i.e. predicting human judgments of longer-term things), but I don’t see what else you could do. Does this match your model?
I’m not giving up on long-term predictions in general. It’s just that, because of traps, some uncertainties cannot be resolved by testing, as you say. In those cases the AI has to rely on what it learned from the advisor, which indeed amounts to human judgment.