Maybe if the pre-specified distribution is a reasonably well-calibrated predictor of the AI (given that distribution)? Like, maybe this is a way that an Oracle AI could help ensure the safety of a somewhat weaker Tool AI.