Another method for dealing with this case:
Prior to the algorithm, pick a “safe set” S of (input, action) pairs which, on some reflection, obviously should not be forbidden. Then punish only those actions a that have a larger set of values r for which F(a, r) is true than, say, the mean over all actions in S (this could be done by setting the evaluation of the null action a_null equal to the evaluation of a random element of S). This means that A will choose actions that are at least as safe as those in S (a choice which might be suboptimally influenced by the size of the advice set R_F, but which would not lead to paralysis). This could compromise safety if S is chosen poorly, or if there are pathological cases where some unsafe action has fewer reasons to think it unsafe than the actions in S (which seems unlikely at first glance).
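As a rough illustration, here is a minimal Python sketch of the thresholding rule, assuming F is available as a boolean predicate over (action, reason) pairs and the advice set R_F as a list of reasons; for brevity the input is folded into the action, and all names here (forbidden_count, safe_threshold, etc.) are hypothetical, not from the original text.

```python
import random

def forbidden_count(action, advice_set, F):
    """Count the reasons r in the advice set R_F for which F(action, r) holds."""
    return sum(1 for r in advice_set if F(action, r))

def safe_threshold(safe_set, advice_set, F):
    """Mean forbidden-count over the hand-picked safe set S."""
    counts = [forbidden_count(a, advice_set, F) for a in safe_set]
    return sum(counts) / len(counts)

def is_punished(action, safe_set, advice_set, F):
    """Punish only actions flagged by strictly more reasons than the S-average,
    so actions at least as safe as those in S are never forbidden."""
    return forbidden_count(action, advice_set, F) > safe_threshold(safe_set, advice_set, F)

def null_action_evaluation(safe_set, advice_set, F):
    """Variant from the text: anchor a_null's evaluation to a random element of S
    rather than to the mean over S."""
    return forbidden_count(random.choice(safe_set), advice_set, F)
```

Note that because the threshold scales with how many reasons in R_F flag the elements of S, growing the advice set shifts the cutoff along with it, which is why the scheme avoids paralysis even when R_F is large.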