Yeah, I agree with this description: input space counterfactuals are a strict subset of predictor-state counterfactuals.
In particular, I would be interested to hear if restricting to input space counterfactuals is clearly insufficient for a known reason. It appears to me that you can still pull the trick you describe in “the proposal” sub-section (constructing counterfactuals which change some property in a way that a human simulator would not pick up on) at least in some cases.
Ah, ok, I see what you’re saying now. I don’t see any reason why restricting to input space counterfactuals wouldn’t work, beyond the issues described with predictor-state counterfactuals. There might be a performance hit from needing to make larger changes. In the worst case, a larger minimum change size might make it harder to specify the direct reporter.