Jacob Pfau comments on Abram Demski’s ELK thoughts and proposal—distillation

Jacob Pfau 7 Feb 2024 19:59 UTC
1 point
0
Yeah, I agree with this description: input-space counterfactuals are a strict subset of predictor-state counterfactuals.

In particular, I would be interested to hear if restricting to input-space counterfactuals is clearly insufficient for a known reason. It appears to me that, at least in some cases, you can still pull the trick you describe in “the proposal” sub-section (constructing counterfactuals which change some property in a way that a human simulator would not pick up on).
- Rubi J. Hudson 8 Feb 2024 0:08 UTC
  1 point
  0
  Ah, ok, I see what you’re saying now. I don’t see any reason why restricting to input-space counterfactuals wouldn’t work, beyond the issues already described for predictor-state counterfactuals. There might be a performance hit from needing to make larger changes. In the worst case, a larger minimum change size might make it harder to specify the direct reporter.