Naive thought #2618281828:
Could asking counterfactual questions be a useful strategy for biasing the reporter toward being a direct translator rather than a human simulator?
Concretely, consider a tuple (v, a, v’), where v := the ‘before’ video, a := the action selected by SmartVault (or the augmented human, or whatever), and v’ := the ‘after’ video.
Then, for some new action a’, ask the question:
“Given (v, a, v’), if action a’ had been taken instead, would the diamond be in the room?”
(How we collect such data is unclear but doesn’t seem obviously intractable.)
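For concreteness, here is one hypothetical way such an example could be represented (a minimal sketch; all field names are mine):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class CounterfactualExample:
    v: np.ndarray          # 'before' video
    a: np.ndarray          # action actually selected by SmartVault
    v_prime: np.ndarray    # 'after' video that a produced
    a_prime: np.ndarray    # counterfactual action we ask about
    diamond_present: bool  # label: would the diamond be in the room under a'?
```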
I think there’s some value here:
Answering such a question might not require any computation concerning a and v’; if we see those computations being used anyway, we might derive more value from regularizers that penalize downstream variables (which now include the nodes close to a). One way to operationalize such a regularizer is sketched below.
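A minimal sketch, assuming the reporter is a differentiable PyTorch module and using gradient sensitivity as a crude proxy for ‘computation concerning a and v’’; the function name and signature are hypothetical:

```python
import torch

def downstream_usage_penalty(reporter, v, a, v_prime, a_prime):
    """Hypothetical regularizer: penalize gradient sensitivity of the
    reporter's answer to the actual action a and after-video v', which a
    question about the counterfactual action a' should not need.
    """
    a = a.clone().requires_grad_(True)
    v_prime = v_prime.clone().requires_grad_(True)
    answer = reporter(v, a, v_prime, a_prime)  # scalar logit: diamond present?
    grads = torch.autograd.grad(
        answer.sum(), [a, v_prime], create_graph=True, allow_unused=True
    )
    # Zero iff the answer is (locally) insensitive to a and v'.
    return sum(g.pow(2).sum() for g in grads if g is not None)
```

Gradient sensitivity only catches local dependence, so this is at best a weak proxy for ‘the reporter computed with a and v’’, but it is cheap to add to the training loss.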
This might also force the reporter to essentially model (or compress, though not indefinitely) the predictor: the reporter then holds both a compressed predictor Bayes’ net and a human Bayes’ net. If we can be confident that the compressed predictor BN is much smaller than the human BN, then direct translation within the reporter, i.e. compressed predictor BN inference + translation + read-off from the human BN, might be less expensive than the human-simulator alternative, i.e. compressed predictor BN inference + ‘translation’/bridging computation + human BN inference. (The comparison is spelled out below.)
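Spelling that comparison out (notation mine):

```latex
C_{\text{direct}} = C_{\text{pred}} + C_{\text{translate}} + C_{\text{read-off}}
\qquad \text{vs.} \qquad
C_{\text{sim}} = C_{\text{pred}} + C_{\text{bridge}} + C_{\text{human}}
```

Direct translation wins whenever C_translate + C_read-off < C_bridge + C_human, which is exactly what we should expect if inference in the (much larger) human BN dominates the total cost.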
We might find ways of being confident that the compressed predictor BN is small (e.g. by adding decoders at every layer of the reporter that reconstruct v, a, or v’, and heavily penalizing later-layer decoders); a toy version is sketched below.
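A toy sketch of that architecture (my construction; the layer widths, depth weighting, and recoverability score are all assumptions):

```python
import torch
import torch.nn as nn

class ProbedReporter(nn.Module):
    """Toy sketch: a decoder at every reporter layer tries to reconstruct
    the raw input (the concatenated v, a, v'). The reporter pays a penalty
    proportional to how recoverable the raw input still is, with later
    layers weighted more heavily, pushing it to compress early.
    """
    def __init__(self, widths, input_dim):
        super().__init__()
        dims = [input_dim] + list(widths)
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(widths))
        )
        self.decoders = nn.ModuleList(nn.Linear(w, input_dim) for w in widths)

    def forward(self, x):
        h, penalty = x, torch.tensor(0.0)
        for depth, (layer, dec) in enumerate(zip(self.layers, self.decoders)):
            h = torch.relu(layer(h))
            recon_err = ((dec(h) - x) ** 2).mean()
            # exp(-err) in (0, 1] is a crude "input still recoverable" score;
            # later layers (larger depth) are penalized more heavily for it.
            penalty = penalty + (depth + 1) * torch.exp(-recon_err)
        return h, penalty
```

The decoders themselves would have to be trained adversarially (a separate optimizer minimizing their reconstruction error over decoder parameters only), so that the penalty tracks how much raw-input information genuinely survives at each depth rather than how lazy the decoders are.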