It will depend on how much much high-quality data you need to train the reporter. Probably it’s a small fraction of the data you need to train the predictor, and so for generating each reporter datapoint you can afford to use many times more data than the predictor usually uses. I often imagine the helpers having 10-100x more computation time.
It will depend on how much much high-quality data you need to train the reporter. Probably it’s a small fraction of the data you need to train the predictor, and so for generating each reporter datapoint you can afford to use many times more data than the predictor usually uses. I often imagine the helpers having 10-100x more computation time.