Interesting proposal!

Is there a reason you have it switch to the human net just once in the middle?
I would worry that the predictor might switch ontologies as time goes on. Perhaps, to make the best use of the human compute time, it reasons in a human ontology up until n/2. Once the threat of translation is past, it might switch to its own ontology from n/2 to n. If so, the encoder that works up to n/2 might be useless thereafter. A natural alternative would be to have it switch back and forth some random number of times at random intervals.
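To make that alternative concrete, here is a minimal sketch in Python of drawing such a schedule: a random number of switch points at random positions over the n steps, with the run alternating nets at each one. All names here (human_step, predictor_step, n, max_switches) are hypothetical, not anything defined in the post.

```python
import random

def random_handoff_schedule(n, max_switches=4):
    """Draw a random number of switch points at random positions in [1, n)."""
    num_switches = random.randint(0, max_switches)
    return sorted(random.sample(range(1, n), num_switches))

def run_alternating(state, n, schedule, human_step, predictor_step):
    """Run n steps, flipping which net computes at each switch point."""
    switch_points = set(schedule)
    use_human = True  # assume the run starts in the human ontology
    for t in range(n):
        if t in switch_points:
            use_human = not use_human
        state = human_step(state) if use_human else predictor_step(state)
    return state
```

The idea being that, since the predictor cannot tell in advance which intervals the human net will run, an encoder that only stays valid for the first half of the computation should get penalized during training.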
Later in the post, I proposed a similar modification:
I think we should modify the simplified hand-off procedure I described above so that, during training:
A range of handoff thresholds (proportions p) are drawn—in particular, there should be a reasonable probability of drawing p values close to 0, close to 1, and also 0 and 1 exactly.
The human net runs for pn steps before calling the reporter.
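Here is a minimal sketch of what that sampling could look like, again with hypothetical names (human_step, predictor_step, reporter, n). The mixture below, with point masses at exactly 0 and 1 plus a uniform component that puts reasonable probability near both endpoints, is just one distribution satisfying the stated requirement:

```python
import random

def sample_handoff_proportion(endpoint_mass=0.2):
    """Sample p in [0, 1]: point masses at exactly 0 and 1, else uniform.

    A uniform draw already lands near 0 or near 1 reasonably often;
    the explicit point masses cover p = 0 and p = 1 exactly.
    """
    u = random.random()
    if u < endpoint_mass / 2:
        return 0.0
    if u < endpoint_mass:
        return 1.0
    return random.random()

def handoff_run(state, n, human_step, predictor_step, reporter):
    """One training run: human net for p*n steps, then the reporter.

    That the predictor's net finishes the remaining (1 - p) * n steps
    after the reporter is called is an assumption here, following the
    hand-off framing in the surrounding discussion.
    """
    p = sample_handoff_proportion()
    handoff_step = int(p * n)
    for _ in range(handoff_step):
        state = human_step(state)
    answer = reporter(state)  # reporter sees the state at the handoff point
    for _ in range(n - handoff_step):
        state = predictor_step(state)
    return state, answer
```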