This gets good log loss because it’s trained in the regime where the human understands what’s going on, correct?
Yes; for this bad F, the resulting G is very similar to a human simulator.