Fair enough for the alignment comparison; I was just hoping you could correct the quoted paragraph to say “performance on the hold-out data” or something similar.
(The reason to expect more spread would be that training performance can’t detect overfitting, but performance on the hold-out data can. I’m guessing some of the nets trained in Miller et al. did indeed overfit, specifically the ones with lower performance.)