For that earlier section, we used smaller models trained on intersect (4,000 parameters) rather than the larger models (80,000 parameters); the only reason for this was to allow a larger sample size of 10,000 models within our compute budget. All subsequent sections use the larger models.