An interesting way to build on my results here would be to run the same experiment at many different batch sizes and plot the equi-temperature tradeoff curve between batch size and epochs, using the nick in the curve as a marker of known, constant temperature in each graph you get. You’ll probably want to zoom in on the graphs around that nick for more detailed measurements.
It would be interesting if many different training setups shared the same functional form relating batch size and epochs to temperature, but this seems like too nice a hypothesis to be true. It’s still possibly worth trying, and worth classifying the different functional forms you get.
You can use any epoch-wise phase transition for this, though, not just the nick. You could even directly find the function mapping batch size to temperature if you have a good understanding of the situation, as we do in toy models.
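To make the proposal concrete, here is a minimal sketch of how the tradeoff curve could be extracted. Everything in it is an assumption for illustration: the function names are hypothetical, the nick is located as the point of largest absolute second difference in a per-epoch curve (real curves are noisy and will likely need smoothing or a better detector), the data is synthetic, and the power-law fit is only one candidate functional form, not a claim about what you will find.

```python
import numpy as np

def find_nick_epoch(curve):
    """Return the epoch of the sharpest kink in a per-epoch curve.

    Assumption: the nick shows up as the largest absolute second difference.
    """
    curve = np.asarray(curve, dtype=float)
    second_diff = np.diff(curve, n=2)
    # second_diff[i] is centered on curve[i + 1], hence the +1.
    return int(np.argmax(np.abs(second_diff))) + 1

def iso_temperature_curve(curves_by_batch_size):
    """Map {batch_size: curve} to sorted batch sizes and their nick epochs."""
    batch_sizes = np.array(sorted(curves_by_batch_size))
    nick_epochs = np.array(
        [find_nick_epoch(curves_by_batch_size[b]) for b in batch_sizes]
    )
    return batch_sizes, nick_epochs

def fit_power_law(batch_sizes, nick_epochs):
    """Fit nick_epoch ~ c * batch_size**alpha by least squares in log-log space."""
    alpha, log_c = np.polyfit(np.log(batch_sizes), np.log(nick_epochs), 1)
    return float(alpha), float(np.exp(log_c))

def make_synthetic_curve(kink_epoch, n_epochs=200):
    """Made-up curve: slow decline, then a steeper decline after kink_epoch."""
    t = np.arange(n_epochs)
    slow = 1.0 - 0.001 * t
    fast = 1.0 - 0.001 * kink_epoch - 0.05 * (t - kink_epoch)
    return np.where(t < kink_epoch, slow, fast)

# Synthetic stand-in for real runs: the kink is placed at 2 * batch_size.
curves = {b: make_synthetic_curve(2 * b) for b in (8, 16, 32, 64)}
batch_sizes, nick_epochs = iso_temperature_curve(curves)
alpha, c = fit_power_law(batch_sizes, nick_epochs)
```

With real runs you would replace `curves` with the measured per-epoch curves for each batch size, then plot `nick_epochs` against `batch_sizes` and try several functional forms rather than assuming the power law.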