Jeremy Gillen comments on Broad Basins and Data Compression

Jeremy Gillen 9 Aug 2022 4:26 UTC
2 points
Yeah, that would be interesting, but how would we tell the difference between trivial params (I’m assuming this means the function doesn’t change anywhere) and equal-loss models? Maybe we could estimate this by sampling points out of distribution?
I kind of assumed that every change in the parameters changes the function, but that in some areas of the loss landscape the function changes faster than in others. That would be my prediction.
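To make the sampling idea concrete, here is a rough sketch, not a tested method. It assumes a toy one-hidden-layer ReLU network in NumPy; `mlp`, `max_function_change`, and all the weights and input ranges are made up for illustration. It compares a rescaling of one hidden unit (which should leave the function unchanged everywhere) with a change to a unit that is inactive on the training range (same loss, different function off-distribution).

```python
import numpy as np

def mlp(params, x):
    """Toy one-hidden-layer ReLU network; x has shape (n, 1)."""
    w1, b1, w2 = params
    return np.maximum(x @ w1 + b1, 0.0) @ w2

def max_function_change(params_a, params_b, points):
    """Largest output difference between two parameter settings on sampled points."""
    return float(np.max(np.abs(mlp(params_a, points) - mlp(params_b, points))))

rng = np.random.default_rng(0)

# Hypothetical base model: the second hidden unit only activates for x > 2.
w1 = np.array([[1.0, 1.0]])
b1 = np.array([0.0, -2.0])
w2 = np.array([[1.0], [1.0]])
base = (w1, b1, w2)

# "Trivial" direction: scale the first unit's input weight and bias by 2,
# and its output weight by 1/2. ReLU is positively homogeneous, so the
# function should be the same everywhere.
scale = np.array([2.0, 1.0])
rescaled = (w1 * scale, b1 * scale, w2 / scale[:, None])

# "Equal loss" direction: change the output weight of the unit that never
# fires on the training range, so training outputs (and loss) are unchanged.
w2_changed = w2.copy()
w2_changed[1, 0] = 3.0
equal_loss = (w1, b1, w2_changed)

train_x = rng.uniform(0.0, 1.0, size=(256, 1))    # assumed training range
ood_x = rng.uniform(-10.0, 10.0, size=(4096, 1))  # assumed out-of-distribution range

print("rescaled, OOD points:    ", max_function_change(base, rescaled, ood_x))
print("equal loss, train points:", max_function_change(base, equal_loss, train_x))
print("equal loss, OOD points:  ", max_function_change(base, equal_loss, ood_x))
```

The obvious limitation is that sampling only gives evidence one way: a nonzero difference shows the function changed, but a zero difference on the sampled points doesn't prove it is unchanged everywhere, so the answer depends on how widely the out-of-distribution points are drawn.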