From this paper, “Theoretical work limited to ReLU-type activation functions, showed that in overparameterized networks, all global minima lie in a connected manifold (Freeman & Bruna, 2016; Nguyen, 2019)”
So for overparameterized nets, the answer is probably:
There is only one solution manifold, so there are no separate basins: all solutions are connected to one another.
We can salvage the idea of “basin volume” as follows:
In the dimensions perpendicular to the manifold, calculate the basin cross-section using the Hessian.
In the dimensions parallel to the manifold, ask “how far can I move before it stops being the ‘same function’?”. If we define “sameness” as “same behavior on the validation set”,[1] then this means looking at the Jacobian of that behavior in the plane of the manifold.
Multiply the two hypervolumes to get the hypervolume of our “basin segment” (very roughly, the region of the basin that drains to our specific model).
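The recipe above can be sketched numerically in a toy linear setting. This is a minimal illustration, not the actual proposal: the loss, the validation-behavior map `B`, the cutoffs `c` and `eps`, and all variable names are assumptions chosen to make the geometry explicit. The loss is 0.5·‖Aw‖² with a rank-deficient `A`, so the global minima form a connected linear manifold (the null space of `A`), mirroring the connected-manifold result quoted above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy overparameterized setting: loss L(w) = 0.5 * ||A w||^2 with
# rank-deficient A, so the global minima form a connected linear
# manifold (the null space of A).
d, r = 6, 3                       # parameter dim, rank of A (illustrative)
A = rng.normal(size=(r, d))
H = A.T @ A                       # Hessian at any global minimum

# Split eigendirections into "perpendicular" (curved) and
# "parallel" (flat, i.e. tangent to the solution manifold).
eigvals, eigvecs = np.linalg.eigh(H)
flat = eigvals < 1e-10
V_par = eigvecs[:, flat]          # basis of the manifold's tangent plane
lam_perp = eigvals[~flat]         # curvatures perpendicular to it

# Perpendicular cross-section from the Hessian: the region with
# loss <= c is an ellipsoid with semi-axes sqrt(2c / lambda_i),
# so its volume is proportional to the product of those axes.
c = 1e-2                          # loss cutoff (assumed)
perp_vol = np.prod(np.sqrt(2 * c / lam_perp))

# Parallel extent: "same function" = validation outputs move by less
# than eps. With a linear validation-behavior map f(w) = B w (purely
# hypothetical), the Jacobian restricted to the tangent plane is
# B @ V_par, and the allowed displacement along each singular
# direction is eps / sigma_j.
B = rng.normal(size=(5, d))       # hypothetical validation-output map
sigma = np.linalg.svd(B @ V_par, compute_uv=False)
eps = 1e-2                        # "sameness" tolerance (assumed)
par_vol = np.prod(eps / sigma[sigma > 1e-10])

# Multiply the two hypervolumes to get the "basin segment" volume.
segment_vol = perp_vol * par_vol
print(perp_vol, par_vol, segment_vol)
```

In a real network the Hessian and Jacobian would come from the trained model and the manifold would be curved, so these products would only be local (per-point) estimates, but the split into perpendicular and parallel factors is the same.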
There are other “sameness” measures which look at the internals of the model; I will be proposing one in an upcoming post.