Lucius Bushnaq comments on The Hessian rank bounds the learning coefficient

Lucius Bushnaq 16 Aug 2024 20:59 UTC
11 points
4
Getting the Hessian eigenvalues does not require calculating the full Hessian. You can use Jacobian-vector product methods, e.g. in JAX, to compute Hessian-vector products and feed those to an iterative eigensolver. The Hessian itself never has to be explicitly represented in memory.

And even assuming the estimator for the Hessian pseudoinverse is cheap and precise, you’d still need to get its rank anyway, which would by default be just as expensive as getting the rank of the Hessian.
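
A minimal sketch of the matrix-free idea above, under stated assumptions: the toy `loss`, the matrix `A`, and the helpers `hvp` and `top_eigenvalues` are all hypothetical names made up for illustration, and plain subspace iteration stands in for whatever eigensolver one would actually use (Lanczos-type methods are the more common choice at scale). The only point is that eigenvalues are estimated from Hessian-vector products alone, and the non-zero ones are then counted against a tolerance.

```python
import jax
import jax.numpy as jnp

# Use float64 so near-zero eigenvalues are clearly separated from rounding noise.
jax.config.update("jax_enable_x64", True)

# Hypothetical toy loss: it depends on only 2 directions of the 5 parameters,
# so its Hessian A^T A has rank 2 by construction.
A = jnp.array([[1.0, 2.0, 0.0, -1.0, 0.5],
               [0.0, 1.0, 3.0, 1.0, -2.0]])

def loss(w):
    return 0.5 * jnp.sum((A @ w) ** 2)

def hvp(f, w, v):
    """Hessian-vector product H(w) @ v, computed as a JVP of the gradient.
    The Hessian itself is never materialized."""
    return jax.jvp(jax.grad(f), (w,), (v,))[1]

def top_eigenvalues(f, w, k, iters=50, seed=0):
    """Estimate the k largest-magnitude Hessian eigenvalues by subspace iteration,
    touching the Hessian only through hvp."""
    n = w.shape[0]
    q = jax.random.normal(jax.random.PRNGKey(seed), (n, k))
    # Apply the Hessian to every column of an (n, k) block.
    apply_h = jax.vmap(lambda v: hvp(f, w, v), in_axes=1, out_axes=1)
    for _ in range(iters):
        q, _ = jnp.linalg.qr(apply_h(q))  # re-orthonormalize the block
    # Rayleigh-Ritz step: eigenvalues of the small k x k projected matrix.
    return jnp.linalg.eigvalsh(q.T @ apply_h(q))

w0 = jnp.zeros(5)
eigs = top_eigenvalues(loss, w0, k=4)

# Count eigenvalues that are non-zero relative to the largest one.
# The tolerance is an assumption; a sensible value is problem-dependent.
tol = 1e-8 * jnp.max(jnp.abs(eigs))
rank_estimate = int(jnp.sum(jnp.abs(eigs) > tol))
print(eigs, rank_estimate)
```

Memory use here scales with `n * k` rather than `n * n`. For a real network, `k` has to be at least as large as the rank you hope to detect, which is where the cost comes from.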
- harsimony 16 Aug 2024 21:44 UTC
  1 point
  0
  That makes sense. I guess it just comes down to an empirical question of which is easier.
  
  Question about what you said earlier: How can you use the top/bottom eigenvalues to estimate the rank of the Hessian? I’m not as familiar with this so any pointers would be appreciated!
  - George Ingebretsen 3 Sep 2024 1:36 UTC
    4 points
    0
    For a symmetric matrix like the Hessian, the rank = the number of non-zero eigenvalues, counted with multiplicity! So you can either count the non-zero eigenvalues from the top down, or use the fact that an $n \times n$ matrix always has $n$ eigenvalues and get the number of non-zero ones by counting the zero eigenvalues at the bottom and subtracting that from $n$.
    
    Also, for more detail on the “getting Hessian eigenvalues without calculating the full Hessian” thing, I’d really recommend John’s explanation in this linear algebra lecture he recorded.
    - harsimony 3 Sep 2024 19:54 UTC
      3 points
      0
      Thanks for this! I misinterpreted Lucius as saying “use the single highest and single lowest eigenvalues to estimate the rank of a matrix” which I didn’t think was possible.
      
      Counting the number of non-zero eigenvalues makes a lot more sense!