neverix comments on Research Report: Alternative sparsity methods for sparse autoencoders with OthelloGPT.

neverix 15 Jun 2024 3:12 UTC
3 points
Freshman’s dream sparsity loss
A similar regularizer is known as Hoyer-Square.
Pick a value for $k$ and a small $ϵ \geq 0$. Then define the activation function $T_{k, ϵ}$ in the following way. Given a vector $x$, let $b$ be the value of the $k$th-largest entry in $x$. Then define the vector $T_{k, ϵ}(x)$ by
Is $a$ in the following formula a typo?
- Andrew Quaisley 15 Jun 2024 18:38 UTC
  1 point
  Oh, yeah, looks like with $p = 2$ this is equivalent to Hoyer-Square. Thanks for pointing that out; I didn’t know this had been studied previously.
  And you’re right, that was a typo, and I’ve fixed it now. Thank you for mentioning that!
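
For readers unfamiliar with the regularizer discussed in this thread, here is a minimal sketch of Hoyer-Square as it is usually defined: the squared L1 norm divided by the squared L2 norm, $(\sum_i |x_i|)^2 / \sum_i x_i^2$. The function name `hoyer_square` and the example vectors are illustrative assumptions, not taken from the post, and the sketch does not reproduce the post's own loss.

```python
def hoyer_square(x):
    """Hoyer-Square sparsity measure: (sum |x_i|)^2 / (sum x_i^2).

    Hypothetical helper for illustration. Lower values mean a sparser
    vector: the measure is 1 when exactly one entry is nonzero and
    len(x) when all entries have the same magnitude.
    """
    l1 = sum(abs(v) for v in x)
    l2_squared = sum(v * v for v in x)
    if l2_squared == 0.0:
        # The ratio is undefined for the all-zero vector.
        raise ValueError("hoyer_square is undefined for the zero vector")
    return (l1 * l1) / l2_squared


one_hot = [0.0, 3.0, 0.0, 0.0]  # maximally sparse
dense = [1.0, 1.0, 1.0, 1.0]    # maximally dense

print(hoyer_square(one_hot))  # 1.0
print(hoyer_square(dense))    # 4.0
```

Because the numerator and denominator both scale with the square of the vector's magnitude, the measure is unchanged when the vector is rescaled, so it penalizes how spread out the activations are rather than how large they are.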