I think this could be a big boon for mechanistic interpretability, since it can be much more straightforward to interpret a bunch of {-1, 0, 1}s than real-valued weights. Not a silver bullet by any means, but it would at least peel back one layer of complexity.
It could also be harder. Say that 10 bits of each current 16-bit parameter are actually useful; then to match that capacity you would need roughly 6 ternary parameters (each ternary weight carries log2(3) ≈ 1.58 bits of information), and those parameters might be hard to find or might interact in unpredictable ways.
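A quick back-of-envelope sketch of that capacity argument (the "10 useful bits" figure is the hypothetical from above, not a measured number):

```python
import math

# Each ternary weight in {-1, 0, 1} can encode log2(3) ~= 1.585 bits.
bits_per_ternary = math.log2(3)

# Hypothetical: assume 10 of the 16 bits in a float16 parameter are "useful".
useful_bits = 10

# Ternary parameters needed to match that per-parameter capacity.
ternary_needed = useful_bits / bits_per_ternary
print(round(ternary_needed, 2))  # ~6.31, i.e. about 6 ternary weights per float16 weight
```

Of course this treats parameters as raw information channels, which ignores how capacity is actually used during training; it is only an upper-bound style intuition.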
Perhaps, if you needed a larger number of ternary weights; but the paper claims to match the performance of 16-bit weights using ternary weights at the same parameter count.