Interesting, thanks!
I wonder what is going on here. When I think about this naively, there seem to be two factors that should pull in opposite directions.
On one hand, extreme quantization is somewhat similar to the “spirit of sparsification” (using fewer bits for the representation), and one would expect this to pull towards more superposition.
But on the other hand, with strongly quantized weights there is less room for sufficiently different linear combinations with few terms, and this should work against superposition...
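
To make the second factor a bit more concrete, here is a rough counting sketch. It is a toy setup of my own: the function name `count_sparse_vectors`, the dimension 16, and the choice of 3 nonzero terms are all illustrative assumptions. It counts how many distinct weight vectors with a few nonzero terms exist at a given number of quantization levels. The raw count is only a crude proxy for "room": it says nothing about how close to orthogonal those vectors are, and it counts rescaled copies of the same direction separately.

```python
from math import comb


def count_sparse_vectors(n_dims: int, k_terms: int, n_levels: int) -> int:
    """Count weight vectors in n_dims dimensions with exactly k_terms nonzero entries.

    Assumes one of the n_levels quantization levels is zero, so each nonzero
    entry can take (n_levels - 1) distinct values.
    """
    # Choose which k_terms coordinates are nonzero, then a value for each of them.
    return comb(n_dims, k_terms) * (n_levels - 1) ** k_terms


# Ternary weights in {-1, 0, +1}: 3 levels.
ternary = count_sparse_vectors(n_dims=16, k_terms=3, n_levels=3)

# 8-bit weights: 256 levels.
int8 = count_sparse_vectors(n_dims=16, k_terms=3, n_levels=256)

print(ternary)         # 4480
print(int8)            # 9285570000
print(int8 / ternary)  # how much more "room" the 8-bit grid has under this crude count
```

Under these assumptions the ternary grid has several orders of magnitude fewer sparse combinations available, which is the sense in which strong quantization might work against superposition. Whether that effect actually outweighs the sparsification-like pull is exactly what I am unsure about.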