Note that their improvement over Strassen on 4x4 matrices is for finite fields only, i.e. modular arithmetic, not what most neural networks use.
In fact, the 47-multiplication result is over Z/2Z, so it’s not even general modular arithmetic. That being said, there are still speedups on standard floating point arithmetic, both in the number of multiplications and in wall clock time.
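For reference, Strassen multiplies 2x2 block matrices with 7 products instead of the naive 8; applied recursively, a 4x4 product costs 7 * 7 = 49 scalar multiplications, and that 49 is the baseline the 47-multiplication Z/2Z result beats. A minimal one-level sketch in Python (my own illustration of the standard Strassen identities, not the new algorithm; the names are mine):

    import numpy as np

    def strassen_once(A, B):
        # One level of Strassen on 2x2 blocks: 7 block products instead of 8.
        # Recursing on each block product gives the 49-multiplication count for 4x4.
        n = A.shape[0] // 2
        A11, A12, A21, A22 = A[:n, :n], A[:n, n:], A[n:, :n], A[n:, n:]
        B11, B12, B21, B22 = B[:n, :n], B[:n, n:], B[n:, :n], B[n:, n:]
        M1 = (A11 + A22) @ (B11 + B22)
        M2 = (A21 + A22) @ B11
        M3 = A11 @ (B12 - B22)
        M4 = A22 @ (B21 - B11)
        M5 = (A11 + A12) @ B22
        M6 = (A21 - A11) @ (B11 + B12)
        M7 = (A12 - A22) @ (B21 + B22)
        C = np.empty_like(A)
        C[:n, :n] = M1 + M4 - M5 + M7
        C[:n, n:] = M3 + M5
        C[n:, :n] = M2 + M4
        C[n:, n:] = M1 - M2 + M3 + M6
        return C

    A, B = np.random.rand(4, 4), np.random.rand(4, 4)
    assert np.allclose(strassen_once(A, B), A @ B)

The count of block products per level (7 here) is exactly the metric these multiplication-count results are stated in; additions are considered cheap.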
That’s a very important correction. For real arithmetic they only improve on rectangular matrices (e.g. a 3x4 matrix multiplied by a 4x5 one), which is a less important and less well studied case.
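To make the rectangular case concrete: the naive product of a 3x4 and a 4x5 matrix uses 3 * 4 * 5 = 60 scalar multiplications, and improved schemes are measured by how far below that count they get. A quick sketch of the naive count (purely illustrative; the helper name is mine):

    def naive_matmul_with_count(A, B):
        # Plain triple loop; counts one scalar multiplication per inner step.
        m, k, n = len(A), len(B), len(B[0])
        C = [[0.0] * n for _ in range(m)]
        mults = 0
        for i in range(m):
            for j in range(n):
                for p in range(k):
                    C[i][j] += A[i][p] * B[p][j]
                    mults += 1
        return C, mults

    A = [[1.0] * 4 for _ in range(3)]   # 3x4
    B = [[1.0] * 5 for _ in range(4)]   # 4x5
    C, mults = naive_matmul_with_count(A, B)
    print(mults)  # 60 = 3 * 4 * 5; fast schemes reduce this count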