Note that their improvement over Strassen on 4x4 matrices is for finite fields only, i.e. modular arithmetic, not what most neural networks use.
In fact, the 47-multiplication result is over Z/2Z, so it’s not even general modular arithmetic. That being said, there are still speedups in standard floating-point arithmetic, both in the number of multiplications and in wall-clock time.
That’s a very important correction. For real arithmetic they only improve on rectangular matrices (e.g. a 3x4 matrix multiplied by a 4x5 one), which is a less important and less well-studied case.
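For context on the baseline being discussed, here is a minimal sketch (mine, not from the thread or the paper) of Strassen's 2x2 block scheme in Python with numpy. Each block product counts as one multiplication, so applying the scheme recursively to a 4x4 matrix costs 7^2 = 49 scalar multiplications, versus 64 for the naive algorithm; the 47-multiplication result over Z/2Z is what improves on that 49.

```python
# Sketch of Strassen's scheme: multiply two matrices split into 2x2
# blocks using 7 block products instead of the naive 8. This code does
# one level of the recursion; applying it again inside each block
# product is what yields 49 scalar multiplications for a 4x4 matrix.
import numpy as np

def strassen_2x2_blocks(A, B):
    """Multiply A @ B via Strassen's 7-product 2x2 block scheme."""
    n = A.shape[0] // 2
    A11, A12 = A[:n, :n], A[:n, n:]
    A21, A22 = A[n:, :n], A[n:, n:]
    B11, B12 = B[:n, :n], B[:n, n:]
    B21, B22 = B[n:, :n], B[n:, n:]

    # Strassen's 7 products (each '@' is one block multiplication).
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)

    # Reassemble the result from additions of the 7 products.
    C11 = M1 + M4 - M5 + M7
    C12 = M3 + M5
    C21 = M2 + M4
    C22 = M1 - M2 + M3 + M6
    return np.block([[C11, C12], [C21, C22]])

# Sanity check on a 4x4 example.
A = np.random.rand(4, 4)
B = np.random.rand(4, 4)
assert np.allclose(strassen_2x2_blocks(A, B), A @ B)
```

Note that Strassen's identities above hold over any ring, which is why the interesting question is the field the *new* algorithm works over: a scheme valid only over Z/2Z cannot be lifted to the floating-point matrices neural networks use.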