I trained a (plain) neural network on a couple of occasions to predict the output of the function x1⊕⋯⊕x5, where x1,…,x5 are bits and ⊕ denotes the XOR operation. The neural network was hopelessly confused, even though neural networks usually have no trouble memorizing large quantities of random information. This time the network could not even memorize the truth table for XOR. While the operation (x1,…,x5)↦x1⊕⋯⊕x5 is linear over the field F2, it is highly non-linear over R. The inability of a simple neural network to learn this function suggests that neural networks learn best when they are not required to stray too far from linearity.
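The experiment above can be sketched as follows. This is a minimal reconstruction, not the author's actual setup: it assumes a plain two-layer tanh network with 8 hidden units, trained by full-batch gradient descent on squared error over the complete 32-row truth table of the 5-bit parity function. The architecture, sizes, and learning rate are all illustrative choices.

```python
import numpy as np

# All 2^5 = 32 bit strings and their parity (x1 XOR ... XOR x5).
X = np.array([[(i >> k) & 1 for k in range(5)] for i in range(32)], dtype=float)
y = (X.sum(axis=1) % 2).reshape(-1, 1)  # parity = sum of bits mod 2

# Hypothetical "plain" network: one tanh hidden layer, sigmoid output.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(5, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(5000):
    h = np.tanh(X @ W1 + b1)       # hidden activations
    p = sigmoid(h @ W2 + b2)       # predicted parity
    # Gradients of mean squared error, by hand.
    dp = (p - y) * p * (1 - p) / len(X)
    dW2 = h.T @ dp; db2 = dp.sum(0)
    dh = (dp @ W2.T) * (1 - h**2)
    dW1 = X.T @ dh; db1 = dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

acc = ((p > 0.5) == (y > 0.5)).mean()
print(f"training accuracy on the 32-row truth table: {acc:.2f}")
```

Because parity flips its output whenever any single input bit flips, no linear (or near-linear) decision surface over R separates the classes; depending on the random seed and hyperparameters, a small network like this may sit far below 100% accuracy even on this tiny memorization task.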