Why doesn’t it work to train on all the 1-hot input vectors using an architecture that suitably encodes Z_2 dot product and the only variable weights are those for the vector representing S? Does B not get to choose the inputs they will train with?
Edit: Mentally swapped A with B in one place while reading.
Why doesn’t it work to train on all the 1-hot input vectors using an architecture that suitably encodes Z_2 dot product and the only variable weights are those for the vector representing S? Does B not get to choose the inputs they will train with?Edit: Mentally swapped A with B in one place while reading.