I’m fairly sure there are architectures where each layer is a linear function of the concatenated activations of all previous layers, though I can’t seem to find one right now. If you additionally allow those connections to be sparse, then I think you get a fully general DAG.
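A minimal numpy sketch of the idea (DenseNet is one published architecture with this kind of every-layer-to-every-later-layer connectivity, though there the layers aren't purely linear). The `masks` argument here is my own illustrative addition: a 0/1 flag per predecessor per layer, so choosing the masks freely carves an arbitrary DAG of linear dependencies out of the fully dense stack.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense_stack(x, layer_sizes, masks=None):
    """Each layer j is a linear map of the concatenation of the
    activations of ALL previous layers (including the input).
    masks[j][k] in {0, 1} says whether layer j reads layer k;
    masks=None means fully dense connectivity."""
    acts = [x]  # acts[0] is the input; acts[j] is layer j's output
    for j, size in enumerate(layer_sizes):
        inputs = np.concatenate(acts)  # all previous activations
        if masks is not None:
            # expand the per-predecessor 0/1 flags to input width
            keep = np.concatenate([
                np.full(a.shape[0], masks[j][k])
                for k, a in enumerate(acts)
            ])
            inputs = inputs * keep
        w = rng.standard_normal((size, inputs.shape[0]))
        acts.append(w @ inputs)
    return acts

# Fully dense: every layer sees every predecessor.
dense = dense_stack(np.ones(4), [3, 3, 2])

# Sparse: layer 2 ignores layer 1, giving the DAG
# input -> L1, input -> L2, {input, L1, L2} -> L3.
masks = [[1], [1, 0], [1, 1, 1]]
sparse = dense_stack(np.ones(4), [3, 3, 2], masks)
print([a.shape for a in sparse])  # [(4,), (3,), (3,), (2,)]
```

With all masks set to 1 this is the fully dense case; zeroing entries recovers an ordinary chain, skip connections, or any other DAG over the layers.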