Are artificial neural networks really Turing-complete? Yep, they are [Siegelmann, Sontag 91]. The network in the paper has on the order of 10^5 neurons, with rational edge weights, so it is genuinely Kolmogorov-complex: the weights themselves can carry arbitrary amounts of information. This, however, doesn't tell us whether we can build good machines for specific purposes.
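Where does that Kolmogorov complexity live? In the weights: a single rational can hide an arbitrarily long bit string in its digit expansion. Here is a minimal Haskell sketch of the idea, assuming a Cantor-style base-4 encoding with digits 1 and 3 (one common presentation of the Siegelmann–Sontag construction; the exact base varies by write-up):

```haskell
-- Pack a bit string into the base-4 digits of one rational weight,
-- using digit 3 for a 1-bit and digit 1 for a 0-bit (Cantor-style,
-- so digits never sit on a boundary).
encodeBits :: [Bool] -> Rational
encodeBits = foldr (\b r -> ((if b then 3 else 1) + r) / 4) 0

-- Peel off the leading bit, the way the network's updates would:
-- shift left by one base-4 digit and subtract it out.
decodeBit :: Rational -> (Bool, Rational)
decodeBit r = (d == 3, 4 * r - fromInteger d)
  where d = floor (4 * r)

main :: IO ()
main = print (decodeBit (encodeBits [True, False, True]))
-- (True, 7 % 16)  -- 7/16 encodes the remaining [False, True]
```

The network's saturated-linear updates can shift digits off such a weight one at a time, which is how a fixed-size net simulates an unbounded tape.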
Let's figure out how to sort a dozen numbers with λ-calculus and with sorting networks. Note that the lambda expression is O(1) in size, independent of n, whereas the sorting network is O(n (log n)^2) in size.
Batcher's odd–even mergesort is O((log n)^2) comparator layers deep, and given that one neuron implements each comparator, each layer's wiring is one of roughly n! possibilities; for n = 12 that is about 2^29 of them, i.e. log2(12!) ≈ 29 bits per layer, or roughly 200 bits over the network's depth. That we need ~200 bits of insight to sort a dozen numbers with that specific method does not mean that there is no cheaper way to do it, but it sets a reasonable upper bound.
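For concreteness, here is the standard recursive construction of Batcher's odd–even mergesort as a list of comparator pairs in Haskell, for n a power of two (a dozen inputs get padded to 16; the test vector and the bit count at the end are just illustrations):

```haskell
-- Batcher's odd-even merge on the inclusive index range [lo .. hi], stride r.
oddEvenMerge :: Int -> Int -> Int -> [(Int, Int)]
oddEvenMerge lo hi r
  | step < hi - lo = oddEvenMerge lo hi step
                  ++ oddEvenMerge (lo + r) hi step
                  ++ [(i, i + r) | i <- [lo + r, lo + r + step .. hi - r - 1]]
  | otherwise      = [(lo, lo + r)]
  where step = 2 * r

-- Full sorting network on [lo .. hi]; the width must be a power of two.
oddEvenMergeSort :: Int -> Int -> [(Int, Int)]
oddEvenMergeSort lo hi
  | hi - lo >= 1 = oddEvenMergeSort lo mid
                ++ oddEvenMergeSort (mid + 1) hi
                ++ oddEvenMerge lo hi 1
  | otherwise    = []
  where mid = lo + (hi - lo) `div` 2

-- One comparator: swap positions i and j if they are out of order.
applyComparator :: Ord a => [a] -> (Int, Int) -> [a]
applyComparator xs (i, j)
  | a <= b    = xs
  | otherwise = [ if k == i then b else if k == j then a else x
                | (k, x) <- zip [0 ..] xs ]
  where (a, b) = (xs !! i, xs !! j)

main :: IO ()
main = do
  let net = oddEvenMergeSort 0 15   -- 16 wires: a dozen inputs, padded
  print (length net)                -- comparator count (63 for n = 16)
  print (foldl applyComparator
           [13, 7, 2, 11, 5, 3, 1, 12, 9, 4, 8, 6, 15, 14, 10, 0 :: Int] net)
  print (logBase 2 (product [1 .. 12]) :: Double)  -- ≈ 28.8 bits per wiring
```

Grouping those 63 comparators into parallel layers gives the network's depth, and ~29 bits per layer times that depth is where the ~200-bit estimate above comes from.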
Apparently I cannot write good lambda calculus, but it seems we can merge-sort Church-encoded numerals in under a hundred lambda terms, which is about the same number of bits as the sorting network.
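I won't attempt the full term here, but the heart of the merge step is a comparator on Church numerals, which already fits in a handful of combinators. A sketch in Haskell, with the newtype standing in for the untyped λ-terms (the names are mine):

```haskell
{-# LANGUAGE RankNTypes #-}

-- A Church numeral applies a function n times: n = \f z -> f (f (... z)).
newtype Church = Church { runChurch :: forall a. (a -> a) -> a -> a }

zero :: Church
zero = Church (\_ z -> z)

suc :: Church -> Church
suc n = Church (\f z -> f (runChurch n f z))

-- Predecessor via Kleene's trick: rebuild n, discarding one application of f.
pre :: Church -> Church
pre n = Church (\f z -> runChurch n (\g h -> h (g f)) (const z) id)

-- Truncated subtraction: apply pre to m, n times.
sub :: Church -> Church -> Church
sub m n = runChurch n pre m

-- A numeral is zero iff it never uses its function argument.
isZero :: Church -> Bool
isZero n = runChurch n (const False) True

-- m <= n  iff  m - n (truncated) is zero: the comparator merge sort needs.
leq :: Church -> Church -> Bool
leq m n = isZero (sub m n)

fromInt :: Int -> Church
fromInt k = iterate suc zero !! k

main :: IO ()
main = print (leq (fromInt 3) (fromInt 7), leq (fromInt 7) (fromInt 3))
-- (True, False)
```

Add a list encoding, a split, and a fixed-point combinator for the recursion, and staying under a hundred terms for the whole merge sort looks plausible.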
On a second note: how are Bayesian networks different from perceptrons, except for having no thresholds?