I tried to sketch a toy problem with tunable “factorizability”:
Draw n samples each from two equal-variance normal distributions with means a and b; call the sample sets A and B. The task is to estimate the median of the 2n combined samples.
If a << b, the problem factorizes completely: the network just needs to estimate max(A) and min(B), and in the last step output their midpoint (with well-separated clusters, the n-th and (n+1)-th order statistics of the combined sample are exactly max(A) and min(B), so the median of the 2n points is their average). As we move a and b closer together, more and more comparisons across A and B matter for accuracy, so the problem is only partially factorizable. When a = b, the problem doesn’t (naively) factorize at all.
(I assume small enough neural networks can’t find the fancy recursive solution*, so the circuit complexity will track the number of comparisons in the naive algorithm.)
*https://en.wikipedia.org/wiki/Median_of_medians
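To make this concrete, here is a minimal simulation sketch of the toy problem (my own illustration, not part of the original setup; names like `sample_task` and `factorized_estimate` are invented for the example). It compares the exact median of the combined sample against the max(A)/min(B) midpoint shortcut as the gap b − a shrinks:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task(n, a, b, sigma=1.0):
    """Draw n samples from N(a, sigma^2) and n from N(b, sigma^2)."""
    A = rng.normal(a, sigma, size=n)
    B = rng.normal(b, sigma, size=n)
    return A, B

def exact_median(A, B):
    """Median of the 2n combined samples: the target output."""
    return np.median(np.concatenate([A, B]))

def factorized_estimate(A, B):
    """Fully factorized shortcut: for 2n points the median is the mean of
    the n-th and (n+1)-th order statistics, which are max(A) and min(B)
    whenever every sample in A falls below every sample in B."""
    return 0.5 * (A.max() + B.min())

n = 100
for gap in [10.0, 4.0, 2.0, 1.0, 0.0]:
    errs = []
    for _ in range(200):
        A, B = sample_task(n, a=0.0, b=gap)
        errs.append(abs(factorized_estimate(A, B) - exact_median(A, B)))
    print(f"gap={gap:4.1f}  mean |factorized - exact| = {np.mean(errs):.3f}")
```

With a large gap the shortcut reproduces the exact median; as the gap closes, its error grows, which is the sense in which the problem stops factorizing and more cross-comparisons between A and B become necessary.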