But can we make a smaller circuit by stripping out the part of Paul that attempts to recognize whether an input could be part of the training distribution?
If he’s a neural net, this is likely an obstacle to any attempt to simplify parts of him away: those parts would still contribute to the result, even though within the test input domain their contributions would look like noise.
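To make that concrete, here is a toy sketch of my own, not a model of any real system. The two-unit `tiny_net`, its hand-picked weights, and the choice of [0, 1] as the "test input domain" are all assumptions made up for illustration. One hidden unit does the task; the other is a stand-in "detector" that mostly responds to inputs far outside the domain. Ablating the detector still shifts every in-domain output by a small, input-dependent amount.

```python
import math

def relu(z):
    return max(0.0, z)

def tiny_net(x, keep_detector=True):
    # Hypothetical hand-built network; all weights are illustrative assumptions.
    task = relu(2.0 * x + 0.5)          # unit doing the "real" work
    detector = math.tanh(x - 2.0)       # unit that changes most for x outside [0, 1]
    if not keep_detector:
        detector = 0.0                  # ablate the detector unit
    return 1.0 * task + 0.05 * detector

# In-domain inputs: the assumed test domain is [0, 1].
xs = [i / 10 for i in range(11)]

# Difference between the full net and the ablated net on each in-domain input.
residuals = [tiny_net(x) - tiny_net(x, keep_detector=False) for x in xs]

# How much that difference varies across the domain.
spread = max(residuals) - min(residuals)

for x, r in zip(xs, residuals):
    print(f"x={x:.1f}  residual={r:+.4f}")
print(f"spread={spread:.4f}")
```

The residual is never zero and is not constant across the domain, so the detector cannot be deleted for free, and it cannot be absorbed exactly into a bias term either. In a real network the same effect would be spread across many units, which is what makes it look like noise rather than like a separable component.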