natural abstractions are real, but finding them is hard
Depending on what exactly you mean by “hard”, this statement might be an oxymoron. Specifically, if a so-called “natural” abstraction is no easier to find than a corruption of that abstraction, there’s no real sense in which the abstraction in question is “natural”.
Or, in more detail: one way to phrase the natural abstraction hypothesis is that, out of the space of all possible abstractions (which is continuous, essentially by definition), certain abstractions within that space act as attractors, to which any learner must converge. Those attractors then “carve up” the continuous space into a discrete space of convergent abstractions, which would be the “natural” abstractions we’re looking for. (E.g. there’s no concept of “tree” arbitrarily close to and yet distinct from the natural abstraction of trees; or rather, there is, but it’s not an attractor, and will hence be dispreferred in favor of the natural version.)
Seeing KataGo exhibit this kind of brittleness, then, functions as a strike against this version of the hypothesis, because it is evidence that a seemingly very clear concept (liberties) in a game with very simple rules (Go), despite being exactly the sort of thing you might expect to be a “natural” concept, was apparently not properly converged to by at least one system. Instead, the system in question converged to a concept “arbitrarily close to, and yet distinct from” the real concept of liberties, one which agrees with the real concept almost everywhere in normal play, and yet whose differences can be adversarially amplified until they turn into a winning strategy executable by far inferior agents (humans).
That’s a pretty big strike, in my estimation!
I think there are two “simple” abstractions possible here, where the amount of data needed to distinguish which one is right is minuscule under natural play in Go, and can therefore easily be overwhelmed by small constant factors in the “ease” of converging to either one due to inductive bias.
Abstraction 1: the size of the set of all empty spaces adjacent to a group
Abstraction 2: for a given stone in the group, the number of empty spaces next to it, plus, recursively, this same sum for each neighboring stone of the same player to the north, south, east, and west, where the recursion doesn’t “backtrack”; plus a penalty if a nearby empty space is bordered by the same group from multiple sides.
Abstraction 1 is mathematically more elegant, but abstraction 2 is algorithmically simpler to implement. If you modify abstraction 2's algorithm to include a globally remembered “set”, such that traversed spaces and stones are added to this set and are not re-traversed, you can make it equivalent to abstraction 1, but that requires additional intermediate memory during the computation and so is more complex.
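To make the contrast concrete, here’s a minimal Python sketch of the two abstractions. Everything about it (the board encoding, the function names, the depth cap, and the omission of the “penalty” term from abstraction 2’s description) is an illustrative assumption of mine, not a claim about KataGo’s internals:

```python
# Illustrative sketch only. `board` maps (row, col) -> "B" or "W" for stones;
# empty points are simply absent from the dict.

NEIGHBOR_OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # N, S, W, E


def neighbors(point, size):
    """On-board orthogonal neighbors of a point on a size x size board."""
    r, c = point
    for dr, dc in NEIGHBOR_OFFSETS:
        nr, nc = r + dr, c + dc
        if 0 <= nr < size and 0 <= nc < size:
            yield (nr, nc)


def liberties_abstraction_1(board, start, size):
    """Abstraction 1: the size of the set of empty points adjacent to the group.

    A flood fill with a shared `seen` set for stones, plus a `liberties` set
    that de-duplicates empty points reached from more than one side.
    """
    color = board[start]
    seen, liberties = {start}, set()
    stack = [start]
    while stack:
        stone = stack.pop()
        for nb in neighbors(stone, size):
            if nb not in board:  # empty point
                liberties.add(nb)
            elif board[nb] == color and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(liberties)


def liberties_abstraction_2(board, start, size, came_from=None, depth=32):
    """Abstraction 2 (roughly): count adjacent empty points, then recurse into
    same-colored neighbors, refusing only to step straight back to the stone
    we just came from. There is no globally remembered set of visited stones,
    and the penalty term for multiply-bordered empty points is omitted.

    The depth cap stands in for a network's finite number of layers; without
    it, any group containing a cycle would make this recursion run forever.
    """
    if depth == 0:
        return 0
    color = board[start]
    count = 0
    for nb in neighbors(start, size):
        if nb == came_from:
            continue
        if nb not in board:  # empty point, counted every time we pass it
            count += 1
        elif board[nb] == color:
            count += liberties_abstraction_2(board, nb, size,
                                             came_from=start, depth=depth - 1)
    return count
```

The only structural difference is the shared `seen` set: keep it and you get abstraction 1; drop it, keeping only the local “don’t backtrack” check, and you get abstraction 2.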
Indeed, when programmers write iterative or recursive algorithms that operate on graphs, it’s easy to forget the check that prohibits returning to locations already visited during that run of the algorithm, and omitting it usually means fewer lines of code and less local state; the resulting algorithm works fine on trees but fails on cycles. The “less memory or state required” part, plus the fact that Go groups containing large cycles (the kind that would distinguish these two possibilities) are rare, is what leads me to hypothesize right now (without explicit evidence or confirmation) that convergence to abstraction 2 is morally what’s going on here.
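The same bug pattern is easy to reproduce outside Go. Here’s a self-contained sketch (the graphs and function names are purely illustrative) of counting reachable nodes with only a “don’t return to the node you came from” check: it agrees with the visited-set version on a tree, but badly overcounts (and, without a depth cap, never terminates) once the graph contains a cycle.

```python
# Illustrative sketch of the "forgot the visited set" pattern described above.

def count_reachable_with_visited(graph, start):
    """Correct version: a shared visited set means each node is counted once."""
    visited = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nb in graph[node]:
            if nb not in visited:
                visited.add(nb)
                stack.append(nb)
    return len(visited)


def count_reachable_no_backtrack(graph, start, came_from=None, depth=10):
    """Buggy version: only refuses to step straight back to its parent.

    The depth cap keeps the demo terminating; without it, any cycle would
    make this recursion run forever.
    """
    if depth == 0:
        return 0
    total = 1  # count this node
    for nb in graph[start]:
        if nb != came_from:
            total += count_reachable_no_backtrack(graph, nb,
                                                  came_from=start, depth=depth - 1)
    return total


tree = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}                # no cycle
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}  # one cycle

print(count_reachable_with_visited(tree, "a"),
      count_reachable_no_backtrack(tree, "a"))        # 3 3  -- they agree
print(count_reachable_with_visited(triangle, "a"),
      count_reachable_no_backtrack(triangle, "a"))    # 3 19 -- the buggy version overcounts
```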
Hmm. I agree that this weakens the convergent natural abstractions hypothesis, but I also think there is such a thing as slicing reality at the mechanism joints in your model, and that, even if converging on those joints is a property that must be explicitly optimized for in the design of a learning system, it still seems appropriate to say that the model almost learned the natural abstraction but had adversarial-example issues due to not being robust. Approaches to robustifying a neural network seem likely to me to significantly increase the degree to which the network learns the natural abstraction.
I’ve inverted my self-vote on my original comment because I’ve become uncertain.