Sounds like I’ve maybe not communicated the thing about circularity. I’ll try again; it would be useful if you let me know whether this new explanation matches what you were already picturing from the previous one.
Let’s think about circular definitions in terms of equations for a moment. We’ll have two equations: one which “defines” x in terms of y, and one which “defines” y in terms of x:
x=f(y)
y=g(x)
Now, if $g = f^{-1}$, then (I claim) that’s what we normally think of as a “circular definition”. It’s “pretending” to fully specify x and y, but in fact it doesn’t, because one of the two equations is just a copy of the other equation but written differently. The practical problem, in this case, is that x and y are very underspecified by the supposed joint “definition”.
But now suppose g is not $f^{-1}$, and more generally the equations are not degenerate. Then our two equations are typically totally fine and useful, and indeed we use equations like this all the time in the sciences and they work great. Even though they’re written in a “circular” way, they’re substantively non-circular. (They might still allow for multiple solutions, but the solutions will typically at least be locally unique, so there’s a discrete and typically relatively small set of solutions.)
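To make the contrast concrete, here is a minimal sketch. The specific functions are my own illustrative assumptions, not anything from the earlier discussion: f(y) = 2y + 1, paired once with its exact inverse and once with an unrelated g(x) = x/3. Repeatedly substituting one equation into the other leaves x wherever it started in the degenerate case, but pulls it to a single point in the non-degenerate case.

```python
def f(y):
    # Hypothetical example: "defines" x in terms of y.
    return 2.0 * y + 1.0


def g_inverse(x):
    # Degenerate partner: exactly the inverse of f, so it adds no information.
    return (x - 1.0) / 2.0


def g_other(x):
    # Non-degenerate partner: a genuinely different constraint on (x, y).
    return x / 3.0


def solve_by_iteration(f, g, x0, steps=200):
    # Alternate between the two "definitions": y = g(x), then x = f(y).
    x = x0
    for _ in range(steps):
        y = g(x)
        x = f(y)
    return x, g(x)


starts = (-5.0, 0.0, 7.0)

# With g equal to the inverse of f, every starting x already satisfies both
# equations, so the pair of equations pins nothing down.
degenerate = [solve_by_iteration(f, g_inverse, x0) for x0 in starts]

# With the unrelated g, every start is pulled to the same (x, y).
pinned = [solve_by_iteration(f, g_other, x0) for x0 in starts]

print(degenerate)
print(pinned)
```

In this toy case the non-degenerate pair has exactly one solution, near x = 3, y = 1. The iteration only converges because these particular functions happen to compose into a contraction; in general the equations can be perfectly well-posed even when naive substitution does not converge.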
That’s the sort of thing which clustering algorithms do: they have some equations “defining” cluster-membership in terms of the data points and cluster parameters, and equations “defining” the cluster parameters in terms of the data points and the cluster-membership:
cluster_membership = f(data, cluster_params)
cluster_params = g(data, cluster_membership)
… where f and g are different (i.e. non-degenerate; g is not just $f^{-1}$ with data held constant). Together, these “definitions” specify a discrete and typically relatively small set of candidate (cluster_membership, cluster_params) values given some data.
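As a rough illustration of that pair of equations, here is a one-dimensional k-means-style sketch. K-means is my choice of example (the point above is about clustering algorithms in general), and the function names, data, and starting centers are all made up for illustration. `assign` plays the role of f and `update` plays the role of g.

```python
def assign(data, centers):
    # f: cluster membership from the data and the cluster parameters.
    # Each point goes to the index of its nearest center.
    return [
        min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)
        for x in data
    ]


def update(data, membership, old_centers):
    # g: cluster parameters from the data and the cluster membership.
    # Each center becomes the mean of its members (unchanged if it has none).
    centers = []
    for k, old in enumerate(old_centers):
        members = [x for x, m in zip(data, membership) if m == k]
        centers.append(sum(members) / len(members) if members else old)
    return centers


def fit(data, centers, max_iters=100):
    # Alternate f and g until the membership stops changing, i.e. until
    # both "definitions" hold at once.
    membership = assign(data, centers)
    for _ in range(max_iters):
        centers = update(data, membership, centers)
        new_membership = assign(data, centers)
        if new_membership == membership:
            break
        membership = new_membership
    return membership, centers


# Made-up data with two obvious groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]

# Two different starting guesses for the centers.
solution_a = fit(data, [0.0, 6.0])
solution_b = fit(data, [6.0, 0.0])

print(solution_a)
print(solution_b)
```

The two runs land on different but equally self-consistent answers: the same grouping with the labels swapped. That is a small instance of the "discrete set of candidate solutions" point: the two equations together do not pick out a unique answer, but they rule out almost everything.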
That, I claim, is also part of what’s going on with abstractions like “dog”.
(Now, choice of axes is still a separate degree of freedom which has to be handled somehow. And that’s where I expect robustness to choice of axes to do load-bearing work. As you say, that’s separate from the circularity issue.)