Interesting article. I dare not say I understand it fully. But in arguing that some categories are more or less wrong than others, is it fair to say you are arguing against the ugly duckling theorem?
Well, I usually try not to argue against theorems (as contrasted to arguing that a theorem’s premises don’t apply in a particular situation)—but in spirit, I guess so! Let me try to work out what’s going on here—
The boxed example on the Wikipedia page you link, following Watanabe, posits a universe of three ducks—a White duck that comes First, a White duck that is not First, and a nonWhite duck that is not First—and observes that every pair of ducks agrees on half of the possible logical predicates that you can define in terms of Whiteness and Firstness. There are sixteen possible truth functions of two binary variables (like Whiteness or Firstness), but here only eight of them are distinct. (Really, only eight of them could be distinct, because a predicate picks out a subset of the three ducks, and there are only 2³ = 8 such subsets.) In general, we can't measure the "similarity" between objects by counting the number of sets that group them together, because that count is the same for every pair of objects. We also get a theorem on binary vectors: if you have some k-dimensional vectors of bits, you can use Hamming distance to distinguish more similar pairs from less similar ones, but if you extend the vectors into 2^(2^k)-dimensional vectors of all k-ary boolean functions on the original k bits, then you can't: every pair of distinct extended vectors disagrees on exactly half of the coordinates.
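To make the binary-vector version concrete, here's a small illustrative script (my own toy demonstration, not from Watanabe; the helper names `extend` and `hamming` are mine) that extends 2-bit vectors by all sixteen boolean functions and checks that every pair of distinct extended vectors lands at the same Hamming distance:

```python
from itertools import product

k = 2
points = list(product([0, 1], repeat=k))  # all k-bit vectors

# Each k-ary boolean function is a truth table: an assignment of 0/1
# to each of the 2^k possible inputs, so there are 2^(2^k) functions.
truth_tables = list(product([0, 1], repeat=2 ** k))

def extend(x):
    """Map a k-bit vector to its 2^(2^k)-dimensional extension:
    one coordinate per boolean function f, holding f(x)."""
    i = points.index(x)
    return [table[i] for table in truth_tables]

def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

# Every pair of distinct vectors disagrees on exactly half of the
# extended coordinates: 2^(2^k - 1) = 8 out of 16 when k = 2.
for x in points:
    for y in points:
        if x < y:
            print(x, y, hamming(extend(x), extend(y)))
```

All six pairs come out at distance 8, so Hamming distance on the extended representation can no longer call any pair more or less similar than any other.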
Watanabe concludes, “any objects, in so far as they are distinguishable, are equally similar” (!!).
So, I think the reply to this is going to have to do with inductive bias and the “coincidence” that we in fact live in a low-entropy universe where some cognitive algorithms actually do have an advantage, even if they wouldn’t have an advantage averaged over all possible universes? Unfortunately, I don’t think I understand this in enough detail to explain it well (mumble mumble, new riddle of induction, blah blah, no canonical universal Turing machine for Solomonoff induction), but the main point I’m trying to make in my post is actually much narrower and doesn’t require us to somehow find non-arbitrary canonical categories or reason about all possible categories.
I’m saying that which “subspace” of properties a rational agent is interested in will depend on the agent’s values, but given such a choice, the categories the agent ends up with are going to be the result of running some clustering algorithm on the actual distribution of things in the world, which depends on the world, not on the agent’s values. In terms of Watanabe’s ducks: you might not care about a duck’s color or its order, but redefining Whiteness to include the black duck is cheating; it’s wireheading yourself; it can’t help you optimize the ducks.
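Here's a minimal sketch of what I mean, assuming a toy encoding of the three-duck universe (the feature names and the trivial grouping rule are mine, for illustration only): your values pick which columns of the feature matrix you attend to, but given that choice, the grouping falls out of the data itself.

```python
# Toy encoding of Watanabe's three ducks (my labels, not his).
ducks = {
    "duck_A": {"white": 1, "first": 1},
    "duck_B": {"white": 1, "first": 0},
    "duck_C": {"white": 0, "first": 0},
}

def cluster_by(feature):
    """Group ducks by agreement on the chosen feature 'subspace'."""
    groups = {}
    for name, props in ducks.items():
        groups.setdefault(props[feature], []).append(name)
    return groups

print(cluster_by("white"))  # {1: ['duck_A', 'duck_B'], 0: ['duck_C']}
print(cluster_by("first"))  # {1: ['duck_A'], 0: ['duck_B', 'duck_C']}
# Which feature you cluster on is up to your values, but once "white"
# is chosen, duck_C lands alone; relabeling it "white" changes your
# map, not the ducks.
```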