The full idea I think I failed to correctly reference is that giving certain concepts short “description lengths” [...] in your language is equivalent to saying that the concepts signified by those words represent things-in-the-world that show up more often. [...] “More specifically, [...] Kraft’s inequality can be thought of in terms of a constrained budget to be spent on codewords, with shorter codewords being more expensive.”
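(For reference, since the inequality itself never gets stated in this thread: Kraft’s inequality is the standard result that the codeword lengths \(\ell_1, \ldots, \ell_n\) of any prefix-free code over a \(b\)-symbol alphabet must satisfy

\[ \sum_{i=1}^{n} b^{-\ell_i} \le 1, \]

so assigning a codeword of length \(\ell\) uses up \(b^{-\ell}\) of a unit budget, and shorter codewords use up more of it. That is the “constrained budget” framing being quoted above.)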
Sure. Short words are more expensive because there are fewer of them; because short words are scarce, we want to use them to refer to frequently-used concepts. Is that what you meant? I still don’t see how this is relevant to the preceding discussion (see the grandparent).
Also, for clearer communication, you might consider directly saying things like “Short words are more expensive because there are fewer of them” rather than making opaque references to things like Kraft’s inequality. Technical jargon is useful insofar as it helps communicate ideas; references that may be appropriate in the context of a technical discussion about information theory may not be appropriate in other contexts.
That’s not quite what I mean, no. It’s not the length of the words that I actually care about, and upon reflection it is clear that the analogy is too opaque. What I care about is the choice of which concepts get set aside as concepts-that-need-little-explanation (“ultimate convergent algorithm for arbitrary superintelligences” here, “God” at some theological hangout), and how that choice reflects which things-in-the-world one has implicitly claimed are more or less common (though really that would be hard to disentangle from which things-in-the-world one has implicitly claimed are more or less important). It’s the differential “length” of the concepts that I’m trying to talk about; the syntactic length, i.e. the number of letters, doesn’t interest me.
Referencing Kraft’s inequality was my way of saying “this is the general type of reasoning that I have cached as perhaps relevant to the kind of inquiry it would be useful to do”. But I think you’re right that it’s too opaque to be useful.
Edit: To try to explain the intuition a little more, it’s like applying the “scarce short strings” theme to the concepts directly, where the words are just paintbrush handles. That is how I think one might try to argue that language choices can be objectively “irrational” anyway.
I don’t think the analogy holds. The reason Kraft’s inequality works is that the number of possible strings of length n over a b-symbol alphabet is exactly b^n. This places a bound on the number of short words you can have. Whereas if we’re going to talk about the “amount of mental content” we pack into a single “concept-needing-little-explanation,” I don’t see any analogous bound: I don’t see any reason in principle why a mind of arbitrary size couldn’t have an arbitrary number of complicated “short” concepts.
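To make the counting point concrete, here is a minimal sketch (purely illustrative; the binary alphabet and the particular codeword lengths are my own choices, not anything from the discussion) of how quickly the supply of short strings runs out:

```python
# Over a b-symbol alphabet there are exactly b**n strings of length n,
# so the stock of short words is sharply limited.
from itertools import product

b = 2           # binary alphabet, chosen for illustration
symbols = "01"

for n in range(1, 5):
    strings = ["".join(s) for s in product(symbols, repeat=n)]
    assert len(strings) == b ** n
    print(f"length {n}: {len(strings)} possible words")

# Kraft's budget view: a codeword of length l costs b**(-l) of a unit budget.
# Two length-1 binary codewords already cost 1/2 + 1/2 = 1, so no further
# codewords can be added to a prefix-free code.
lengths = [1, 1]
print("Kraft sum:", sum(b ** (-l) for l in lengths))  # 1.0 -- budget exhausted
```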
For concreteness, consider that in technical disciplines, we often speak and think in terms of “short” concepts that would take a lot of time to explain to outsiders. For example, eigenvalues. The idea of an eigenvalue is “short” in the sense that we treat it as a basic conceptual unit, but “complicated” in the sense that it’s built out of a lot of prerequisite knowledge about linear transformations. Why couldn’t a mind create an arbitrary number of such conceptual “chunks”? Or if my model of what it means for a concept to be “short” is wrong, then what do you mean?
I note that my thinking here feels confused; this topic may be too advanced for me to discuss sanely.
On top of that, there’s this whole thing where people are constantly using social game theory to reason about which word choices do or don’t count as defecting against local norms, what the consequences would be of failing to punish non-punishers of people who use words in ways that differ from the socially privileged ones, et cetera. That makes a straight-up information-theoretic approach off-base for reasons beyond the straightforward ambiguities introduced by considering implicit utilities as well as probabilities. And that doesn’t even mention the heuristics and biases literature or neuroscience, which take the theoretical considerations and laugh at them.
Ah, I’m needlessly reinventing some aspects of the wheel.