Kindly comments on Open thread, Mar. 16 - Mar. 22, 2015

Kindly 23 Mar 2015 13:50 UTC
0 points
What sort of evidence about x do you expect to update on?
- Transfuturist 23 Mar 2015 20:51 UTC
  0 points
  The result of some built-in string function length(s), which, depending on the implementation of the string type, either returns the header integer stating the size or counts characters up to the terminator symbol and returns that count.
  - Kindly 24 Mar 2015 15:06 UTC
    0 points
    That doesn’t sound like something you’d need to do statistics on. Once you learn something about the string length, you basically just know it.
    
    Improper priors are not useful on their own: the point of using them is that you will get a proper distribution after you update on some evidence. In your case, after you update on some evidence, you’ll just have a point distribution, so it doesn’t matter what your prior is.
    - Transfuturist 24 Mar 2015 23:22 UTC
      0 points
      Not so. I’m trying to figure out how to find the maximum entropy distribution for simple types, and recursively defined types are a part of that. This applies not only to strings but to sequences of all sorts, and I’m attempting to allow for the possibility of error correction in these techniques. What is the point of doing statistics on coin flips? Once you learn something about the flip result, you basically just know it.
      - Kindly 24 Mar 2015 23:48 UTC
        0 points
        Well, in the coin flip case, the thing you care about learning isn’t the value in {Heads, Tails} of a coin flip, but the value in [0,1] of the underlying probability that the coin comes up heads. We can then put an improper prior on that underlying probability, with the idea that after observing coin flips, we update it to a proper posterior.
        
        Similarly, here you could define a family of distributions of string lengths, and have a prior (improper or otherwise) about which distribution in the family you’re working with. For example, you could assume that the length of a string is distributed as a Geometric(p) variable for some unknown parameter p, and then sampling a single string gives you some evidence about what p might be.
        
        Having an improper prior on the length of a single string, on the other hand, only makes sense if you expect to gain (and update on) partial evidence about the length of that string.
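
The Geometric(p) suggestion above can be sketched in Python. This is only an illustration under assumptions the thread does not state: the length is taken to live on {0, 1, 2, ...} with P(L = n) = (1 - p)^n * p (so p plays the role of a per-position "terminator" probability), the prior on p is taken from the conjugate Beta family, and the improper prior is the a = b = 0 limit, proportional to 1 / (p * (1 - p)). The names `BetaParams` and `update_on_lengths` are hypothetical helpers, not anything from the discussion.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BetaParams:
    """Beta(a, b) density on p, proportional to p**(a - 1) * (1 - p)**(b - 1)."""
    a: float
    b: float

    def is_proper(self) -> bool:
        # The density has a finite integral only when both parameters are positive.
        return self.a > 0 and self.b > 0

    def mean(self) -> float:
        if not self.is_proper():
            raise ValueError("mean is undefined for an improper distribution")
        return self.a / (self.a + self.b)


def update_on_lengths(prior: BetaParams, lengths: Sequence[int]) -> BetaParams:
    """Conjugate update for lengths with P(L = n) = (1 - p)**n * p (assumed model).

    Each observed string contributes one terminator (a "success") and
    n ordinary characters ("failures") to the Beta parameters.
    """
    if any(n < 0 for n in lengths):
        raise ValueError("string lengths must be non-negative")
    return BetaParams(prior.a + len(lengths), prior.b + sum(lengths))


# Assumed improper prior: the a = b = 0 limit of the Beta family.
improper_prior = BetaParams(0.0, 0.0)

# Observe a single string and update the belief about p.
posterior = update_on_lengths(improper_prior, [len("hello")])
```

Under these assumptions, one string of length 5 turns the improper prior into Beta(1, 5), which is proper and has mean 1/6. A single empty string would give Beta(1, 0), which is still improper, so whether one sample suffices depends on both the prior chosen and the observation.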