So we’re automatically giving $x_1$ ca. $2^{1000}$ times higher probability – even before applying the length penalty $\frac{1}{2^{|p|}}$.
But note that under the Solomonoff prior, you will get another $2^{-2000-|G|}$ penalty for each of these programs with DEADCODE. So with this consideration, the weight changes from
$2^{-1000}$ (for the normal $p_1$) to $2^{-1000}(1+2^{-|G|})$ (the normal $p_1$ plus the $2^{1000}$ DEADCODE versions of $p_1$), which is not a huge change.
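A quick sanity check of that bookkeeping, as a sketch in Python with exact arithmetic (the guard length $|G| = 10$ bits is a made-up illustrative value; the identity holds for any $|G|$):

```python
from fractions import Fraction

# Illustrative stand-ins: |p1| = 1000 bits, and an assumed |G| = 10 bits
# for the "skip the dead code" wrapper (the exact value of |G| doesn't matter).
P_LEN = 1000
G_LEN = 10

weight_p1 = Fraction(1, 2**P_LEN)                          # 2^-1000 for the normal p1
weight_per_deadcode = Fraction(1, 2**(2 * P_LEN + G_LEN))  # 2^-(2000+|G|) per DEADCODE variant
num_deadcode = 2**P_LEN                                    # 2^1000 DEADCODE versions of p1

total = weight_p1 + num_deadcode * weight_per_deadcode
expected = weight_p1 * (1 + Fraction(1, 2**G_LEN))         # 2^-1000 * (1 + 2^-|G|)

assert total == expected
print(float(total / weight_p1))  # 1.0009765625: the 2^1000 variants add only a 2^-|G| sliver
```

So the whole cloud of DEADCODE variants contributes a factor of $1+2^{-|G|}$, not a factor of $2^{1000}$.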
For your case of “uniform probability until $10^{90}$” I think you are right about exponential decay.
Yes, my point here is mainly that the exponential decay seems almost baked into the setup even if we don’t explicitly set it up that way, not that the decay is very notably stronger than it looks at first glance.
Given how many words have been spilled arguing over the philosophical validity of putting the decay with program length into the prior, this seems kind of important?
The number of programs of length at most n increases exponentially with n. Therefore any probability measure over them must decrease at least exponentially with length. That is, exponential decay is the least possible penalisation of length.
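To spell that counting step out (under the simplifying assumption of a binary alphabet in which every bit string of length $n$ counts as a program, so there are exactly $2^n$ of them): if $\mu$ is any probability measure over programs and $\bar\mu(n)$ is the average weight it assigns to a program of length $n$, then

$$1 \;\ge\; \sum_{n\ge 0}\,\sum_{|p|=n}\mu(p) \;=\; \sum_{n\ge 0} 2^{n}\,\bar\mu(n) \quad\Longrightarrow\quad \bar\mu(n)\le 2^{-n}\ \text{for every }n,$$

so the typical weight at length $n$ cannot decay more slowly than $2^{-n}$.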
This is also true of the number of minimal programs of length at most n, hence the corresponding conclusion. (Proof: for each string S, consider the minimal program that writes S and halts. These programs are all different. Their sizes are no more than length(S)+c, where c is the fixed overhead of writing a program with S baked into it. Therefore exponentiality.)
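Written out, that injection gives (with $c$ the constant overhead above): the map $S \mapsto p_S$, where $p_S$ is the minimal program that writes $S$ and halts, sends distinct strings to distinct programs with $|p_S| \le |S| + c$, so

$$\#\{\text{minimal programs of length} \le n+c\} \;\ge\; \#\{S : |S| \le n\} \;=\; 2^{n+1}-1,$$

which grows exponentially in $n$.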
I’ve written “at most n” instead of simply “n”, to guard against quirks like a programming language in which all programs are syntactically required to e.g. have even length, or deep theorems about the possible lengths of minimal programs.
I think you are broadly right.