When algorithms have bad worst cases, these cases are often the inputs with a particular structure. For example, if you use quicksort without randomising your pivots, then you have an average-case complexity of n log(n) but a worst-case complexity of n^2. The worst inputs are the ones where the list is already partially sorted. "Partially sorted" is the kind of structure I'm talking about.
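To make the quicksort example concrete, here is a minimal sketch (the function names and the comparison-counting convention are just illustrative) comparing a fixed first-element pivot with a randomised pivot on an already-sorted list:

```python
import random

def quicksort(xs, choose_pivot):
    """Return (sorted copy of xs, number of element-vs-pivot comparisons)."""
    if len(xs) <= 1:
        return list(xs), 0
    i = choose_pivot(xs)
    pivot = xs[i]
    rest = xs[:i] + xs[i + 1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    sorted_left, cl = quicksort(left, choose_pivot)
    sorted_right, cr = quicksort(right, choose_pivot)
    # each element of `rest` is compared against the pivot once
    return sorted_left + [pivot] + sorted_right, cl + cr + len(rest)

already_sorted = list(range(400))  # the "structured" input that hits the worst case
_, fixed_cost = quicksort(already_sorted, lambda xs: 0)                         # first-element pivot
_, rand_cost = quicksort(already_sorted, lambda xs: random.randrange(len(xs)))  # randomised pivot
print(fixed_cost, rand_cost)  # roughly n^2/2 comparisons vs roughly n log(n)
```

On the sorted input the fixed pivot always splits off an empty half, so the comparison count grows quadratically, while the randomised pivot stays around n log(n).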
If we expected to see inputs with some kind of structure (but didn't know which kind), then we might well worry that we would get inputs with precisely the structure that hits our worst case.
So suppose we do indeed have a prior over our inputs, but that it is a Solomonoff prior. Among all inputs of length n, we expect to see each one with probability proportional to exp(-k), where k is its Kolmogorov complexity. Most strings of length n have complexity ~ n and hence probability ~ exp(-n). However, our worst-case string has complexity at most roughly m + log(n), where m is the length of our algorithm's code, by virtue of its description as "the worst case for that algorithm among the strings of length n". So we anticipate getting the worst-case input with probability ~ exp(-m)/n. Hence if our worst-case complexity is g(n), that single input already contributes ~ exp(-m) g(n)/n to the expected running time, so our average-case complexity is Ω(g(n)/n).
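Spelling that out as a chain of inequalities (writing the prior as proportional to exp(-K), as above, and suppressing additive constants):

$$
K(x_n^{\text{worst}}) \le m + \log n + O(1)
\;\Rightarrow\;
\Pr[x_n^{\text{worst}}] \gtrsim \frac{e^{-m}}{n}
\;\Rightarrow\;
\mathbb{E}[\text{running time}] \ge \Pr[x_n^{\text{worst}}]\cdot g(n) = \Omega\!\left(\frac{g(n)}{n}\right).
$$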
Under a Solomonoff prior the worst cases and average cases are the same! (up to a polynomial)
EDIT: So if there are cases where randomisation gives an exponentially better worst case complexity, then we won’t be able to derandomise them under the Solomonoff prior.
reference
Wow, they say exactly the same things I do, right down to using quicksort as an example.
I assumed you were aware of that paper. If you came up with that result independently, then kudos to you ;)
It’s a pretty canonical example actually. I immediately thought of quicksort’s worst case vs average case when I got to the end of the debate. It’s taught in many intro to algorithms courses. (But I sure didn’t know about the Kolmogorov complexity part!)
This is quite possibly the best LW comment I’ve ever read. An excellent point with a really concise explanation. In fact it is one of the most interesting points I’ve seen within Kolmogorov complexity too. Well done on independently deriving the result!
Thanks!