I think adding a little meta-probability will help.
Since there’s some probability of the sequence being “the most surprising”, this would basically mean that several of the most surprising end up with basically the same probability. For example, if it takes n bits of data to define “the most surprising m-bit sequence”, then there must be a 2^-n chance of that happening. Since there are 2^m sequences, and the most surprising sequence must have a probability of at most 2^-m, there must be at least 2^(m-n) most surprising sequences.
I think adding a little meta-probability will help.
Since there’s some probability of the sequence being “the most surprising”, this would basically mean that several of the most surprising end up with basically the same probability. For example, if it takes n bits of data to define “the most surprising m-bit sequence”, then there must be a 2^-n chance of that happening. Since there are 2^m sequences, and the most surprising sequence must have a probability of at most 2^-m, there must be at least 2^(m-n) most surprising sequences.