I don’t get paid on the basis of Omega’s prediction given my action. I get paid on the basis of my action given Omega’s prediction. I at least need to know the base-rate probability with which I actually one-box (or two-box), although with only two minutes, I would probably need to know the base rate at which Omega predicts that I will one-box. Actually, just getting the probability for each of P(Ix|Ox) and P(Ix|O~x) would be great.
I also don’t have a readily available mechanism for determining whether 1033 is prime without getting hit by a trolley (with what probability do I get hit by the trolley, incidentally?), nor do I know offhand what the ratio of odd primes to odd composites is.
I don’t quite have enough information to solve the problem in any sort of respectable fashion. So what the heck, I two-box and hope that Omega is right and that the number is composite. But if it isn’t, then I cry into my million dollars. (With P(.1): I don’t expect to actually be sad winning $1M, especially after having played several thousand times and presumably having won at least some money in that period.)
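For what it's worth, away from the trolley and the two-minute limit, both open questions above are cheap to check. A minimal sketch, assuming plain trial division is acceptable for numbers this small. The helper names and the upper bound of 2000 are my own choices for illustration, not part of the problem statement, and the second function reports the share of odd numbers that are prime rather than the prime-to-composite ratio itself.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division: test odd divisors up to the integer square root of n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def odd_prime_fraction(limit: int) -> float:
    """Fraction of odd numbers in [3, limit] that are prime.

    Starts at 3 because 1 is neither prime nor composite.
    The choice of `limit` is an assumption; the problem never gives a range.
    """
    odds = range(3, limit + 1, 2)
    primes = sum(1 for n in odds if is_prime(n))
    return primes / len(odds)


print(is_prime(1033))  # → True
print(odd_prime_fraction(2000))
```

So 1033 is in fact prime, and the fraction printed on the second line gives a rough base rate for "a randomly chosen odd number below 2000 is prime", which is only useful if the numbers in the game are drawn from something like that range.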
One could judge the strength of these with a few empirical tests. For (2), compare industries where it is clear that the skills learned in college (or in a particular major) are particularly relevant against industries where it is not as clear, and compare the number of college grads with the relevant skill-signals vs. college grads without them vs. non-college grads. For (3), look to industries where signals of pre-existing ability in that industry do not conform to being in college, and compare their rates of hiring grads vs. non-grads. (These would presumably be jobs in sectors where some sort of loosely defined intellectual ability is not as important. Such jobs are becoming scarcer due to automation, in First World countries in particular, but the tests should still be possible.) (1) is harder to test, as it is agnostic, but seeing how these intuitions conform to those of people in hiring positions could be informative. Other signals, as mentioned in the comments, probably have their own tests that could be run on them.