I’ve decided to run a pseudorandomness contest — a reverse Turing test of sorts, if you will. Winners will get to send some of my money to a charity of their choice! Here’s how it’s going to work:
Round 1: Pseudorandom number generation
Should you choose to take part, you will have 10 minutes to come up with a sequence of 150 bits (i.e. zeros and ones). The deadline for doing so is Saturday, December 19th 11:59 ET. You will use the form linked at the bottom of this post for submission. (But read the rest of this post first.)
You may not use a computer or any other resources. For example, you may not use a watch to try to generate random numbers (though you can use one to keep track of the time remaining). No using books, or calculators, or deriving random bits from cracks in the ceiling, etc. The only exception is: you may use Notepad (or something similar) to write down your sequence of bits as you come up with them (but not for scratchwork — everything besides writing down your bits must be done in your head). I recommend this website so you can keep track of how many bits you have. (You may not use a counter that tells you how many of each bit you have.) You may use anything you currently have stored inside your head. You may not memorize anything in advance of participating, though you are allowed to think in advance about general strategy (without writing anything down). [Edit 1: Please do not look up strategies; any thinking you do in advance should be your own.] [Edit 2: Think about the rules this way—ideally you’d be doing this in a uniformly blank room with a computer where the only things you can do are type 0, type 1, backspace, and go to some point in the string you have already typed, as well as see how much time you have left and how many bits you have already typed.]
Round 2: Distinguishing real and fake randomness
If there are n entries in Round 1, I will use a computer to come up with n sequences of 150 random bits. I will post all 2n sequences in a random order in a Google sheet. You will have as much time as you want (subject to a deadline — probably a week or so after I post the Google sheet) to try to figure out which sequences were generated by a human and which by a computer. Then, for every sequence you will submit a probability that that sequence was made by a computer (i.e. “truly” random).
You may use any resources you want to complete this task, e.g. write a computer program or do research on the Internet. The only restriction is: you may not use a program written by someone else for this (or a similar) purpose. You also may not collaborate with other contest participants. (These details are subject to change, but this will be the basic format.)
Grading
Participants in Round 2 will be graded using Brier’s quadratic scoring rule, which means that they will be incentivized to report their true probabilities. Participants in Round 1 will be graded based on the average probability-of-being-generated-by-a-computer assigned to their string by Round 2 participants. The higher this number, the better a participant did.
Prizes
Prizes will be of the form “I will donate $X to a charitable organization of your choice”, subject to my approval (e.g. I won’t donate to the Trump Foundation). At least $75 will be donated to charity through this contest, and at least $150 (possibly more) if both rounds end up with 10 or more participants. (This means that by participating, you’re causing more money to go to charity in expectation!) I haven’t decided exactly how that will be distributed, but my inclination is to give more prize money to Round 2 winners, since that round is more time-consuming.
More details
People may participate in both rounds. In fact, participants in Round 1 will have a small advantage in Round 2, since they will be able to say 0% for the string they submitted.
To make the contest incentive-compatible, in Round 2 you will be able to indicate which string from Round 1 is yours. The probability you submit for that string (i.e. probably 0%) will not count toward your Round 1 grade (so your Round 2 submission doesn’t penalize your Round 1 submission).
Probabilities in Round 2 will be normalized in some way to add to n, so that there isn’t an incentive to say 0 to everything if you’re really keen on winning Round 1.
In Round 1, the rules allow you to e.g. use the digits of Pi to come up with random numbers. You’re allowed to do this, but I advise you against it because Round 2 participants will be on the lookout!
How to participate
Use this form! (I ask for your email address so you can get a receipt with your submission, so I can let you know when Round 2 is available, and so I can contact you if you win a prize. Email addresses will be kept private.)
Questions?
Comment below my blog post!
Just to clarify, I am not allowed to look at the bits I’ve already written down in my notepad in order to derive something from them? Ideally, I would be sitting in an absolutely uniform room with only 2 buttons “0” and “1″ and pressing them for 150 times. Right?
It is okay for you to go back and change or insert bits. Ideally you are in a uniform room with the buttons 0, 1, switch to a point of your choice in the string you have typed, and backspace. Thanks for clarifying!
This confused me because the bits you have already written down are displayed on the recommended website https://www.charactercountonline.com/. Since bit reuse is forbidden, please disqualify my entry.
I’m not sure I understand; could you clarify? If you’re saying that the number of characters you’ve typed is displayed, that’s okay—that’s why I recommended that one. (I suppose this makes my latest comment not quite accurate but the point of that is just so you know when to stop.)
My guess is that your are not breaking the rules as I intended to state them. Or if you think you did, perhaps do it again without breaking the rules and submit with a comment saying to count the later entry?
I mean that I re-read the characters I had already typed and used that information to generate new characters. Do the rules allow this?
Yeah, this is okay. But, something that wouldn’t be okay is writing some math or whatever on the side as part of calculating more bits. You can look at bits you’ve typed, but any computation you do with them must be in your head.
Thank you for clearing things up. Please keep my entry in the competition.
Note for people considering, on the official submission form each line has 30 characters (at least, on my phone it did), so you can just fill up 5 lines (or, in my case, delete about 50 extra bits after copy/pasting from my note app)
Too bad that I didn’t see the post in time! I would have loved to participate. I have a mental mechanism to generate arbitrary amounts of random numbers without any external help—except for a small two-digit seed. I tried it on this task and it turns out it is not fast enough. I could only generate 120 bits in ten minutes. I continued the process and here is my sequence:
11010000 11100110 10011101 10101101 01001001 00111011 01001000 10000110
00100011 11000111 11010011 11010000 11110110 11011101 11110000 10110111
10011010 00110100 011111
I’m curious how it compares.
My scoring system would have assigned this string a 68% chance of being truly random, scoring it above 73% of the other strings.
Thank you Measure. I think that is an OK result for zero preparation but I expected it to do better given that is it based on a small linear congruential generator + noise.
How would this pure sequence score?
100001010101110100011111001111001100111000111011010110111110001000011111100110100111011111011111010111000110111010110110111101101100101111101110011101
It was generated like this:
var a = “”; var n = 12;
for(i = 0; i < 38; i++) { n = (n*8+1)%49; a = a + (n%32).toString(2); }
console.log(a);
That one scores as only 58% random.
One of the main parts of my scoring system is a program that compares the relative frequency of various substrings of equal length, such as “0000” vs. “0101″. It calculates a score for each string using this method and then looks at what fraction of a large population of truly random strings have a lower score than the sample string. Your first string scored better than 40% of random strings on this metric, and your second string was better than only 1.4% (mostly because it has so many more ones than zeros − 57:93).
Keep in mind that these scores are assuming an equal mix of truly random and user-submitted strings, and the actual guesses depend on how good the other submissions are. If everyone submitted very strongly random strings, all my guesses would be around 50%. (My highest guess on any string was 82% random)
Thank you again. Makes sense that the second one has more zeros—because the original range is 0..48. I did compensate for that in my manual sequence by negating every second and I was very careless (a.k.a. noise) with how I put groups of bits together. I think I could arrive at a more random sequence but it would take much longer to generate.
Can I use computer to prepare for part 1, as long as I don’t try to memorize any bit sequences?
No, sorry—any thinking about strategy in advance must be your own thoughts. No external resources for that either.
So, we could use a pseudo random sequence that we happened to have already memorized before this contest was announced? (Eg pi, the Gettysburg address, 100 random digits that I happened to have memorized back in high school)
Yup, these are all okay!
I think generating a good random sequence is VERY HARD. I’ll be very interested to know what’s the best way to do it.
In round 2 when you generate random sequences, will all bits be independent identically distributed with p(0)=p(1)=0.5?
Yes.