So, I got nerd-sniped by this (together with my flatmate on P3) and we spent some delicious hours and days in various rabbit-holes. Posting some thoughts here.
First of all, a meta-comment (minor spoiler on the nature of best-known solutions):
The first puzzle got me into an optimization mindset—and it seems to be the right mindset for that problem—but the optimal solutions to P2 and P3 (which I missed before reading spoilers) are “crisp” or “a neat trick” rather that a result of whatever matches a (however clever) “optimization process” for me. (If this was intended, I really appreciate the subtlety of it, Scott!)
Puzzle 1
For P1, I got a bit over-board and found optimal* solutions for numbers up to 10000 (and some more, with some caveats—see below). The spoiler-tags just below do not contain the solutions, but spoilery notes on my approach, actual solutions are lower down.
By “optimal*” I mean definitely cheapest formulas among all formulas using only natural numbers smaller than 101000as intermediate values (and modulo any bugs in my code). You can find the code here and the solutions for all numbers 1..10000 here (obvious spoilers).
The largest intermediate value needed for all of those is below 10220. Note the code also finds optimal* formulas for many more values (~100 million on 50GB RAM), just does not report them by default.
I have ran the code to also consider intermediate values up to 1010000 and even 10100000 and did not find any cheaper solution for numbers up to 10000. (In those runs my code got to the optimal solutions for only 98% and 88% (respectively) of the numbers up to 10000, not finding any better solutions, then ran out of memory. I still consider it a good indication that much larger values are likely not useful for numbers up to 10000.) I would be really interested in constructions or intuitions why this may or may not be the case!
I know about at least one case of a formula using real numbers as intermediates that is (very likely) strictly better that any formula with only natural intermediate values. That example is due to Scott—leaving it up to him to disclose it or not.
My solution for 2021 and some notes on optimal* solutions for other numbers
2021 = (isqrt(isqrt((isqrt(isqrt(isqrt(isqrt((2 ** fact(fact(3))))))) // 2))) - (3 ** 3)) -- a cost-108 variation on 211−33 that uses a clever cost-12 formula for 211. This uses 5.5e216 as an intermediate value—the highest (up to small variations) encountered in formulas for values under 10000. This sub-formula is used in 149 of the 10000 solutions. My code prefers solutions with the smallest intermediate values used as a secondary criterion, so these large intermediate values are necessary for optimal* solutions.
Other encountered sub-formulas using large intermediate values: 7381 = (fact((2 + fact(5))) // (2 * fact(fact(5)))) (needs 1e203, used 10 times in 1..10000), 3782 = (fact(((2 ** fact(3)) - 2)) // fact((fact(5) // 2))) (needs 3e85, used once), 1189 = isqrt(((2 // 2) + (fact((fact(3) ** 2)) // fact((2 ** 5))))) (needs 4e41, used once), and 32768 = isqrt(isqrt(isqrt((2 ** fact(5))))) (needs 1e36, used 56 times). All other optimal* solutions for values up to 10000 only use intermediate values smaller than 1010.
The distribution of used intermediate values—only 3 between 1010 and 10100, 2 more under 10300, and no more under 10100000 -- is my main argument for all the found solutions for 1...10000 being likely optimal. Anything to the contrary would be intriguing indeed!
And a universal cost-8 solution if log was allowed:
N=−log2log2√√…√2 using N nested square-roots.
Puzzle 2
My (non-optimal) solution, more for reference and the algorithm that anything else. (“dN” denotes an N-sided fair die.)
After 2 dice rolls of dA, dB, partition the AB outcomes into N=2022 equally-sized sets, and C=ABmodN remaining outcomes. If you hit one of the N sets, you are done. If not, you can use the remaining information as already having thrown a dC and only throw one additional dice (if C>2) before you do another partition of the outcomes into N sets and a remainder (rather than doing a full restart). (Note that it is generally better to be left with a larger C when the probability of spillover is comparable.)
This can be represented by a recursive formula for EARN(A): “expected additional rolls if you already have thrown dA”, the puzzle result is then EAR2022(1). Rather than the full formula, here is a program that iteratively computes EAR. My result is 2.0000005791099, matching other similar solutions here. (See Vaniver’s comment for the optimal one)
Puzzle 3
While me and Adam did not find the best known solution for P3, we had a lot of fun trying to figure out various approaches to and properties of the problem. Here is a short summary, happy to share more details. The framing here is “emulating dN” (an N-sided fair dice).
There are some reductions between coins and emulated dice. (Note d1 is trivially available.)
Reduction 1: If you can emulate dA and dB, you can emulate dAB (trivial).
Reduction 2: If you can emulate dA, you can emulate dice of any divisor of A (trivial).
Reduction 3: If you can emulate dA and dB, you can emulate d(A+B) with additional A/(A+B)-coin (flip the A/(A+B)-coin, then emulate dA or dB). In particular, you can emulate d(A+1) using dA and one extra coin.
This then gives a few solutions to 2021 using only rational coins and a range of general solution schemas to explore:
Using k different coins, one can emulate dA where A is a number with k ones in its binary representation (start with 1, use R1 with d2 to append “0” binary digit and R3 to add “1″ binary digit) as well as all its divisors (R2). 2021 divides 68987912193=236+228+20, we use 3 coins: 1⁄2, 1⁄257, 1⁄68987912193. (One can look for low-binary-1 multiples of a number with binary residuals and some branching.)
Find a,b,c,d≥0 such that N|2a3b+2c3d (3 coins) or N|2a5b+2c5d (3 coins, 5=2^2+1) etc. 2021 divides 210527+1.
Yet another solution for 2021 is to go from needing d2021 to d2020 via a 1/2021-coin, then to d505 via d2, then notice that 505|250+1. This also implies some possible schemas.
I don’t know about a fully-general (and guaranteed to halt) algorithm using just R1-R3, since it is not clear how high you need to go in the multiples to split via R3. There may also be other reduction rules even for just rational coins. I would be curious about any thoughts in that direction!
And few more theoretical observations.
For any argument, you can assume any schema is “static”: it throws a fixed sequence of coins regardless of intermediate results. (Proof: Take an arbitrary decision tree with its leaves grouped into N groups of equal probability. Inserting a new coin-flip anywhere in the tree (branching into two identical copies of its former subtree) preserves the resulting groups-distribution. Obtain a tree with single-coin levels starting at the root.)
Any finite set of rational-probability coins of the form p=a/b with a natural and b odd (i.e. without a 1/2-coin) can’t emulate any dice (proof via expansion of all the probability terms with a common denominator in a static decision tree; exactly one of these terms is odd) (a simpler proof shows this for single rational a/b-coin for a/b≠1/2 using binomial expansion of (a+(b−a))k/bk).
Therefore using only rational coins, the only 1-coin-expressable dice are d(2k). I suspect that the only 2-coin-expressable dice are divisors of 2a+2b for some a,b≥0.
Any set of coins with transcendental and algebraically independent probabilities (i.e. numbers that are not roots of any multi-variate polynom with rational coefficients, e.g. {e,π}) can’t emulate any dice (proof using the “not-roots-of-any-rational-polynomial” definition on the expansion of leaf probabilities of a static decision tree).
So, I got nerd-sniped by this (together with my flatmate on P3) and we spent some delicious hours and days in various rabbit-holes. Posting some thoughts here.
First of all, a meta-comment (minor spoiler on the nature of best-known solutions):
The first puzzle got me into an optimization mindset—and it seems to be the right mindset for that problem—but the optimal solutions to P2 and P3 (which I missed before reading spoilers) are “crisp” or “a neat trick” rather that a result of whatever matches a (however clever) “optimization process” for me. (If this was intended, I really appreciate the subtlety of it, Scott!)
Puzzle 1
For P1, I got a bit over-board and found optimal* solutions for numbers up to 10000 (and some more, with some caveats—see below). The spoiler-tags just below do not contain the solutions, but spoilery notes on my approach, actual solutions are lower down.
By “optimal*” I mean definitely cheapest formulas among all formulas using only natural numbers smaller than 101000as intermediate values (and modulo any bugs in my code). You can find the code here and the solutions for all numbers 1..10000 here (obvious spoilers).
The largest intermediate value needed for all of those is below 10220. Note the code also finds optimal* formulas for many more values (~100 million on 50GB RAM), just does not report them by default.
I have ran the code to also consider intermediate values up to 1010000 and even 10100000 and did not find any cheaper solution for numbers up to 10000. (In those runs my code got to the optimal solutions for only 98% and 88% (respectively) of the numbers up to 10000, not finding any better solutions, then ran out of memory. I still consider it a good indication that much larger values are likely not useful for numbers up to 10000.) I would be really interested in constructions or intuitions why this may or may not be the case!
I know about at least one case of a formula using real numbers as intermediates that is (very likely) strictly better that any formula with only natural intermediate values. That example is due to Scott—leaving it up to him to disclose it or not.
My solution for 2021 and some notes on optimal* solutions for other numbers
2021 = (isqrt(isqrt((isqrt(isqrt(isqrt(isqrt((2 ** fact(fact(3))))))) // 2))) - (3 ** 3))
-- a cost-108 variation on 211−33 that uses a clever cost-12 formula for 211. This uses 5.5e216 as an intermediate value—the highest (up to small variations) encountered in formulas for values under 10000. This sub-formula is used in 149 of the 10000 solutions. My code prefers solutions with the smallest intermediate values used as a secondary criterion, so these large intermediate values are necessary for optimal* solutions.Other encountered sub-formulas using large intermediate values:
7381 = (fact((2 + fact(5))) // (2 * fact(fact(5))))
(needs 1e203, used 10 times in 1..10000),3782 = (fact(((2 ** fact(3)) - 2)) // fact((fact(5) // 2)))
(needs 3e85, used once),1189 = isqrt(((2 // 2) + (fact((fact(3) ** 2)) // fact((2 ** 5)))))
(needs 4e41, used once), and32768 = isqrt(isqrt(isqrt((2 ** fact(5)))))
(needs 1e36, used 56 times). All other optimal* solutions for values up to 10000 only use intermediate values smaller than 1010.The distribution of used intermediate values—only 3 between 1010 and 10100, 2 more under 10300, and no more under 10100000 -- is my main argument for all the found solutions for 1...10000 being likely optimal. Anything to the contrary would be intriguing indeed!
And a universal cost-8 solution if log was allowed:
N=−log2log2√√…√2 using N nested square-roots.
Puzzle 2
My (non-optimal) solution, more for reference and the algorithm that anything else. (“dN” denotes an N-sided fair die.)
After 2 dice rolls of dA, dB, partition the AB outcomes into N=2022 equally-sized sets, and C=AB mod N remaining outcomes. If you hit one of the N sets, you are done. If not, you can use the remaining information as already having thrown a dC and only throw one additional dice (if C>2) before you do another partition of the outcomes into N sets and a remainder (rather than doing a full restart). (Note that it is generally better to be left with a larger C when the probability of spillover is comparable.)
This can be represented by a recursive formula for EARN(A): “expected additional rolls if you already have thrown dA”, the puzzle result is then EAR2022(1). Rather than the full formula, here is a program that iteratively computes EAR. My result is 2.0000005791099, matching other similar solutions here. (See Vaniver’s comment for the optimal one)
Puzzle 3
While me and Adam did not find the best known solution for P3, we had a lot of fun trying to figure out various approaches to and properties of the problem. Here is a short summary, happy to share more details. The framing here is “emulating dN” (an N-sided fair dice).
There are some reductions between coins and emulated dice. (Note d1 is trivially available.)
Reduction 1: If you can emulate dA and dB, you can emulate dAB (trivial).
Reduction 2: If you can emulate dA, you can emulate dice of any divisor of A (trivial).
Reduction 3: If you can emulate dA and dB, you can emulate d(A+B) with additional A/(A+B)-coin (flip the A/(A+B)-coin, then emulate dA or dB). In particular, you can emulate d(A+1) using dA and one extra coin.
This then gives a few solutions to 2021 using only rational coins and a range of general solution schemas to explore:
Using k different coins, one can emulate dA where A is a number with k ones in its binary representation (start with 1, use R1 with d2 to append “0” binary digit and R3 to add “1″ binary digit) as well as all its divisors (R2). 2021 divides 68987912193=236+228+20, we use 3 coins: 1⁄2, 1⁄257, 1⁄68987912193. (One can look for low-binary-1 multiples of a number with binary residuals and some branching.)
Find a,b,c,d≥0 such that N|2a3b+2c3d (3 coins) or N|2a5b+2c5d (3 coins, 5=2^2+1) etc. 2021 divides 210527+1.
Yet another solution for 2021 is to go from needing d2021 to d2020 via a 1/2021-coin, then to d505 via d2, then notice that 505|250+1. This also implies some possible schemas.
I don’t know about a fully-general (and guaranteed to halt) algorithm using just R1-R3, since it is not clear how high you need to go in the multiples to split via R3. There may also be other reduction rules even for just rational coins. I would be curious about any thoughts in that direction!
And few more theoretical observations.
For any argument, you can assume any schema is “static”: it throws a fixed sequence of coins regardless of intermediate results. (Proof: Take an arbitrary decision tree with its leaves grouped into N groups of equal probability. Inserting a new coin-flip anywhere in the tree (branching into two identical copies of its former subtree) preserves the resulting groups-distribution. Obtain a tree with single-coin levels starting at the root.)
Any finite set of rational-probability coins of the form p=a/b with a natural and b odd (i.e. without a 1/2-coin) can’t emulate any dice (proof via expansion of all the probability terms with a common denominator in a static decision tree; exactly one of these terms is odd) (a simpler proof shows this for single rational a/b-coin for a/b≠1/2 using binomial expansion of (a+(b−a))k/bk).
Therefore using only rational coins, the only 1-coin-expressable dice are d(2k). I suspect that the only 2-coin-expressable dice are divisors of 2a+2b for some a,b≥0.
Any set of coins with transcendental and algebraically independent probabilities (i.e. numbers that are not roots of any multi-variate polynom with rational coefficients, e.g. {e,π}) can’t emulate any dice (proof using the “not-roots-of-any-rational-polynomial” definition on the expansion of leaf probabilities of a static decision tree).