I know that prediction markets don’t really work in this domain (apocalypse markets are equivalent to loans), but what if we tried to approximate Solomonoff induction via a code golfing competition?
That is, we take a bunch of signals related to AI capabilities and safety (investment numbers, stock prices, ML benchmarks, number of LW posts, posting frequency or embedding vectors of various experts’ Twitter accounts, etc.) and hold a collaborative competition to find the smallest program that generates this data. (You could also allow programs to output probabilities sequentially rather than exact values, at a penalty of log_{1/2} of the overall likelihood, i.e. −log₂(likelihood), bits.) Contestants are encouraged to modify or combine other entries (thus ensuring there are no unnecessary special cases hiding in the code).
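For concreteness, here is a minimal sketch of how an entry might be scored under that rule. The function name, the 8-bits-per-byte length measure, and the log₂-likelihood input format are all illustrative assumptions, not a worked-out contest spec:

```python
def score_entry(program_source: str, log2_likelihood: float) -> float:
    """Score a contest entry in bits (lower is better). Hypothetical scoring rule.

    program_source  -- the entry's source code
    log2_likelihood -- log2 of the probability the program assigns to the
                       full observed data sequence; 0.0 for a deterministic
                       program that reproduces the data exactly
    """
    # Crude length measure: 8 bits per byte of source (an assumption;
    # a real contest would fix a reference encoding).
    length_bits = 8 * len(program_source.encode("utf-8"))
    # Probabilistic-output penalty: log_{1/2}(likelihood) = -log2(likelihood) bits.
    likelihood_penalty = -log2_likelihood
    return length_bits + likelihood_penalty
```

A deterministic entry that prints the data exactly has likelihood 1, so its penalty is zero and its score is just its length; a probabilistic entry trades extra likelihood-penalty bits for shorter code.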
By analyzing such a program, we would get a very precise model of the relationships between the variables, and might even be able to extract causal relationships.
(Really pushing the idea: you could also include human population in the data, and we all agree to a joint policy that maximizes the probability of the “population never hits 0” event. This might be stretching how precise a model we can code-golf, though.)
Technically, taking a weighted average of the entries (weighting each program of length ℓ bits by 2^(−ℓ)) would be closer to Solomonoff induction, but the mixture is basically dominated by the smallest program.
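To see how sharply the smallest program dominates, here is a small sketch of those 2^(−score) mixture weights (the bit counts are made-up numbers); a 5-bit length gap already costs a factor of 32 in weight:

```python
def mixture_weights(scores_bits: list[float]) -> list[float]:
    """Normalized weights 2^(-score) over the entries. Subtracting the
    minimum score first avoids floating-point underflow at large bit counts."""
    m = min(scores_bits)
    raw = [2.0 ** (m - s) for s in scores_bits]
    total = sum(raw)
    return [w / total for w in raw]

# Three hypothetical entries scoring 1000, 1005, and 1020 bits.
print(mixture_weights([1000.0, 1005.0, 1020.0]))
# -> roughly [0.9697, 0.0303, 0.0000009]: the shortest entry carries ~97% of the weight.
```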