I would say that they are motivated by the same basic idea, but are applied to different problems. The MDL (or the closely-related BIC) is a method for model selection given a dataset, whereas surprise accounting is a method for evaluating heuristic explanations, which don’t necessarily involve model selection.
Take the Boolean circuit worked example: what is the relevant dataset? Perhaps it is the 256 (input, TRUE) pairs. But the MDL would select a much simpler model, namely the circuit that ignores the input and outputs TRUE (or “x_1 OR (NOT x_1)” if it has to consist of AND, OR and NOT gates). A heuristic explanation, on the other hand, is not interested in choosing a simpler model; it is interested in explaining why the model we have been given behaves the way it does.
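To make the model-selection point concrete, here is a rough sketch of a two-part description length on that dataset. The candidate circuits, the gate counts, and the per-gate and per-error bit costs are all assumptions chosen for illustration; they are not taken from the post, and a real MDL code would be set up more carefully.

```python
from itertools import product

N_INPUTS = 8  # 2**8 = 256 inputs, matching the 256 (input, TRUE) pairs

# The dataset: every 8-bit input is labelled TRUE.
dataset = [(bits, True) for bits in product([False, True], repeat=N_INPUTS)]

# Hypothetical candidate circuits, each with a hand-counted number of gates.
def tautology(x):
    return x[0] or (not x[0])  # 1 NOT + 1 OR = 2 gates

def bigger_tautology(x):
    # OR together (x_i OR NOT x_i) for every i: 8 NOT + 8 OR + 7 OR = 23 gates
    return any(xi or (not xi) for xi in x)

def and_all(x):
    return all(x)  # 7 AND gates; outputs FALSE on 255 of the 256 inputs

candidates = [
    ("x_1 OR (NOT x_1)", 2, tautology),
    ("bigger tautology", 23, bigger_tautology),
    ("AND of all inputs", 7, and_all),
]

BITS_PER_GATE = 8          # assumed cost of describing one gate
BITS_PER_ERROR = N_INPUTS  # assumed cost of naming one misclassified input

def description_length(n_gates, fn):
    # Two-part code: bits for the model plus bits for the data given the model.
    errors = sum(fn(x) != y for x, y in dataset)
    return n_gates * BITS_PER_GATE + errors * BITS_PER_ERROR

scores = {name: description_length(g, fn) for name, g, fn in candidates}
best = min(scores, key=scores.get)
print(best, scores)
```

Under these assumed costs, any circuit that fits the data perfectly pays only for its own size, so the smallest tautology wins regardless of which circuit we were actually handed. That is the sense in which MDL answers a different question from the one a heuristic explanation is asking.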
The heuristic explanations in the post do use a single prior over the set of circuits, which we also call a “reference class”. But we wish to allow explanations that use other reference classes, as well as explanations that combine multiple reference classes, and perhaps even explanations that use “subjective” reference classes that do not seem to correspond to any precise prior. These are the sorts of issues explored in the upcoming paper. Ultimately, though, a lot of our heuristic arguments and the surprise accounting for them remain somewhat ambiguous or informal.