This is actually one of the best comments I’ve seen on Less Wrong, especially this part:
Shannon information was discovered for the informal notion of surprise (with the assumption of independent identically distributed symbols from a known distribution). Bayesian decision theory was discovered for the informal notion of rationality (with assumptions like perfect deliberation and side-effect-free cognition). And Solomonoff induction was discovered for the informal notion of Occam’s razor (with assumptions like a halting oracle and a taken-for-granted choice of universal machine). These simple conceptual cores can then be used to motivate and evaluate less-simple approximations for situations where the assumptions about the decision-maker don’t perfectly apply. For the AI safety problem, the informal notions (for which the mathematical core descriptions would need to be discovered) would be a bit more complex—like the “how to figure out what my designers would want to do in this case” idea above.
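To make the first of those examples concrete (standard definitions, stated here just for illustration): the formal core behind “surprise” is surprisal, whose expectation is the Shannon entropy,

$$I(x) = -\log_2 p(x), \qquad H(p) = \mathbb{E}[I(x)] = -\sum_x p(x)\,\log_2 p(x),$$

where $x$ is a symbol drawn i.i.d. from the known distribution $p$. The simple core (rare symbols are more surprising, and entropy is the average surprise) is exact only under those assumptions, which is what licenses the approximations used when they don’t hold.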
Thanks for the clear explanation.