paulfchristiano comments on Prize for probable problems

paulfchristiano 9 Mar 2018 6:31 UTC
14 points
Some examples that come to mind:
- This comment of yours changed my thinking about security amplification by cutting off some lines of argument, and it forced me to lower my overall goals (though it is simple enough that it feels like it should have been clear in advance). I believe the scheme overall survives, as I discussed at the workshop, but in a slightly different form.
- This post by Jessica both does a good job of overviewing some concerns and makes a novel argument (if the importance weight is slightly wrong then you totally lose) that leaves me very skeptical about any importance-weighting approach to fixing Solomonoff induction, which in turn leaves me more skeptical about “direct” approaches to benign induction.
- In this post I listed implicit ensembling as an approach to robustness. Between Jessica’s construction described here and discussions with MIRI folk, who argued persuasively that the number of extra bits needed to get honesty was reasonably large, such that even a good KWIK bound would be mediocre (partially described by Jessica here), I ended up pessimistic.
None of these posts use heavy machinery.