The results:
$5k for this post by Wei_Dai, along with the preceding/following discussion, which raises some points about the difficulty of learning corrigibility in small pieces.
$3k for Point 1 from this comment by eric_langlois, an intuition pump for why security amplification is likely to be more difficult than you might think.
$2k for this post by William_S, which clearly explains a consideration / design constraint that would make people less optimistic about my scheme. (This fits under “summarizing/clarifying” rather than novel observation.)
Thanks to everyone who submitted a criticism! Overall I found this process useful for clarifying my own thinking (and highlighting places where I could make it easier to engage with my research by communicating more clearly).
Can you link this comment from the OP? I skimmed the whole thread looking for info on who won prizes and managed to miss this on my first pass.