TurnTrout comments on ELK Proposal: Thinking Via A Human Imitator

TurnTrout 25 Feb 2022 19:13 UTC
LW: 2 AF: 2
I do think this is what happens given the current architecture. I argued that the desired outcome solves narrow ELK as a sanity check, but I’m not claiming that the desired setup is uniquely loss-minimizing.
Part of my original intuition was “the human net is set up to be useful at predicting given honest information about the situation”, and “pressure for simpler reporters will force some kind of honesty, but I don’t know how far that goes.” As time passed, I became increasingly aware that this wasn’t the best or only way for the human net to help with prediction, and I turned toward searching for a crisp counterexample.