TurnTrout comments on Optimal play in human-judged Debate usually won’t answer your question

TurnTrout 27 Jan 2021 16:09 UTC
LW: 3 AF: 2
I expect optimal play would be approachable via gradient descent in most contexts. With k bits, you can slide pretty smoothly from using all k as a direct answer to using all k to provide high value information, one bit at a time. In fact, I expect there are many paths to ignorance.
This seems off; presumably, gradient descent isn’t being performed on the bits of the answer provided, but on the parameters of the agent which generated those bits.
- Joe Collman 27 Jan 2021 16:56 UTC
  LW: 3 AF: 2
  Oh yes—I didn’t mean to imply otherwise.
  My point is only that there’ll be many ways to slide an answer pretty smoothly between [direct answer] and [useful information]. Splitting into [Give direct answer with (k − x) bits] [Give useful information with x bits] and sliding x from 0 to k is just the first option that occurred to me.
  In practice, I don’t imagine the path actually followed would look like that. I was just sanity-checking by asking myself whether a discontinuous jump is necessary to get to the behaviour I’m suggesting: I’m pretty confident it’s not.
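
As a rough illustration of the split described in the reply above, here is a minimal Python sketch that enumerates the path from [all k bits as direct answer] to [all k bits as useful information]. The function name `split_budget` and the choice k = 4 are hypothetical, introduced only for this example. It shows the allocation of answer bits only; as noted in the thread, gradient descent would act on the parameters of the agent producing the answer, not on these bits, so this is a sanity check that no discontinuous jump is needed rather than a model of training.

```python
def split_budget(k):
    """Yield (direct_bits, info_bits) for each x from 0 to k.

    Hypothetical helper: direct_bits = k - x bits go to the direct answer,
    info_bits = x bits go to useful information.
    """
    for x in range(k + 1):
        yield (k - x, x)


# Assumed example budget of k = 4 bits.
path = list(split_budget(4))

for direct_bits, info_bits in path:
    print(f"direct answer: {direct_bits} bits, useful information: {info_bits} bits")
```

Each step along the path moves exactly one bit from the direct answer to useful information, which is the sense in which the slide is smooth.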