I expect optimal play would be approachable via gradient descent in most contexts. With k bits, you can slide pretty smoothly from using all k as a direct answer to using all k to provide high value information, one bit at a time. In fact, I expect there are many paths to ignorance.
This seems off; presumably, gradient descent isn’t being performed on the bits of the answer provided, but on the parameters of the agent which generated those bits.
This seems off; presumably, gradient descent isn’t being performed on the bits of the answer provided, but on the parameters of the agent which generated those bits.
Oh yes, I didn’t mean to imply otherwise.
My point is only that there’ll be many ways to slide an answer pretty smoothly between [direct answer] and [useful information]. Splitting into [Give direct answer with (k - x) bits] [Give useful information with x bits] and sliding x from 0 to k is just the first option that occurred to me.
In practice, I don’t imagine the path actually followed would look like that. I was just sanity-checking by asking myself whether a discontinuous jump is necessary to get to the behaviour I’m suggesting: I’m pretty confident it’s not.
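To make the split above concrete, here is a toy Python sketch of that one path. It only enumerates how a k-bit budget could be divided as x slides from 0 to k; the function names are made up for illustration, and it says nothing about how gradient descent on the agent's parameters would actually move between these points.

```python
def split_budget(k: int, x: int) -> tuple[int, int]:
    """Return (direct_answer_bits, information_bits) for a k-bit answer.

    Hypothetical helper: x bits go to useful information, k - x to the direct answer.
    """
    if not 0 <= x <= k:
        raise ValueError("x must satisfy 0 <= x <= k")
    return k - x, x


def allocation_path(k: int) -> list[tuple[int, int]]:
    """List every allocation as x slides from 0 to k, one bit at a time."""
    return [split_budget(k, x) for x in range(k + 1)]


# Example with k = 4: each step moves exactly one bit from
# [direct answer] to [useful information].
path = allocation_path(4)
print(path)  # [(4, 0), (3, 1), (2, 2), (1, 3), (0, 4)]
```

Each neighbouring pair in the list differs by a single bit, which is all the sanity check needs: there is at least one path with no large jump in it.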