I don’t think that matters, because knowing all but the last bit, I can simply take two actions—action assuming last bit is true, and action assuming it’s false.
Not sure I’m following the setup and notation quite close enough to argue that one way or the other, as far as the order we’re saying the agent receives evidence and has to commit to actions. Above I was considering the simplest case of 1 bit evidence in, 1 bit action out, repeat.
I pretty sure that could be extended to get that one small key/update that unlocks the whole puzzle sort of effect and have the model click all at once. As you say though, not sure that gets to the heart of the matter regarding the bound; it may show that no such bound exists on the margin, the last piece can be much more valuable on the margin than all the prior pieces of evidence, but not necessarily in a way that violates the proposed bound overall. Maybe we have to see that last piece as unlocking some bounded amount of value from your prior observations.
I don’t think that matters, because knowing all but the last bit, I can simply take two actions—action assuming last bit is true, and action assuming it’s false.
Not sure I’m following the setup and notation quite close enough to argue that one way or the other, as far as the order we’re saying the agent receives evidence and has to commit to actions. Above I was considering the simplest case of 1 bit evidence in, 1 bit action out, repeat.
I pretty sure that could be extended to get that one small key/update that unlocks the whole puzzle sort of effect and have the model click all at once. As you say though, not sure that gets to the heart of the matter regarding the bound; it may show that no such bound exists on the margin, the last piece can be much more valuable on the margin than all the prior pieces of evidence, but not necessarily in a way that violates the proposed bound overall. Maybe we have to see that last piece as unlocking some bounded amount of value from your prior observations.