In this case, at every timestep we take the N most probable models, and only take an action a with probability p if **every** one of the N models takes that action with at least probability p.
This is so much clearer than I’ve ever put it.
(There’s a specific rule that ensures that N decreases over time.)
N won’t necessarily decrease over time, but all of the models will eventually agree with each other.
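For concreteness, here is a minimal sketch of that selection rule, assuming each model exposes a probability for each action; the interface and names (`conservative_policy`, `model_action_probs`) are illustrative, not taken from the paper.

```python
import numpy as np

def conservative_policy(model_action_probs, n):
    """Pointwise-minimum policy over the top-n models.

    model_action_probs: array of shape (num_models, num_actions), rows assumed
    already sorted so the first n rows are the n most probable models
    (an assumed interface for illustration).

    The agent takes action a with probability p only if every one of the n
    models takes a with at least probability p, i.e. p is the minimum across
    models. Any leftover mass (1 - sum of minima) is where the models
    disagree; it could be routed to deferring or doing nothing.
    """
    top = np.asarray(model_action_probs)[:n]
    agreed = top.min(axis=0)       # largest p such that every model acts with prob >= p
    leftover = 1.0 - agreed.sum()  # probability mass on which the models disagree
    return agreed, leftover


# Example with two models over three actions: the agent keeps only the
# probability that both models would assign to each action.
probs = [[0.6, 0.3, 0.1],
         [0.5, 0.4, 0.1]]
agreed, leftover = conservative_policy(probs, n=2)
# agreed -> [0.5, 0.3, 0.1], leftover -> ~0.1
```

As the models come to agree with each other, the disagreement mass shrinks and the agent behaves less conservatively, which is the tradeoff discussed above.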
monitor the performance of your system online, and train to correct any problems
I would have described Vanessa’s and my approaches as more about monitoring uncertainty, and avoiding problems before the fact rather than correcting them afterward. But I think what you said stands too.
N won’t necessarily decrease over time, but all of the models will eventually agree with each other.
Ah, right. I rewrote that paragraph, getting rid of that sentence and instead talking about the tradeoff directly.
I would have described Vanessa’s and my approaches as more about monitoring uncertainty, and avoiding problems before the fact rather than correcting them afterward. But I think what you said stands too.
Added a sentence to the opinion noting the benefits of explicitly quantified uncertainty.