I worry that using $O^D$ as the space of behaviors misses something important about the intuitive idea of robustness, making any conclusions about $F$ or $O^D$ or behavior manifolds harder to apply. A more natural space (to illustrate my point, not as something helpful for this post) would be $O^{\mathbb{R}^n}$, with a metric that cares about how outputs differ on inputs that fall within a particular base distribution $\gamma$, something like $d(g,h) = \mathbb{E}_{x \sim \gamma}\,|g(x) - h(x)|$.
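(A minimal sketch of what I mean, estimating this distance by Monte Carlo; this assumes numpy, scalar outputs, and a user-supplied sampler for $\gamma$, and the names are purely illustrative:)

```python
import numpy as np

def behavior_distance(g, h, sample_gamma, n_samples=10_000, seed=0):
    """Monte Carlo estimate of d(g, h) = E_{x ~ gamma} |g(x) - h(x)|."""
    rng = np.random.default_rng(seed)
    xs = sample_gamma(rng, n_samples)      # draw inputs from the base distribution gamma
    # assumes scalar outputs; for vector outputs, replace |.| with a norm
    return np.mean(np.abs(g(xs) - h(xs)))  # average pointwise output gap
```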
The issue with $O^D$ is that models in a behavior manifold only need to agree on the training inputs, so the manifold always includes models with arbitrarily crazy behavior on every input outside the dataset, even inputs very close to those in the dataset (which is what $\gamma$ above is supposed to prevent). So the behavior manifolds are more like cylinders than balls, ignoring crucial dimensions; a toy illustration follows below. And since generalization does work in practice, learning must tend to find very atypical points of these manifolds, so it is generally unclear how a behavior manifold as a whole is going to be relevant to what is actually going on.
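(To make the cylinder point concrete, here is a toy example continuing the hypothetical `behavior_distance` sketch above, with a 1-D input space: two models that agree exactly on every point of the dataset $D$, and hence lie on the same behavior manifold, can still be far apart under $d$.)

```python
# Two models on the same behavior manifold for D = {0, 1, ..., 9}:
# they agree exactly on every training input, but differ in between.
D = np.arange(10)

g = lambda x: 0.5 * x                        # a tame model
h = lambda x: 0.5 * x + np.sin(np.pi * x)    # equals g on D, oscillates off it

assert np.allclose(g(D), h(D))               # identical behavior on the dataset

# Under a base distribution gamma = Uniform(0, 9) the gap is large:
uniform_gamma = lambda rng, n: rng.uniform(0.0, 9.0, size=n)
print(behavior_distance(g, h, uniform_gamma))  # ~0.64 (2/pi in the limit)
```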
I agree that the space $O^D$ may well miss important concepts and perspectives. As I say, it is not my suggestion to look at it, but rather just something that was implicitly being done in another post. The space $O^{\mathbb{R}^n}$ may well be a more natural one. (It's of course the space of functions $\mathbb{R}^n \to O$, and so a space in which 'model space' naturally sits, in some sense.)