One thought I had was $\hat{V}(f_\theta)\,L(y, f_\theta(x)) + \lambda R(\theta)$, but this has the problem that the $L$ term vanishes when the volumes are compact. I'm still not sure whether the relationship between $\hat{V}$ and $L$ should be multiplicative or additive. I am fairly sure that $\hat{V}$ shouldn't directly affect $R$, as that would have lots of nasty failure modes (such as high complexities being justified by tiny, weird volumes).
edit: It should be additive; otherwise 0 total loss is achieved by classifying everything as unknown.
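A toy sketch of why the multiplicative form degenerates, under assumptions that are mine rather than anything fixed above: the volume estimate, data loss, and complexity are treated as plain scalars, "additive" is read as simply summing the volume and the data loss with equal weight, and the two candidate classifiers and their numbers are made up purely for illustration.

```python
def multiplicative_loss(volume, data_loss, complexity, lam=0.1):
    """V_hat * L + lambda * R, the form first proposed above."""
    return volume * data_loss + lam * complexity


def additive_loss(volume, data_loss, complexity, lam=0.1):
    """V_hat + L + lambda * R, one possible reading of 'additive' (an assumption)."""
    return volume + data_loss + lam * complexity


# Hypothetical classifier that labels everything "unknown":
# zero claimed volume, maximal data loss, trivial complexity.
all_unknown = dict(volume=0.0, data_loss=1.0, complexity=0.0)

# Hypothetical sensible classifier: modest volume, low data loss, some complexity.
reasonable = dict(volume=0.3, data_loss=0.1, complexity=1.0)

for name, candidate in [("all_unknown", all_unknown), ("reasonable", reasonable)]:
    print(
        name,
        "multiplicative:", multiplicative_loss(**candidate),
        "additive:", additive_loss(**candidate),
    )
```

With these illustrative numbers, the multiplicative objective scores the all-unknown classifier at exactly zero and so prefers it, while the additive objective still charges it the full data loss and prefers the sensible classifier. How the volume term should be weighted against the data loss in the additive form is left open here.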