Would the checks of the naturality conditions you have in mind be primarily empirical (e.g. sampling a bunch of data points and running some statistical independence checks), or might they just as often be mechanistic? I'm not sure how a mechanistic check would work for complex models like Llama, but for a Bayes net, for instance, you already have a factorization that makes robust checks of independence in the model much easier.
I ask because the idea of “in some model” (plus the desire for e.g. adversarial robustness) suggests to me that we’d want a more mechanistic way of telling whether the naturality conditions hold, even though they seem easier to check empirically.
That’s a big open question which we’re still figuring out.