I like the addition of the pseudo-equivalences; the graph seems a lot more accurate as a representation of my views once that’s done.
But it seems to me that there’s something missing in terms of acceptability.
The definition of “objective robustness” I used says “aligns with the base objective” (including off-distribution). But I think this isn’t an appropriate representation of your approach. Rather, “objective robustness” has to be defined as something like “generalizes acceptably”. Then, ideas like adversarial training and checks and balances make sense as part of the story.
WRT your suggestions, I think there’s a spectrum from “clean” to “not clean”, and the ideas you propose could fall at multiple points on that spectrum (depending on how they are implemented, how much theory backs them up, etc.). So, yeah, I favor “cleaner” ideas than you do, but that doesn’t rule out this path for me.
Yeah, strong +1.
Great! I feel like we’re making progress on these basic definitions.