because DP gives up performance for guaranteed generalization.
Is « guaranteed » important in your answer? E.g. do you know of some code that shows this is real in practice, or is it more of a theoretical result?
It’s well studied. I’m not an expert in differential privacy and would need to read multiple papers in depth to be sure I’d answered precisely, but I know that at an English-prose level of mathematical description, what’s guaranteed is that individual datapoints are definitely not memorized, so any successful performance on a test set is definitely generalization (there’s a concrete code sketch of this guarantee after the links below). That doesn’t mean it’s causal generalization, though. And the accuracy is usually worse: optimizing for capabilities alone usually pushes capabilities further than optimizing for both differential privacy and capabilities, I think. I’d have to go paper hunting to find how well it performs.

Instead of doing that, I’ll post my usual pitch: I strongly encourage you to do your level best to find some papers yourself, because that shit is not always trivial, and attempting and failing is a really good finding-stuff workout. Tools I’d recommend include (in order of recommendation): the wiki page on differential privacy; opening papers on semantic scholar → a result, and browsing forward through the citations they’ve received, especially sorted by latest (same link as “a result” but with sort); and maybe a metaphor.systems query on the topic (sign in required but worth it). Problem is, these results don’t directly answer your question without doing some reading; I’d suggest opening papers, skimming through them fast to see whether they answer it, and browsing the paper citation graph forwards and backwards towards titles that sound relevant. It might help to put the papers that seem relevant into papermap.xyz [edit: giving me 500 errors now] (via reddit u/Neabfi) or https://my.paperscape.org/ (which helps a lot with forward and backward browsing but has a jankier UI than papermap).

Some results from the metaphor query:
https://www.borealisai.com/research-blogs/tutorial-12-differential-privacy-i-introduction/
https://differentialprivacy.org/
https://opendp.org/about
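To make “guaranteed” concrete, here’s a minimal, self-contained sketch of the ε-DP definition using the classic Laplace mechanism on a counting query. The datasets, ε, and the threshold below are made up for illustration; the point is the theorem: for any two datasets differing in one record, the mechanism’s output distributions differ by at most a factor of e^ε, for every possible output event.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = 1.0  # privacy budget

def laplace_count(data, eps):
    """Counting query with Laplace noise. A count has sensitivity 1,
    so noise of scale 1/eps gives eps-differential privacy."""
    return np.sum(data) + rng.laplace(scale=1.0 / eps)

# Two "neighboring" datasets: identical except for one person's record.
d1 = np.array([1, 0, 1, 1, 0])
d2 = np.array([1, 0, 1, 1, 1])

# Sample the mechanism's output distribution on each dataset.
samples1 = np.array([laplace_count(d1, eps) for _ in range(100_000)])
samples2 = np.array([laplace_count(d2, eps) for _ in range(100_000)])

# Probability each mechanism assigns to an arbitrary event (output >= 3.5).
# The ratio is bounded by e^eps no matter which event you pick, which is
# exactly why the output can't reveal much about any single record.
p1 = np.mean(samples1 >= 3.5)
p2 = np.mean(samples2 >= 3.5)
print(f"P1={p1:.3f}  P2={p2:.3f}  ratio={p2/p1:.2f}  bound e^eps={np.exp(eps):.2f}")
```

With these numbers the empirical ratio comes out around 2.3, under the e^ε ≈ 2.72 bound, and the bound holds for every event, not just this one.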
I don’t think it does in general, and every case I can think of right now did not, but I agree that it is a worthwhile thing to worry about.
I’d add clicking through citations and references on arXiv, and looking at the Litmaps explorer on arXiv.
Thx for these links. I’ll need some time for a deeper reading, but after a few hours my first take-home message has probably settled to this: there’s no theoretical guarantee that pursuing DP (or, more precisely, any of DP’s numerous variants) leads to worse results, except that worse results are what everyone reports in practice, so if it’s a purely technical issue it probably won’t be trivial to fix.
I like your note that DP generalizability might not be causal generalizability: that’s both a contender for explaining why these seemingly technical difficulties arise and a potentially key thought for improving this strategy.
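On the “worse results in practice” point, here’s a rough sketch of how you’d measure the cost yourself, assuming the Opacus library for DP-SGD (the toy model, synthetic data, and hyperparameters are placeholders, not tuned). Train the same model once with the privacy wrapper and once without, and compare test accuracy: the per-sample gradient clipping and added noise that produce the guarantee are exactly what costs accuracy.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine  # pip install opacus

# Toy synthetic data so the script is self-contained.
torch.manual_seed(0)
X = torch.randn(2000, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

model = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# The PrivacyEngine enforces DP-SGD: per-sample gradients are clipped to
# max_grad_norm, then Gaussian noise scaled by noise_multiplier is added
# before each optimizer step.
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # more noise -> stronger privacy, worse accuracy
    max_grad_norm=1.0,     # per-sample gradient clipping bound
)

criterion = nn.CrossEntropyLoss()
for epoch in range(5):
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = criterion(model(xb), yb)
        loss.backward()
        optimizer.step()

# Epsilon is the privacy budget actually spent during training.
print("epsilon spent:", privacy_engine.get_epsilon(delta=1e-5))
```

Deleting the `make_private` call gives the non-private baseline; the gap in held-out accuracy between the two runs is the price of the guarantee at that ε.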