We didn’t investigate the specific question of whether it’s raw diversity or specific features. The Grosse et al. paper on influence functions finds that “high influence scores are relatively rare and they cover a large portion of the total influence”. This (vaguely) suggests that the top-k paraphrases would do most of the work, which is what I would guess. That said, this really should be investigated with more experiments.