you obviously can narrow that size down to groups of n = 1
I’m looking at what LLMs can infer about the current user (& how that’s represented in the model) as part of my current research, and I think this is a very useful framing: given a universe of n possible users, how much information does the LLM need, on average, to narrow that universe down to 1 with high confidence? The theoretical minimum is log2(n) bits.
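A minimal sketch of that lower bound, assuming all n candidate users are equally likely a priori (the numbers here are just illustrative):

```python
import math

# With n equally likely candidate users, singling out one of them
# requires at least log2(n) bits of information about the user.
def min_bits_to_identify(n: float) -> float:
    return math.log2(n)

print(min_bits_to_identify(1024))           # 10.0
print(round(min_bits_to_identify(8e9), 1))  # ~33 bits for everyone on Earth
```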
I do think there’s an interesting distinction here between authors who have many texts in the training data and can be fully identified, and users (or authors) who don’t; in the latter case it’s typically impossible (without access to external resources) to, e.g., determine the user’s name, but as the “Beyond Memorization” paper shows (thanks for linking it), models can still deduce quite a lot.
It also seems worth understanding how the model represents info about the user, and that’s a key thing I’d like to investigate.
This isn’t closely related to what you’re talking about, but it is related and also by gwern, so: have you read Death Note: L, Anonymity & Eluding Entropy? People leak bits all the time.
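The core idea from that essay can be sketched like this: each observed attribute shared by only a fraction p of the remaining candidate population contributes log2(1/p) bits toward identification (the fractions below are made-up, and the sum assumes the attributes are independent):

```python
import math

# Hypothetical attribute -> fraction of the population sharing it.
observations = {
    "lives in a particular country": 0.04,
    "works in software": 0.01,
    "active on a niche forum": 0.001,
}

# Each attribute leaks log2(1/p) bits; independence is assumed,
# so the contributions simply add up.
total_bits = sum(-math.log2(p) for p in observations.values())
print(f"{total_bits:.1f} bits leaked")  # 21.3 bits leaked
```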
I have, but it was years ago; seems worth looking back at. Thanks!