I understand that you’re not planning to publicly deanonymize the accounts, and I assume you don’t plan to do so privately either.
That’s right: I stopped once I was reasonably sure this approach worked. I’m not planning to do more with this. I made my scraping code open source, since scraping is something LW and the EA Forum seem fine with, but my stylometry code is not even pushed to a private repo.
I can imagine that adding more barriers for people to post things “anonymously” (or making them feel less safe when trying to do so) could heavily discourage some of the potentially most useful uses of anonymous accounts. … some people seem to be really worried about their reputation / potential repercussions for what they post
I think maybe we’re thinking about the risks differently? I think it’s common that people who post under an alt think that they’re keeping these identities pretty separate, and do not think someone could connect them with a few hours of playing with open source tools. And so it’s important to make it public knowledge that this approach is not very private, so people can take more thorough steps if better privacy is something they want. Keep in mind that this will only get easier: we’re not far from someone non-technical being able to ask GPT-4 to write the code for this. (Possibly that would already work?)
Specifically on the “feel less safe”, if people feel safer than they actually are then they’re not in a position to make well-considered decisions around their safety.
I don’t see why people wouldn’t expect this to work on LW/EAF, given that it worked on HN?
I could have posted “here’s a thing I saw in this other community”, but my guess is people would take it less seriously, partly because they think it’s harder than it actually is. And so “here’s a thing, which didn’t end up being very hard” is much more informative.
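To give a sense of how little is involved, here’s a rough sketch of the general shape of the approach (this is not my actual code, the “corpus” below is a placeholder, and a serious attempt would want far more text per author plus some validation):

```python
# Hypothetical sketch: character n-gram TF-IDF plus cosine similarity.
# Not the code I actually ran; the data below is a stand-in for scraped comments.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# {author: concatenated text of their public comments} -- placeholder data
known_authors = {
    "alice": "text of alice's public comments ...",
    "bob": "text of bob's public comments ...",
}
alt_text = "concatenated text of the anonymous account's comments ..."

# Character 3-5 grams capture punctuation and wording habits that are hard
# to consciously suppress.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5))
matrix = vectorizer.fit_transform(list(known_authors.values()) + [alt_text])

# Compare the alt account (last row) against every known author and rank.
scores = cosine_similarity(matrix[-1], matrix[:-1])[0]
for author, score in sorted(zip(known_authors, scores), key=lambda x: -x[1]):
    print(f"{author}: {score:.3f}")
```

A real attempt would also want to hold out some known-author text to check that the similarity scores actually separate same-author from different-author pairs.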
I think it’s reasonably common that people who post under an alt think that they’re keeping these identities pretty separate, and do not think someone could connect them with a few hours of playing with open source tools. And so it’s important to make it public knowledge that this approach is not very private, so people can take more thorough steps if better privacy is something they want.
I think that a PSA about accounts on LW/EAF/the internet often not being as anonymous as people think could be good, and should mention stylometry, internet archives, timezones, IP addresses, user agents, and browser storage; and suggest using Tor, making a new account for every post/comment, scheduling messages at random times, running comments through LLMs, not using your name or leaking information in other ways, and considering using deliberate disinformation (e.g. pretending to be of the opposite gender, scheduling messages to appear to be in a different timezone, …).
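(To illustrate the timezone point: post timestamps alone already narrow things down, because people rarely post while asleep. A toy sketch with made-up data:)

```python
# Toy illustration of the timezone point: bucket public post times by UTC hour.
# The long quiet stretch is probably sleep, which roughly pins down the
# poster's local night and hence their UTC offset.  Timestamps are made up.
from collections import Counter
from datetime import datetime, timezone

post_times_utc = [
    datetime(2023, 5, 1, 14, 3, tzinfo=timezone.utc),
    datetime(2023, 5, 2, 15, 47, tzinfo=timezone.utc),
    datetime(2023, 5, 3, 22, 10, tzinfo=timezone.utc),
    # ... many more in a real analysis
]

by_hour = Counter(t.hour for t in post_times_utc)
for hour in range(24):
    print(f"{hour:02d}:00 UTC  {'#' * by_hour.get(hour, 0)}")
```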
Specifically on the “feel less safe”, if people feel safer than they actually are then they’re not in a position to make well-considered decisions around their safety.
I think this is a very good point.
I could have posted “here’s a thing I saw in this other community”, but my guess is people would take it less seriously, partly because they think it’s harder than it actually is.
I’m not sure about this. I think you could have written that there are very easy ways to deanonymize users, so people who really care about their anonymity should do the things I mentioned above?
I think maybe we’re thinking about the risks differently?
Possibly, I think I might be less optimistic that people can/will, in practice, start changing their posting habits. And I think it’s probably more likely that this post lowers the barrier for an adversarial actor to actively deanonymize people. It reminds me a bit of the tradeoffs you mentioned in your previous post on security culture.
I think it was a good call not to post reproducible code for this, for example, although it might have made it clearer how easy it is and strengthened the value of the demonstration.
I’m not planning to do more with this. I made my scraping code open source, since scraping is something LW and the EA Forum seem fine with, but my stylometry code is not even pushed to a private repo.
Thank you for this, and I do trust you. On some level, anonymous users already had to trust you before this project, since it’s clearly something anyone with some basic coding experience would be able to do if they wanted, but I think now they need to trust you a tiny bit more, since you now just need to press a button instead of spending a few minutes/hours actively working on it.
In any case, I don’t feel strongly about this, and I don’t think it’s important, but I still think that, compared to an informative post without a demonstration, this post increases the probability that an adversarial actor deanonymizes people slightly more than the probability that anonymous users are protected from similar attacks (which are often even less sophisticated).
a PSA about accounts on LW/EAF/the internet often not being as anonymous as people think could be good, and mention stylometry, internet archives, timezones, IP addresses, user agents, and browser storage; and suggest using Tor, making a new account for every post/comment, scheduling messages at random times, running comments through LLMs, not using your name or leaking information in other ways, and considering using deliberate disinformation (e.g. pretending to be of the opposite gender, scheduling messages to appear to be in a different timezone, …)
A lot of this depends on your threat model. For example, IP addresses, user agents, and browser storage (and so the suggestion to use Tor) aren’t much of a concern if you’re mostly just trying to keep people who read your comments from identifying you, since ordinary readers never see those. But there is subtlety here, including that while you might trust the people running a forum, you might not trust everyone who could legally compel them to disclose this information.
Possibly, I think I might be less optimistic that people can/will, in practice, start changing their posting habits.
I also do expect my post here to change people’s posting habits, though I suspect an embarrassing public linking of an alt to a main (where someone was doing something like criticizing a rival) would do even more.
It reminds me a bit of the tradeoffs you mentioned in your previous post on security culture.
Definitely! I think this one falls pretty well within the territory where computer security culture’s norms are the appropriate ones, though.
Hmm. I wonder if having an LLM rephrase comments using the same prompt would stymie stylometric analysis.
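Concretely, I’m imagining something like this, run over the comment before it’s posted (the prompt, model name, and openai client usage here are just illustrative placeholders):

```python
# Sketch of the idea: pass every comment through an LLM with one fixed prompt
# so the published text shares the model's style rather than the author's.
# Assumes the openai Python client (>=1.0) with an API key in the environment;
# the model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()

REWRITE_PROMPT = (
    "Rewrite the following forum comment so it keeps the same meaning and "
    "roughly the same length, but uses plain, neutral wording, punctuation, "
    "and sentence structure."
)

def strip_style(comment: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": REWRITE_PROMPT},
            {"role": "user", "content": comment},
        ],
    )
    return response.choices[0].message.content
```

How well this actually defeats stylometry is an empirical question; the same tools that link alts could be used to test it.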
You could have an easy checkbox “rewrite comments to prevent stylometric analysis” as a setting for alt accounts.
I agree with this. I think sometimes people are pretty clueless. E.g. people post under their first name and use the same IP. (There is at least one very similar recent example, but I can’t link to it.)
I still think that, compared to an informative post without a demonstration, this post increases the probability that an adversarial actor deanonymizes people slightly more than the probability that anonymous users are protected from similar attacks (which are often even less sophisticated).
Maybe, though note that getting it out in the open lets us talk about ways to fix it, including things like convenient stylometry-thwarting tooling.