For example, we were positively surprised by the reception to… his March piece in TIME magazine (which was TIME’s highest-traffic page for a week).
THIS is what real social media/public opinion research looks like.
Not counting likes. Not scrolling down your social media news feed and recording the results in a spreadsheet. Not approaching it from a fancy angle like searching “effective altruism” on Twitter and assuming you're capable of eyeballing bot accounts. Certainly not building your own crawler, which requires you to assume that the platform’s security teams can’t see it from a mile away and haven't already set things up to automatically serve false data to every account that doesn’t perfectly mimic human scrolling.
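To illustrate how visible a naive crawler can be, here's a minimal sketch of the kind of timing heuristic a security team could apply. This is purely illustrative; the function name, thresholds, and the specific signal are my own assumptions, not any platform's actual detection logic. The idea is just that human scrolling produces irregular, bursty request intervals, while a simple crawler polling on a timer produces suspiciously uniform ones.

```python
import statistics

def looks_like_a_naive_crawler(request_timestamps, min_requests=20, cv_threshold=0.3):
    """Illustrative heuristic (assumed, not any platform's real logic).

    Computes the coefficient of variation of inter-request intervals.
    Human scrolling tends to be bursty (high variation); a simple crawler
    firing on a timer produces near-constant intervals (low variation).
    """
    if len(request_timestamps) < min_requests:
        return False  # not enough data to judge
    intervals = [b - a for a, b in zip(request_timestamps, request_timestamps[1:])]
    mean = statistics.mean(intervals)
    if mean == 0:
        return True  # requests arriving with no gap at all
    cv = statistics.stdev(intervals) / mean
    return cv < cv_threshold


# Example: a crawler requesting every 2.0 seconds is trivially distinguishable
# from a human whose gaps range from half a second to half a minute.
crawler_times = [2.0 * i for i in range(30)]
print(looks_like_a_naive_crawler(crawler_times))  # True
```

Real detection stacks presumably combine many signals (headers, mouse/scroll telemetry, account history), but even this toy version shows why "just write a crawler" is not a stealthy move.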
In the 2020s, real public opinion research requires recognizing that if you want to understand how Americans think, you are competing against domestic and foreign intelligence agencies, which run their own botnets to evade detection, harvest that same data, and detect rival botnets. This isn’t an issue at all if you’re doing paper surveys, and it has historically been the dominant factor when Facebook-sized orgs are involved.
To the people who downvoted with no explanation: why?
Issues related to measuring public opinion have turned out to be extremely worth thinking about, especially for data science in the 21st century, and I’ve been researching this matter professionally for years.
I think it’s worth pointing out that MIRI has succeeded at getting relatively strong internet-related impact data while many otherwise-competent orgs and people have failed, and worth pointing out why that is.
I didn’t downvote (I’m just now seeing this for the first time), but the above comment left me confused about why you believe a number of things:
What methodology do you think MIRI used to ascertain that the Time piece was impactful, and why do you think that methodology isn’t vulnerable to bots or other kinds of attacks?
Why would social media platforms go to the trouble of feeding fake data to bots instead of just blocking them? What would they hope to gain thereby?
What does any of this have to do with the Social Science One incident?
In general, what’s your threat model? How are the intelligence agencies involved? What are they trying to do?
Who are you even arguing with? Is there a particular group of EAsphere people who you think are doing public opinion research in a way that doesn’t make sense?
Also, I think a lot of us don’t take claims like “I’ve been researching this matter professionally for years” seriously because they’re too vaguely worded; you might want to be a bit more specific about what kind of work you’ve done.
why do you think that methodology isn’t vulnerable to bots or other kinds of attacks?
Ah, yes, I thought that methodology wasn’t vulnerable to bots or other kinds of attacks because I was wrong. Oops. Glad I asked.
For the other stuff, I’ve explained it pretty well in the past, but you’re right that I did an inadequate job covering it here. Blocking bots basically gives constructive feedback to the people running the botnet (since they can tell when their bots are blocked) on how to run botnets without detection. It’s therefore critical to conceal every instance of a detected bot for as long as possible, which is why measures like shadowbanning and vote fuzzing are critical to the security of modern social media platforms. This might also explain why amateurish bots are so prevalent: state-level attackers can easily run competent and incompetent botnets simultaneously and learn valuable lessons about the platform’s security system from both (though there are other good explanations). Not sure how I missed such a large hole in my explanation in the original comment, but still, glad I asked.
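To make the vote-fuzzing idea concrete, here's a minimal sketch of the general technique. This is my own illustration, not any real platform's implementation; the function name, noise range, and data shapes are all assumptions. The point is that votes from flagged accounts can be silently discarded while every viewer sees a slightly noised score, so a botnet operator comparing displayed numbers can't tell whether a given account's votes were actually counted.

```python
import hashlib
import random

def displayed_score(votes, flagged_bots, viewer_id, post_id, fuzz_range=3):
    """Illustrative vote-fuzzing sketch (assumed, not any platform's real code).

    `votes` maps account id -> vote value; `flagged_bots` is the set of
    accounts the platform has quietly flagged. Flagged votes are ignored,
    and the score shown to each viewer is perturbed by deterministic
    per-(viewer, post) noise so the fuzz is stable but uninformative.
    """
    # Count only votes from accounts that are not flagged as bots.
    base = sum(v for account, v in votes.items() if account not in flagged_bots)

    # Derive deterministic noise from the viewer/post pair, so the same viewer
    # always sees the same fuzzed number for the same post.
    seed = int(hashlib.sha256(f"{viewer_id}:{post_id}".encode()).hexdigest(), 16)
    noise = random.Random(seed).randint(-fuzz_range, fuzz_range)

    return base + noise


# Example: two flagged bot accounts upvote, but the score they see barely
# moves, and the shift is indistinguishable from ordinary fuzz.
votes = {"alice": 1, "bob": 1, "bot_1": 1, "bot_2": 1}
flagged = {"bot_1", "bot_2"}
print(displayed_score(votes, flagged, viewer_id="bot_1", post_id="p42"))
```

The design point is that the bot operator's only feedback channel (the displayed score) carries no clean signal about detection, so they can't iterate toward undetectable bots the way an outright block would let them.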
Does that logic apply to crawlers that don’t try to post or vote, as in the public-opinion-research use case? The reason to block those is just that they drain your resources, so sophisticated measures to feed them fake data would be counterproductive.