Machine learning methods can often have good accuracy at the population level yet fail spectacularly on specific instances, and if the instance-level outputs are visible to the public, these failures may be quite embarrassing: imagine if somebody posted a quote by Churchill or a passage from the Qur’an and it was mistakenly tagged as trolling.
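To put rough numbers on the population-vs-instance point, here is a back-of-the-envelope sketch; the accuracy and volume figures are made-up assumptions, not measurements of any real system.

```python
# Even a classifier with high aggregate accuracy produces a large absolute
# number of visible mistakes when applied to millions of posts.
accuracy = 0.99             # assumed population-level accuracy (hypothetical)
posts_per_day = 10_000_000  # assumed daily volume of public posts (hypothetical)

expected_mistakes = (1 - accuracy) * posts_per_day
print(f"Expected mislabelled posts per day: {expected_mistakes:,.0f}")
# -> Expected mislabelled posts per day: 100,000
# Any one of these could be a Churchill quote or a Qur'an passage.
```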
Even when the machine learning system is not misclassifying, it can never be better than the data it is trained on. If you train it on user ratings, it will turn into a popularity contest, with unpopular opinions being tagged as trolling, especially once politically-motivated users figure out how to manipulate the system by voting strategically. If it is based on a dataset curated by the company, it will be perceived as the company exerting ideological and political bias in a subtle, opaque manner.
In general it would be seen as a powerful institution putting an official seal of approval/mark of shame on speech and people. It reeks of totalitarianism.
Imagine if somebody posted a quote by Churchill or a passage from the Qur’an and it was mistakenly tagged as trolling.
If the trolling criteria include “racism” or “threats of violence” (as in the OP), I think both of these sources would be correctly matched by the software. (Which is not to say I think we should censor them.)
If the criteria include the generous “language [which] is offensive” (also in the OP), I think most language ever written would turn out to be offensive to someone.
If the trolling criteria include “racism” or “threats of violence” (as in the OP), I think both of these sources would be correctly matched by the software. (Which is not to say I think we should censor them.)

Indeed.
My suggestion was not to train the system on user ratings:
The first is to let a number of sensible people give their troll scores of different Facebook posts and tweets (using the general and vague definition of what is to count as trolling). You would feed this into your algorithms, which would learn which combinations of words are characteristic of trolls (as judged by these people), and which aren’t. The second is to simply list a number of words or phrases which would count as characteristic of trolls, in the sense of the general and vague definition.
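For concreteness, here is a minimal sketch of what that first option might look like, assuming a scikit-learn-style bag-of-words classifier; the example posts and labels are invented placeholders, not real data or anyone's actual system.

```python
# Sketch: have a panel of "sensible people" score example posts, then learn
# which word combinations predict high troll scores.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: posts with human-assigned troll labels (1 = troll).
posts = [
    "You are an idiot and everyone like you should shut up",
    "I disagree; here is a source supporting my view",
    "Nobody cares about your worthless opinion",
    "Interesting point, though I read the statistics differently",
]
labels = [1, 0, 1, 0]

# Bag-of-words features -> logistic regression, i.e. "which combinations of
# words are characteristic of trolls, as judged by these people".
model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(posts, labels)

# Estimated troll probability for a new post.
print(model.predict_proba(["shut up, nobody cares"])[:, 1])
```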
So, essentially, it would depend on the company’s opinion.
Anyway, lists of words or short phrases won’t work. Keep in mind that trolls are human intelligences; no AI short of Turing-test level will beat human intelligences at their own game.
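A toy illustration of why a fixed word list is easy to beat (the blocklist and example posts below are invented, not anyone's actual filter): trivial misspellings slip past it, and hostility that uses no listed word is never caught at all.

```python
# Naive word-list filter: flag a post if any blocklisted word appears as a token.
BLOCKLIST = {"idiot", "moron"}

def flagged(post: str) -> bool:
    return any(word in post.lower().split() for word in BLOCKLIST)

print(flagged("you are an idiot"))   # True  - caught by the list
print(flagged("you are an 1diot"))   # False - a trivial misspelling slips through
print(flagged("people like you never have anything worthwhile to say"))
# False - no listed word at all, yet still plainly hostile
```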