I thought a bit about it, but I think Tay is basically a software version of a parrot that repeats back what it hears. I don’t think it has any commonsense knowledge or makes any serious attempt to understand that tweets are about a world that exists outside of Twitter. I.e. it has no semantics; it’s just a syntax manipulator that uses some kind of probabilistic language model to generate grammatically correct sentences, plus a machine learning model that tries to learn which kinds of sentences will get the most retweets or will most closely resemble other things people are tweeting about. Tay doesn’t know what a “Nazi” actually is. I haven’t looked into it in any detail, but I know enough to guess that that’s how it works.
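To make that guess concrete, here is a purely hypothetical sketch in Python of the kind of parrot-like mechanism I have in mind: a word-level Markov chain over tweets it has heard, with replies ranked by overlap with whatever people are currently tweeting. The ParrotBot class and its scoring rule are my own invention for illustration; I’m not claiming this is how Tay is actually built.

    # Purely illustrative sketch (not Microsoft's actual design): a "parrot" that
    # builds a word-level Markov chain from tweets it has seen and replies with
    # whichever candidate most resembles recent conversation. Nothing in it
    # models the world the tweets are about.
    import random
    from collections import defaultdict, Counter

    class ParrotBot:
        def __init__(self):
            self.chain = defaultdict(list)   # word -> words seen to follow it
            self.recent = Counter()          # word frequencies in recent tweets

        def hear(self, tweet):
            words = tweet.lower().split()
            for a, b in zip(words, words[1:]):
                self.chain[a].append(b)
            self.recent.update(words)

        def _babble(self, length=8):
            if not self.chain:
                return ""
            word = random.choice(list(self.chain))
            out = [word]
            for _ in range(length - 1):
                followers = self.chain.get(word)
                if not followers:
                    break
                word = random.choice(followers)
                out.append(word)
            return " ".join(out)

        def reply(self, candidates=20):
            # "Learning" here is just picking the babble that overlaps most with
            # what people are currently tweeting, a stand-in for optimizing
            # retweets/engagement. No concept is attached to any word.
            scored = [(sum(self.recent[w] for w in c.split()), c)
                      for c in (self._babble() for _ in range(candidates))]
            return max(scored)[1]

    bot = ParrotBot()
    bot.hear("trees have roots underground")
    bot.hear("most human beings sleep at night")
    print(bot.reply())   # grammar-ish word salad; no idea what a tree or night is

Something like this can produce fluent-looking output and chase engagement without ever representing what any of the words refer to.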
As such, the failure of Tay doesn’t particularly tell us much about Friendliness, because friendliness research pertains to superintelligent AIs, which would definitely have a correct ontology/semantics and understand the world.
However, it does tell us that a sufficiently stupid, amateurish attempt to harvest human values using an infrahuman intelligence wouldn’t reliably work. This is obvious to anyone who has been “in the trade” for a while; however, it does seem to surprise the mainstream media.
It’s probably useful as a rude slap-in-the-face to people who are so ignorant of how software and machine learning work that they think friendliness is a non-issue.
Tay doesn’t tell us much about deliberate Un-Friendliness. But Tay does tell us that a well-intentioned effort to make an innocent, harmless AI can go wrong for unexpected reasons. Even for reasons that, in hindsight, are obvious.
Are you sure that superintelligent AIs would have a “correct ontology/semantics”? They would have to have a useful one, in order to achieve their goals, but both philosophers and scientists have had incorrect conceptualizations that nevertheless matched the real world closely enough to be productive. And for an un-Friendly AI, “productive” translates to “using your atoms for its own purposes.”
Are you sure that superintelligent AIs would have a “correct ontology/semantics”?
It’s hard to imagine a superintelligent AGI that didn’t know basic facts about the world, like “trees have roots underground” or “most human beings sleep at night”.
They would have to have a useful one, in order to achieve their goals
Useful models of reality (useful in the sense of achieving goals) tend to be ones that are accurate. This is especially true of a single agent that isn’t subject to the weird foibles of human psychology and isn’t mainly achieving things via signalling like many humans do.
The reason I made the point about having a correct understanding of the world, for example knowing what the term “Nazi” actually means, is that Tay has not achieved the status of being “unfriendly”, because it doesn’t actually have anything that could reasonably be called goals pertaining to the world. Tay is not even an unfriendly infra-intelligence. Though I’d be very interested if someone managed to make one.
I thought a bit about it, but I think Tay is basically a software version of a parrot that repeats back what it hears. I don’t think it has any commonsense knowledge or makes any serious attempt to understand that tweets are about a world that exists outside of Twitter. I.e. it has no semantics
Well, neither does image recognition software. Neither does Google’s search algorithm.
it does tell us that a sufficiently stupid, amateurish attempt to harvest human values using an infrahuman intelligence wouldn’t reliably work.
You probably mean “reliably wouldn’t work” :-)
However, I have to question whether the Tay project was an attempt to harvest human values. As you mentioned, Tay lacks understanding of what it hears or says, and so whatever it “learned” about humanity by listening to Twitter could have been learned by straightforward statistical analysis of the corpus of text from Twitter.
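To illustrate what I mean by “straightforward statistical analysis”, here is a toy Python sketch, using made-up tweets rather than anything Tay actually saw: the word and bigram statistics are read directly off the corpus, with no model of the world behind the words.

    # Toy sketch of "straightforward statistical analysis": word and bigram
    # frequencies fall straight out of the corpus, with no reference to the
    # world the words describe. The tweets below are made up for illustration.
    from collections import Counter
    from itertools import pairwise  # Python 3.10+

    corpus = [
        "tell me about history",
        "history is trending again",
        "i love my cat",
    ]

    word_counts = Counter(w for tweet in corpus for w in tweet.split())
    bigram_counts = Counter(pair for tweet in corpus
                            for pair in pairwise(tweet.split()))

    print(word_counts.most_common(3))    # e.g. [('history', 2), ('tell', 1), ...]
    print(bigram_counts.most_common(2))  # which words tend to follow which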