For anyone not clicking to read the article: its author “was technical leader for Google’s social efforts (including photos)” at the time, and doesn’t cite any public sources for the information. So we should at least consider how that’s going to colour their interpretation/representation of the information.
They don’t mention how often black people were classified as gorillas, and how that compared to white people being classified as dogs or seals. It could be that for every thousand cases of the former, there was one of the latter, or it may be one for one. My sibling comment says “The dataset had a good mix of races in it” (which I take to mean there was some reasonable proportional representation of races) - the article doesn’t claim that. It says “the training data included a wide range of people of all races and colors”, contrasting that with HP webcams where “the training data for “faces” had been composed exclusively of white people”—so it clears the bar of not being exclusively white, but we don’t know by how much. In fact, the article goes on to say (due to photography practices) “our standards for what constitute “good images” still overwhelmingly favor white faces rather than black ones.”
I’m writing this rather nitpicky comment because this is the top comment with rather strong wording (“no journalists bothered reporting this, but that system classified white people as ‘dogs’ and ‘seals’”), that on another day I might have just taken on faith (especially if I’d seen it was from gwern, which I didn’t at first) - I would have assumed the link contained a study, or at least images of results pages, and contained solid additional information about these results from a third party.
So, how many third parties reported about the classification and how trustworthy were they? How many studies were conducted on the classification of black people as gorillas? What should we make of an ecosystem which tells us on a literally daily to weekly basis (google the term) about the gorillas, but never, ever tells you about the seals (I only learned about that one because I was reading the Google expert’s post for other reasons)? What should we infer about the epistemics and justifications of the various experts and reporting here?
I’m writing this rather nitpicky comment because this is the top comment replying with rather strong wording about sourcing and studies and double standards for reporting...
This is an unnecessarily snarky addition to the comment that’s disappointing to see (and doesn’t even make sense, since mine is neither a top comment, nor does it mention studies). In case you interpreted the “especially if I’d seen it was from gwern” in a negative way, I meant it as a factual statement that I saw you as a person with high respect and trust and hence assigned high prior confidence to things from you.
So, how many third parties reported about the classification and how trustworthy were they?
The original post was by a disinterested third party sharing a screenshot. However small the level of evidence provided by that is, an offhand statement by someone literally involved in the project and with no attached evidence at all is obviously much weaker.
What should we make of an ecosystem which tells us on a literally daily to weekly basis (google the term) about the gorillas, but never, ever tells you about the seals (I only learned about that one because I was reading the Google expert’s post for other reasons)? What should we infer about the epistemics and justifications of the various experts and reporting here?
Let’s simulate two worlds:
In world (A), Google’s PR team were so incompetent that they did not mention this “white people misidentified as seals” at the time, even to say “our system sometimes misidentifies people as animals, e.g. <photo of white people with seal tag>, and we are improving the system”, which would have softened the PR blow significantly. Users see white people tagged as animals, but they never ever share it; or they do, but no one bothers to report it, not even a tabloid with space to fill and low standards, not even contrarian media that carries “All Lives Matter” articles and would love to use any “attack on whiteness”; and the screenshot doesn’t go viral either (despite “bots are funnily dumb” being a favourite meme category).
In world (B), the “white people tagged as seals” happens either only in obviously-distorted or blurry photos, or only in an internal test system that never even got out of Google, or in some other not-usable-for-PR way. Journalists do not report on it because they don’t see it. A Googler writes a non-official Medium article that’s not focused on this, two years after the fact, and mentions it in a couple of sentences offhandedly. Perhaps one or two journalists happen to read it for other reasons, just like you, but it’s mentioned as a past bug that’s likely fixed, and there’s no supporting evidence, nothing to show their editor as a potential article, so they move on.
With the evidence available to us, something similar to world (B) seems much more likely than world (A).
This is an unnecessarily snarky addition to the comment that’s disappointing to see (and doesn’t even make sense, since mine is neither a top comment, nor does it mention studies).
How should I reply to such a flagrant double standard, where a Twitter screenshot calling out Google is incontrovertible ‘disinterested’ evidence never to be questioned, and any objection is instead required to be multiple independent third-party studies?
The original post was by a disinterested third party sharing a screenshot.
They were not disinterested in the least! They were specifically ‘calling out’ and shaming Google for it, and it worked brilliantly in earning them huge fake internet points. (Someone who left Google and is mentioning it in an aside years later which no one noticed, that’s much closer to disinterested.)
With the evidence available to us, something similar to world (B) seems much more likely than world (A).
No, it doesn’t. A is vastly more plausible. Happens every time. You don’t believe the seal thing? Fine, look at the recent Twitter cropping thing! You see anyone pointing out that the social media campaigns about the cropping algorithm’s bias were wildly incorrect and exaggerated in every way, and missed the actual pro-woman biases that Twitter’s followup study showed? (You wanted a study...) Or Tay! AFAICT, Tay could not learn, and so the media narrative is impossible. Did you hear that from anyone yet? Or how about the ‘arrested because facial recognition software accused them of shoplifting’? It’s literally in the original media article that they were arrested because a human told the cops to do so; did you hear that from anyone yet? World A is exactly what happens frequently. Did you not pay attention to how things like Timnit Gebru’s ultimatum were completely laundered out of media accounts? Or how about Mitchell, where media outlets quoting the Google statement edited out the part of the statement mentioning, y’know, what she did to get fired (dumping docs to leak to friendly reporters)? You’ve seen the factoids about how much CO2 training a DL model costs; did you see any of the followups like “oops we overestimated the cost by 10,000%” or “actually the cost is 0 because the datacenters use renewable energy”? How about that time that a prominent activist and Nvidia VP shared a literal enemies list for her followers to coordinate attacks on, which you could earn membership on for liking the wrong tweet? Do you not pay any attention to how activists and the media work? Have you not noticed the techlash or the ideological affiliations of reporters? We live in world A, not world B.
I believe there’s a conflict-vs-mistake thing going on here. I saw this as a specific case where the probabilities don’t line up the way your comment assumes they do. You seem to be from the beginning assuming I’m opposing an entire worldview and seeing it as some attack.
where a Twitter screenshot calling out Google is incontrovertible ‘disinterested’ evidence never to be questioned
Behold, the Straw Man! Today for his trick, he turns “However small X’s level of evidence is, Y has even weaker evidence” into “X is incontrovertible evidence never to be questioned”.
I considered the original tweet, the fact that Google did not refute it, and the claim that Google blocked ape-related tags from Photos for years afterwards (with its own attached probability), and updated my inner measure of probability of this being true. And I find that the statement from the ex-Googler provides even weaker evidence to update based on. Nothing is “never to be questioned” here.
any objection is instead required to be multiple independent third-party studies?
Again a thing I didn’t say, and you keep repeating.
They were not disinterested in the least! They were specifically ‘calling out’ and shaming Google for it
A “claim from a disinterested party” means a claim from someone who was disinterested beforehand, someone our priors can reasonably treat as having no stake. A police officer is considered disinterested in a case if their family and friends are not involved in it, and so can be assigned to it. It makes no sense to say “they’ve been assigned to the case now, so they’re not a disinterested party”!
They were specifically ‘calling out’ and shaming Google for it and it worked brilliantly in earning them huge fake internet points.
This is the original tweet pointing out the issue. To me it just looks like a user casually pointing out a problem with a tool they use. Not everyone is obsessed all the time with culture wars and internet points.
(Someone who left Google and is mentioning it in an aside years later which no one noticed, that’s much closer to disinterested.)
Oh by the way, I found evidence that this author tweeted this seals claim back at the time as part of the Twitter thread about this. Still without any links or images, but that helped update my probabilities a little bit (as I had believed that part of the problem might be that it came two years later, as mentioned before). I wish that’s what this conversation had consisted of, actual evidence to try to arrive at the truth, instead of straw men and outright false claims.
The last paragraph has little to do with the claims here, unless you’re dumping an entire opposing worldview on me, and arguing against that imaginary person. For the record, I am much closer to your worldview regarding these issues and have noticed most of the things you mentioned. It’s just that in this instance even with that background there isn’t good enough evidence to believe the media suppressed some narrative.
My original comment has served its purpose to provide additional context for those who want it, and I don’t think further discussion with you here will be productive. Thanks for all the fish.