The Elon Musk one has realistic faces so I can’t share it; I have, however, confirmed that DALL-E does not speak ASL with “The ASL word for ‘thank you’”:
We’ve got some funky fingers here. Six fingers, a sort of double-tipped finger, and an extra joint on the index finger in pictures 1 and 4. Fascinating.
It seems to be mostly trying to go for the “I love you” sign, perhaps because that’s one of the most commonly represented ones.
I’m curious why this prompt resulted in overwhelmingly black-looking hands, especially considering that all the other prompts I see result in white subjects being represented. Any theories?
It’s unnatural, yes: ASL is predominantly white, and people involved in ASL are even more so (I went to NTID and the national convention, so can speak first-hand, but you can also check Google Images for that query and it’ll look like what you expect, which is amusing because ‘Deaf’ culture is so university & liberal-centric). So it’s not that ASL diagrams or photographs in the wild really do look like that—they don’t.
Overrepresentation of DEI material in the supersekrit licensed databases would be my guess. Stock photography sources are rapidly updated for fashions, particularly recent ones, and you can see this occasionally surfacing in weird queries. (An example going around Twitter which you can check for yourself: “happy white woman” in Google will turn up a lot of strange photos for what seems like a very easy straightforward query.) Which parts are causing it is a better question: I wouldn’t expect there to be much Deaf stock photo material which had been updated, or much ASL material at all, so maybe there’s bleedthrough from all of the hand-centric (eg ‘Black Power salute’, upraised Marxist fists, protests) iconography? There being so much of the latter and so little of the former that the latter becomes the default kind of hand imagery.
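That last mechanism is easy to sanity-check with a toy simulation (every count below is invented purely to illustrate the imbalance, not a real dataset statistic): if protest-fist imagery outnumbers genuine ASL material by orders of magnitude in the captioned data, anything that resolves to the generic ‘hands’ region gets dominated by the majority style.

```python
import random

# Toy illustration of caption bleedthrough (all counts invented):
# generic/protest hand imagery vastly outnumbers real ASL material,
# so a query that only resolves to "hands" returns the majority style.
random.seed(0)

hand_pool = (
    [("raised fist at a protest", "protest")] * 50_000
    + [("hand waving hello, stock photo", "generic")] * 20_000
    + [("ASL fingerspelling diagram", "asl")] * 50  # deaf material is rare
)

def sample_hand_images(n=6):
    """Stand-in for retrieval/generation: sample from the 'hands' pool."""
    return [random.choice(hand_pool)[1] for _ in range(n)]

print(sample_hand_images())
# -> almost always ['protest', 'generic', ...]; 'asl' is roughly 0.07%
```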
It must be something like that, but it still feels like there’s a hole there. The query is for “ASL”, not “Hands”, and these images don’t look like something from a protest. The top left might be vaguely similar to some kind of street gesture.
I’m curious what the role of the query writer is. Can you ask DALL-E for “this scene, but with black skin colour”? I got a sense that updating areas was possible but inconsistent. Could DALL-E learn to return more of X to a given person by receiving feedback? I really don’t know how complicated the process gets.
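For what it’s worth, the ‘updating areas’ capability asked about here is inpainting, and the API OpenAI later shipped exposes it directly: you pass the original image plus a mask whose transparent region gets redrawn to match a new prompt. A minimal sketch, assuming the current openai Python client; the file names and prompt are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Inpainting: only the transparent region of mask.png is regenerated,
# guided by the new prompt. "scene.png" and "mask.png" are placeholders.
result = client.images.edit(
    model="dall-e-2",
    image=open("scene.png", "rb"),
    mask=open("mask.png", "rb"),  # transparent pixels = area to redraw
    prompt="the same scene, but the hands have dark skin",
    n=1,
    size="1024x1024",
)
print(result.data[0].url)
```

Whether the model reliably honors an instruction like “this scene, but with black skin colour” is a separate question; as the comment above notes, region edits were possible but inconsistent.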
ASL will always be depicted by a model like DALL-E as hands; I am sure that there are non-pictorial ways to write down ASL but I can’t recall them, and I actually took ASL classes. So that query should always produce hands in it. Then because actual ASL diagrams will be rare and overwhelmed by leakage from more popular classes (keep in mind that deafness is well under 1% of the US population, even including people like me who are otherwise completely uninvolved and invisible, and basically any political fad whatsoever will rapidly produce vastly more material than even core deaf topics), and maybe some more unCLIP looseness...
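On the ‘unCLIP looseness’ point: in the DALL-E 2/unCLIP design (Ramesh et al 2022), the decoder never conditions on the prompt directly, only on a single CLIP image embedding produced by a prior, so fine-grained specifics like which handshape a sign uses can wash out. A schematic with every stage stubbed out; the names and shapes are illustrative dummies, not the real models:

```python
import hashlib

# Schematic of the unCLIP / DALL-E 2 information flow. Every stage is a
# dummy stub; the structural point is that the decoder sees one CLIP
# *image* embedding, not the prompt, so fine prompt details can be lost.

def clip_text_encoder(prompt: str) -> list[float]:
    """Dummy stand-in for CLIP's text encoder."""
    digest = hashlib.sha256(prompt.encode()).digest()
    return [b / 255 for b in digest[:8]]

def diffusion_prior(z_text: list[float]) -> list[float]:
    """Dummy stand-in: text embedding -> plausible CLIP image embedding."""
    return [min(1.0, z + 0.1) for z in z_text]

def diffusion_decoder(z_image: list[float]) -> str:
    """Dummy stand-in for the diffusion image decoder."""
    return f"<image conditioned on a {len(z_image)}-d embedding>"

def generate(prompt: str) -> str:
    z_text = clip_text_encoder(prompt)  # prompt -> text embedding
    z_image = diffusion_prior(z_text)   # text emb -> image embedding
    return diffusion_decoder(z_image)   # decoder never re-reads the prompt

print(generate("The ASL word for 'thank you'"))
```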
OA announced its new ‘reducing bias’ DALL-E 2 today. Interestingly, it appears to do so by secretly editing your prompt to inject words like ‘black’ or ‘female’.
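Mechanically, that kind of secret debiasing can be as crude as string surgery on the prompt before it reaches the model; users surfaced it with prompts like “a person holding a sign that says”, which made the injected word show up in the generated image. A toy reproduction, where the trigger words and qualifier list are guesses based on user reports, not OpenAI’s actual rules:

```python
import random

# Toy reproduction of the reported prompt rewriting: if a prompt seems
# to depict a person, sometimes append a demographic qualifier.
# PERSON_WORDS and QUALIFIERS are guesses, not OpenAI's real lists.
PERSON_WORDS = {"person", "man", "woman", "doctor", "nurse", "ceo"}
QUALIFIERS = ["black", "female", "asian", "hispanic"]

def debias_prompt(prompt: str, p: float = 0.5) -> str:
    tokens = {w.strip(".,") for w in prompt.lower().split()}
    if tokens & PERSON_WORDS and random.random() < p:
        return f"{prompt}, {random.choice(QUALIFIERS)}"
    return prompt

print(debias_prompt("a person holding a sign that says"))
# sometimes -> "a person holding a sign that says, female"
# ...at which point the sign in the output image spells out the edit
```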