Swimmer963 (Miranda Dixon-Luinenburg) comments on What DALL-E 2 can and cannot do

Swimmer963 (Miranda Dixon-Luinenburg) 6 May 2022 17:52 UTC
3 points
“A bronze statue of three wise monkeys.” Pretty solid!
“See no evil, hear no evil, speak no evil, statue of monkeys.”
- PoignardAzur 16 May 2022 9:24 UTC
  1 point
  Interesting. It seems to understand that the pattern should be “Three monkeys with hands on their heads somehow”, but it doesn’t seem to get that each monkey should have hands in a different position.
  I wonder if that means gwern is wrong when he says DALL-E 2's problem is that the text model compresses information, and that the underlying “representation” model instead genuinely struggles with composition and with constraints of the type “there must be three X with only a single Y among them”.
- gturk1 8 May 2022 2:06 UTC
  1 point
  Thank you so much for this! It did do quite well.
  I have been trying to think of another set of three items that are reliably found together, but this is all I could come up with. Pairs of items are much easier to come up with.
- TibuAI 6 May 2022 23:41 UTC
  1 point
  This is so good.