DALL-E is often sensitive to exact wording, and in particular it’s fascinating how “in the style of x” often gets very different results from “screenshot from an x movie”. I’m guessing that in the Pixar case, generic “Pixar style” might capture training data from Pixar shorts or illustrations that aren’t in their standard recognizable movie style.
I’ve seen this prompt programming bug noted on Twitter by DALL-E 2 users as well. With earlier models, there didn’t seem to be much difference between ‘by X’ and ‘in the style of X’, but with the new high-end models, perhaps there is now?
The speculation is that ‘in the style of X’ is generally inferior because you are now tapping into epigones, imitations, and loosely related images rather than the masters themselves. So it’s become a version of ‘trending on Artstation’: if you ask for X, you ask for the best; if you ask for ‘in the style of X’, you ask for broader (and regressed-to-the-mean?) things.