Optimization Process comments on Optimization Process’s Shortform

Optimization Process 13 Aug 2021 0:26 UTC
2 points
Consider AI-generated art (e.g. TWDNE, GPT-3 does Seinfeld, reverse captioning, Jukebox, AI Dungeon). Currently, it’s at the “heh, that’s kinda neat” stage; a median person might spend 5-30 minutes enjoying it before the novelty wears off.

(I’m about to speculate a lot, so I’ll tag it with my domain knowledge level: I’ve dabbled in ML, I can build toy models and follow papers pretty well, but I’ve never done anything serious.)

Now, suppose that, in some limited domain, AI art gets good enough that normal people will happily consume large amounts of its output. It seems like this might cause a phase change where human-labeled training data becomes cheap and plentiful (including human labels for the model’s output, a more valuable reward signal than e.g. a GAN’s discriminator); this makes better training feasible, which makes the output better, which makes more people consume and rate the output, in a virtuous cycle that probably ends with a significant chunk of that domain getting automated.

I expect that this, like all my most interesting ideas, is fundamentally flawed and will never work! I’d love to hear a Real ML Person’s take on why, if there’s an obvious reason.
- Optimization Process 13 Aug 2021 0:26 UTC
  3 points
  Trying to spin this into a plausible story: OpenAI trains Jukebox-2, and finds that, though it struggles with lyrics, it can produce instrumental pieces in certain genres that people enjoy about as much as human-produced music, for about $100 a track. Pandora notices that it would only need to play each track about 75k times ($100 / $0.00133 per play) to break even with the royalties it wouldn’t have to pay. Pandora leases the model from OpenAI, throws $100k at this experiment to produce 1k tracks in popular genres, plays each track 100k times, gets ~1M thumbs-[up/down]s (plus ~100M “no rating” reactions, for whatever those are worth), and fine-tunes the model using that reward signal to produce a new crop of tracks people will like slightly more.
  
  Hmm. I’m not sure if this would work: sure, from one point of view, Pandora gets ~1M data points for free (on net), but from another reasonable point of view, each data point (a track) costs $100 -- definitely not cheaper than getting 100 ratings off Mechanical Turk, which is probably about as good a signal. This cycle might only work for less-expensive-to-synthesize art forms.
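
To sanity-check the arithmetic in the Pandora story above, here's a minimal sketch in Python. Every figure is the made-up one from the story, not a real Pandora or OpenAI number; the 1% rating rate is my assumption, backed out of "~1M ratings from ~100M plays", and the function names are just for illustration.

```python
# Back-of-the-envelope check of the hypothetical Pandora/Jukebox-2 story.
# All inputs are the story's illustrative figures, not real data.

COST_PER_TRACK = 100.0       # dollars to synthesize one track
ROYALTY_PER_PLAY = 0.00133   # dollars of royalties avoided per play
NUM_TRACKS = 1_000
PLAYS_PER_TRACK = 100_000
RATING_RATE = 0.01           # assumption: ~1M ratings out of ~100M plays


def break_even_plays(cost_per_track, royalty_per_play):
    """Plays needed before avoided royalties cover the synthesis cost."""
    return cost_per_track / royalty_per_play


def experiment_summary(num_tracks, plays_per_track, cost_per_track,
                       royalty_per_play, rating_rate):
    """Totals for one round of generate -> play -> collect ratings."""
    total_cost = num_tracks * cost_per_track
    total_plays = num_tracks * plays_per_track
    ratings = total_plays * rating_rate
    royalties_saved = total_plays * royalty_per_play
    return {
        "total_cost": total_cost,
        "total_plays": total_plays,
        "ratings": ratings,
        "royalties_saved": royalties_saved,
        "net": royalties_saved - total_cost,
        # Synthesis cost only; ignores the royalties saved.
        "cost_per_rating": total_cost / ratings,
        "ratings_per_track": ratings / num_tracks,
    }


if __name__ == "__main__":
    print(f"break-even plays per track: "
          f"{break_even_plays(COST_PER_TRACK, ROYALTY_PER_PLAY):,.0f}")
    summary = experiment_summary(NUM_TRACKS, PLAYS_PER_TRACK, COST_PER_TRACK,
                                 ROYALTY_PER_PLAY, RATING_RATE)
    for name, value in summary.items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, 100k plays per track clears the ~75k break-even point, so the ratings really are free on net. Counting only the synthesis cost, though, it works out to roughly $0.10 per rating, or about 1,000 ratings spread over each $100 track, which is the tension the second paragraph is pointing at.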