I didn’t follow this. FB doesn’t need to run a model inference for each possible post that it considers showing (just like OpenAI doesn’t need to run a GPT-3 inference for each possible token that can come next).
(BTW, I think the phrase “context window” would correspond to the model’s input.)
FB’s revenue from advertising in 2019 was $69.7 billion, or about $191 million per day. So yeah, it seems possible that in 2019 they used a model with an inference cost similar to GPT-3’s, though not one that is 10x more expensive [EDIT: under this analysis’s assumptions]; so I was overconfident in my previous comment.
Yeah, maybe I was confused. FB does need to read all the posts it is considering, though, and if it has thousands of posts to choose from, that’s probably a lot more than can fit in GPT-3’s context window, so FB’s algorithm would need to be bigger than GPT-3… at least, that’s what I was thinking. But yeah, that’s not the right way of thinking about it. Better to just think about how much budget FB could possibly have for model inference, which as you say must be something like $100 million per day tops. That means it might be GPT-3-sized but can’t be much bigger, and IMO is probably smaller.
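A minimal back-of-envelope sketch of this budget argument (the revenue figure and the ~$100M/day ceiling are from this thread; the daily inference count is a made-up assumption purely for illustration):

```python
# Back-of-envelope bound on FB's per-inference compute budget.
# Revenue figure is from this thread; the inference volume below is
# a hypothetical assumption for illustration, not a known number.

annual_ad_revenue = 69.7e9                      # FB ad revenue, 2019 (USD)
revenue_per_day = annual_ad_revenue / 365       # ~$191M/day

# Inference spend has to come in well under revenue; the thread's
# rough ceiling is ~$100M/day.
inference_budget_per_day = 100e6

# Hypothetical: at 10 billion ranking inferences per day, that ceiling
# works out to a per-inference budget of about one cent.
inferences_per_day = 10e9                       # assumption, for illustration
budget_per_inference = inference_budget_per_day / inferences_per_day

print(f"revenue/day: ${revenue_per_day / 1e6:.0f}M")       # ~$191M
print(f"budget/inference: ${budget_per_inference:.2f}")    # $0.01
```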
(They may spend more on inference compute if doing so would sufficiently increase their revenue. They might also train a more expensive model just to try it out for a short while, to see whether they’d be better off using it.)
Good points, especially the second one.