Yeah, maybe I was confused. FB does need to read all the posts it is considering, though, and if it has thousands of posts to choose from, that’s probably a lot more than can fit in GPT-3’s context window, so FB’s algorithm needs to be bigger than GPT-3… at least, that’s what I was thinking. But yeah, that’s not the right way of thinking about it. Better to just think about how much budget FB can possibly have for model inference, which as you say must be something like $100 million per day tops. That means that maybe it’s GPT-3 sized but can’t be much bigger, and IMO is probably smaller.
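To make the budget framing concrete, here is a rough back-of-envelope sketch. It assumes a dense transformer, where a forward pass costs roughly 2 FLOPs per parameter per token. The function name and both the cost-per-FLOP and tokens-per-day inputs are hypothetical placeholders I made up for illustration, not real FB numbers; only the $100 million per day ceiling comes from the discussion above.

```python
def max_affordable_params(budget_usd_per_day, usd_per_flop, tokens_per_day):
    """Largest dense-model parameter count whose daily inference fits the budget.

    All three inputs are assumptions supplied by the caller.
    """
    # Total compute the daily budget can buy.
    flops_per_day = budget_usd_per_day / usd_per_flop
    # Rule of thumb: about 2 FLOPs per parameter per token for a forward pass.
    return flops_per_day / (2 * tokens_per_day)


if __name__ == "__main__":
    # Placeholder inputs, chosen only to show how the estimate is formed.
    budget = 100e6              # USD per day, the ceiling mentioned above
    cost_per_flop = 1e-17       # USD per FLOP (hypothetical)
    tokens_processed = 1e14     # tokens read per day across all feeds (hypothetical)
    print(max_affordable_params(budget, cost_per_flop, tokens_processed))
```

The point is just that the affordable model size scales linearly with budget and inversely with how many tokens have to be read per day, so the answer depends almost entirely on the two numbers I guessed at.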
(They may spend more on inference compute if doing so would sufficiently increase their revenue. They may also train a more expensive model just to try it out for a short while, to see whether they’re better off using it.)
Good points, especially the second one.