OpenAI’s prices seem too low to recoup even part of their capital costs in a reasonable time given the volatile nature of the AI industry. Surely I’m missing something obvious?
Yes: batching. Efficient GPU inference uses matrix-matrix multiplication, not vector-matrix multiplication: by batching many users' requests together, the weights are loaded from memory once per batch rather than once per request, so the cost per token drops dramatically.
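A minimal NumPy sketch of the idea, assuming a single toy linear layer (the layer sizes and names here are illustrative, not anything from a real serving stack): serving each request alone is a vector-matrix product, while stacking the requests into one matrix turns the whole batch into a single matrix-matrix product with the same results.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, batch = 1024, 1024, 32

# One linear layer's weight matrix and a batch of incoming requests.
W = rng.standard_normal((d_in, d_out)).astype(np.float32)
requests = rng.standard_normal((batch, d_in)).astype(np.float32)

# Unbatched: 32 separate vector-matrix multiplies,
# each one re-reading W from memory.
unbatched = np.stack([x @ W for x in requests])

# Batched: one matrix-matrix multiply over the whole batch,
# reading W once.
batched = requests @ W

# Same answers either way; only the arithmetic intensity differs.
assert np.allclose(unbatched, batched, atol=1e-3)
```

On a GPU the batched form is far faster per request, because memory bandwidth (loading W) rather than arithmetic is the bottleneck at small batch sizes.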