Yes, those margins are narrow and probably gamed. The GPT-4 numbers come from the paper on the base version, and the model has probably received modest capability upgrades since. Gemini's results also use more advanced prompting tactics.
What do you think the compute investment was? They state they used multimodal inputs (so more tokens are available in the world) and 4096-processor TPUv5 nodes, but not how many nodes or for how long.