What led you to the “equal” conclusion over the “modest advance” hypothesis? What I read from the report is that it “beat GPT-4 by a small numerical margin on all tasks but one, and is natively multimodal”.
That leads me to “modest advance”. How did you interpret the report? Do you think the margins between the two models are too narrow and easily gamed?
Yes, those margins are narrow and probably gamed. The GPT-4 numbers come from its paper, which covers the base version, and the model has probably received modest capability upgrades since. Gemini also uses more advanced prompting tactics.
What do you think the compute investment was? They state that they used multimodal inputs (more available tokens in the world) and 4096-processor TPUv5 nodes, but not how many nodes or for how long.