Interesting that Claude 3 Sonnet is ahead of Mistral Large and Gemini 1.0 Pro by some margin in LMSYS Arena. This implies significant advancement in tuning after the Claude 2 tuning disaster (where Claude 2 consistently ranked below Claude 1 and possibly inspired Goody 2). Still, LMSYS respondents prefer Opus only to the initial GPT 4 releases, not to GPT 4 Turbo, so OpenAI’s tuning advancements are probably stronger still, assuming GPT 4 Turbo is not built on a slightly stronger base model than the original GPT 4.