Arena is less kind to DeepSeek, giving it a 1179, good for 21st place, behind the open model Gemma-2-9B.
DeepSeek Coder V2 is a coding-oriented model. To evaluate how it performs at its targeted specialization, go to https://chat.lmsys.org/?leaderboard and switch the category to coding.
There you'll see it at position 10 (and, given the Arena's way of grouping ranks, it shares rank 5 with models like Claude 3 Opus and the GPT-4 January preview while trailing them slightly; it sits just above a number of formidable models, such as Gemini 1.5 Flash, an older version of Gemini 1.5 Pro, Claude 3 Sonnet, and Gemma-2-27B).
It does look pretty formidable, and it is a clear leader among open-weight models on coding tasks.