Llama-3.1-405B is not as good as GPT-4o or Claude Sonnet. Llama-3.1-70B is certainly not as good as the similarly sized Claude Sonnet. If you are just going to use an API or chat interface directly, there seems to be little reason to use Llama.
Some providers are offering 405B at prices lower than Claude 3.5 Sonnet. E.g., Fireworks is offering it for $3 input / $3 output per million tokens.
That said, I think output speed is notably worse for all providers right now.
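To put rough numbers on the price gap, here is a quick sketch. The Fireworks figures are the ones quoted above; the Claude 3.5 Sonnet prices ($3 input / $15 output per million tokens) and the workload size are my assumptions, so check current pricing pages before relying on it.

```python
def request_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Cost in USD for one request, given prices in USD per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# (input price, output price) in USD per million tokens.
LLAMA_405B_FIREWORKS = (3.0, 3.0)  # figures quoted in the comment above
CLAUDE_35_SONNET = (3.0, 15.0)     # assumption: not from the comment, verify before use

# Hypothetical workload: 2,000 input tokens and 500 output tokens per request.
llama = request_cost(2000, 500, *LLAMA_405B_FIREWORKS)
sonnet = request_cost(2000, 500, *CLAUDE_35_SONNET)

print(f"Llama 405B: ${llama:.4f} per request, Sonnet: ${sonnet:.4f} per request")
```

Under these assumptions the saving comes entirely from output tokens, so the gap only matters much for output-heavy workloads.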