I have just used it for coding for 3+ hours and found it quite frustrating. Definitely faster than GPT-4 but less capable. More like an improvement on 3.5. To me it seems a lot like LLM progress is plateauing.
Anyway, in order to be significantly more useful, a coding assistant needs to be able to see debug output in mostly real time, start/stop the program, make changes automatically, keep the user in the loop, and read/use the GUI, since that is often an important part of what we are doing. I haven't used any LLM that is even low-average at that kind of debugging thought process yet.
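Something like the loop below is what I have in mind. A rough Python sketch only, with ask_llm and apply_patch as hypothetical placeholders for the model call and the file edit:

```python
# Rough sketch of the debugging loop I mean: the assistant sees live program
# output, can rerun the program, proposes a change, and the user stays in the loop.
import subprocess

def ask_llm(prompt: str) -> str:
    """Hypothetical placeholder for whatever model/API call you use."""
    raise NotImplementedError

def apply_patch(patch: str) -> None:
    """Hypothetical placeholder for writing the proposed change to disk."""
    raise NotImplementedError

def debug_loop(cmd: list[str], max_rounds: int = 5) -> None:
    for _ in range(max_rounds):
        # Run the program and capture its debug output (stdout + stderr).
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=60)
        output = proc.stdout + proc.stderr
        if proc.returncode == 0:
            break  # program ran cleanly, nothing left to fix
        # Feed the output back to the model and ask for a proposed fix.
        patch = ask_llm(f"Program output:\n{output}\nPropose a fix as a diff.")
        print(patch)
        # Keep the human in the loop before anything is changed.
        if input("Apply this change? [y/N] ").strip().lower() != "y":
            break
        apply_patch(patch)
```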
I believe this is likely a smaller model rather than a bigger model so I wouldn’t take this as evidence that gains from scaling have plateaued.
I do think it implies something about what is happening behind the scenes when their new flagship model is smaller and less capable than what was released a year ago.
It’s a free model. Much more likely they have a paid big-boy model coming soon, imo.
How soon, and with what degree of confidence? I think they have a bigger, slower model that isn’t much of a performance improvement and is hardly economic to release.
What’s your setup? Are you using it via ChatGPT interface or via API and a wrapper?
The ChatGPT interface, like I usually do for GPT-4, with some GPT-4 queries done through the Cursor AI IDE.
Thanks!
Interesting. I see a lot of people reporting their coding experience improving compared to GPT-4, but it looks like this is not uniform; the experience differs for different people (perhaps depending on what they are doing)...