Are you implying that it is close to GPT-4 level? If so, that is clearly wrong. Especially with regard to code: everything (except maybe StarCoder, which was released literally yesterday) is worse than GPT-3.5, and much worse than GPT-4.
I’ve tried StarCoder recently, though, and it’s pretty impressive. I haven’t yet tried to really stress-test it, but at the very least it can generate basic code with a parameter count way lower than Copilot’s.
Does the code it writes work? ChatGPT can usually write a working Python module on its first try, and can make adjustments or fix bugs if you ask it to. None of the local models I've tried so far could stay coherent for something that long. In one case a model even tried to close a couple of Python blocks with curly braces. Maybe I'm just using the wrong settings.
This comment has gotten lots of upvotes, but has anyone here tried Vicuna-13B?
I have. It seems pretty good (not obviously worse than ChatGPT 3.5) at short conversational prompts, but I haven't tried technical or reasoning tasks.