Cognition Labs released a demo of Devin, an “AI coder,” i.e., an LLM with agent scaffolding that can build and debug simple applications:
https://twitter.com/cognition_labs/status/1767548763134964000
Thoughts?
It’s surprising that it’s taken this long, given how good public AI coding assistants were a year ago. I’m skeptical of anything with only closed demos and not interactive use by outside reviewers, but there’s nothing unbelievable about it.
As a consumer, I don’t look forward to the deluge of low-quality apps that’s coming (though we already have it to some extent with the sheer number of low-quality coders in the world). As a developer, I don’t like the competition (mostly for “my” junior programmers, not yet me directly), and I worry a lot about whether the software profession can make great stuff ever again.
The way I explain this to people is that current LLMs can be modeled as having three parts:
1. The improv actor, which is amazing.
2. The reasoner, which is inconsistent but not totally hopeless at simple things.
3. The planner/execution/troubleshooting engine, which is still inferior to the average squirrel trying to raid a bird feeder.
Copilot is designed to rely on (1) and (2), but it is still almost entirely reliant on humans for (3). (GPT-4 Code Interpreter is slightly better at (3).)
Since I don’t really believe in any reliable way to control a super-human intelligence for long, I do not look forward to people completely fixing (3). Sometime after that point, we’re either pets or paperclips.
It’s almost a year since ChaosGPT. I wonder what the technical progress in agent scaffolding for LLMs has been.
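For concreteness, “agent scaffolding” in the sense used above usually means a loop wrapped around the model that handles part (3) for it: run the model’s output, capture the failure, feed it back, retry. Here is a minimal sketch of that loop; the `stub_model` function is a hypothetical stand-in for a real LLM API call (which on its first attempt produces buggy code, and on the second, after seeing the traceback, a fix):

```python
from typing import Optional
import subprocess
import sys


def stub_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM call: the first draft has a
    bug; given error feedback, it returns a corrected program."""
    if feedback is None:
        return "print(undefined_name)"   # buggy first draft (NameError)
    return f"print('done: {task}')"      # "fixed" after seeing the error


def agent_loop(task: str, max_attempts: int = 3) -> str:
    """The scaffold supplies part (3): execute, observe, troubleshoot.
    The model itself only ever sees a prompt and returns text."""
    feedback = None
    for _ in range(max_attempts):
        code = stub_model(task, feedback)
        # Run the model's code in a subprocess and capture the result.
        result = subprocess.run([sys.executable, "-c", code],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip()   # success: return program output
        feedback = result.stderr           # failure: feed the traceback back
    raise RuntimeError("agent gave up after max_attempts")


print(agent_loop("hello"))  # → done: hello (recovers from the buggy draft)
```

The point of the sketch is that everything interesting lives in the loop, not the model: a real system (Devin, presumably, though the demo doesn’t show internals) would add planning, tool use, and persistent state on top of this same execute-and-retry skeleton.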