But let’s be more concrete and specific. I’d like to know what’s the least impressive task which cannot be done by a ‘non-agentic’ system, that you are very confident cannot be done safely and non-agentically in the next two years.
Focusing on the “minimal” part of that, maybe something like “receive a request to implement some new feature in a system it is not familiar with, recognize how the limitations of that system’s architecture make that feature impractical to add, and perform a major refactoring of that program to an architecture that is not so limited, while ensuring that the refactored version does not contain any breaking changes”. Obviously it would need access to tools to do this. My impression is that this is the sort of thing mid-level software developers can do fairly reliably as a nearly rote task, but that it is beyond the capabilities of modern LLM-based systems, even scaffolded ones.
Though also maybe don’t pay too much attention to my prediction, because my prediction for “least impressive thing GPT-4 will be unable to do” was “reverse a string”, and it did turn out to be able to do that fairly reliably.