I expect there to be a delay in deployment, but I think OpenAI is ultimately aiming, as a near-term goal, to automate the intellectually difficult portions of computer programming. Personally, as someone just getting into the tech industry, this is basically my biggest near-term concern, besides death. At what point might it be viable for most people to do most of what a skilled computer programmer does with the help of a large language model, and how much should this hurt salaries and career expectations?
Some thoughts:
It will probably be less difficult to safely prompt a language model for an individual “LeetCode” function than to write that function by hand within the next two years. Many more people will be able to do the former than could ever do the latter.
Yes, reducing the price of software engineering means more software engineering will be done, but it would be extremely odd if this meant software engineer salaries stayed the same, and I expect regulatory barriers to limit the amount that the software industry can grow to fill new niches.
Architecture seems difficult to automate with large language models, but a ridiculous architecture might be tolerable in some circumstances if your programmers are producing code at the speed GPT-4 does.
Creativity is hard to test, and if former programmers are now hired mostly for their ability to “innovate” or for interesting psychological characteristics beyond being able to generate code, I expect income and jobs to shift away from people with skills but no credentials toward people with lots of credentials and political acumen and no skills.
At some point programmers will be sufficiently automated away that the singularity is here. This is not necessarily a comforting thought.
Edit: Many answers contested the basic premise of the old title, “When will computer programming become an unskilled job?” The title of the post has been updated accordingly.
Since there’s a very broad spectrum of different kinds of computer programs with different constraints and desiderata, I think the transition will be very gradual. Consider the following things that are all computer programming tasks:
Helping non-technical people set up a simple blog.
Identifying and understanding the cause of unexpected behavior in a large, complicated existing system.
Figuring out how to make a much cheaper-to-run version of an existing system that uses too many resources.
Experimenting with a graphics shader in a game to see how you can make an effect that is really cool looking.
Implementing a specific known cryptographic algorithm securely.
Writing exploratory programs that answer questions about some data set to help you understand patterns in the data.
I have no doubt that sufficiently fancy AI can do or help human programmers do all these tasks, but probably in different ways and at different rates.
As an experienced programmer who can do most of these things well, I would be very surprised if my skillset were substantially obsolete in less than 5 years, and somewhat surprised if it were substantially obsolete in less than 10 years. It seems like GPT-3 and GPT-4 are not really very close to being able to do these things as well as I can, or close to being able to help a less skilled human do these things as well as I can.
One and a half years later it seems like AI tools are able to sort of help humans with very rote programming work (e.g. changing or writing code to accomplish a simple goal, implementing versions of things that are well-known to the AI like a textbook algorithm or a browser form to enter data, answering documentation-like questions about a system) but aren’t much help yet on the more skilled labor parts of software engineering.
What I expect to change quickly is that “programming languages” will go away completely. LLMs or similar tech will get us way closer to the DWIM level. Directly translating from a spec to executables will be something AI can excel at. The missing piece is the feedback: writing and executing unit tests and changing the executable (not the code!) to pass the tests.
Note that a lot of current CS concepts are human-oriented and make no sense when the human is not a part of the process: “architecture” is a crutch for the limitation of our brains. “Design” is another crutch. This can all be streamlined into “Spec->Binary”.
Even further, there is no reason to have a general-purpose computer when it is easy for an AI to convert a spec into actual hardware, such as FPGA.
Next on the list (or maybe even first on the list) is not needing the low-level executables at all: the LLM or equivalent just does what you ask of it.
Architecture is not JUST for brain limitations. Dividing a large task into separate testable (and formally provable now that we have AI to do the immense labor this takes) modules interconnected by message passing through shared memory with an eye towards performance is architecture.
It’s not simple either, for example performance requires someone or something to have a flow graph in their head or represented somewhere to know where the bottlenecks are.
I agree with you on the idea of AI bytecode authors: once you have a program in a representation tight enough that one and only one binary truth table can be constructed to model the behavior of the program (a property all programming languages have, while English doesn’t), a second AI could just write the bytecode or FPGA logic in an optimized form.
No need for compilers: the source language could be Python or higher-level, at which point all languages below it become pointless.
I would think that this part is one of the easier ones to automate. If you don’t see it that way, what do you feel the impediments that require human input would be?
Edit:
Note that this is not quite the case already: different compilers or different versions of the same compiler, or different optimization levels of the same version output different binaries even on the same platform, let alone on x86 vs ARM or something! There are best-effort promises of “one binary truth table” but it is never a guarantee.
This means the top level of the program (what it will output given binary input I) produces the same behavior, which is true for all languages.
Output = f(I) can be represented by a truth table with a row for all permutations of I.
Implementation details don’t matter.
Multithreading / runtime delays can change the sequence things output but not the possibility space of outputs.
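The truth-table claim above can be made concrete with a minimal sketch (the `majority` function is an illustrative toy program, not from the thread): any deterministic program over a fixed-width binary input is fully characterized by one table row per input permutation, so two implementations are equivalent exactly when their tables match, regardless of internal details.

```python
from itertools import product

def majority(bits):
    """Example 3-bit 'program': output 1 if a majority of inputs are 1."""
    return int(sum(bits) >= 2)

# Build the truth table: 2**3 rows, each mapping an input permutation I
# to the output f(I). Implementation details do not appear anywhere here.
truth_table = {bits: majority(bits) for bits in product((0, 1), repeat=3)}

def majority_v2(bits):
    """A different implementation (boolean algebra instead of counting)."""
    a, b, c = bits
    return (a & b) | (b & c) | (a & c)

# Equivalence of the two implementations is just equality of tables.
assert all(truth_table[bits] == majority_v2(bits) for bits in truth_table)
```

This is the sense in which “implementation details don’t matter”: the table is the program’s top-level behavior, and anything (compiler, language, hardware) that preserves the table preserves the program.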
This is definitely not my experience, having worked with C in embedded systems for some decades. Every new port has at least slightly different behavior, which sometimes matters and sometimes does not. Some of the things that tend to break: timing, especially in a multitasking system, creating or exacerbating race conditions; values in memory left uninitialized (probably not as much of an issue for more modern languages, though I have seen this in Java as well); garbage collection (say, in JS when switching browsers); implementations of markup (breaks all the time for no reason).
We must be living in different worlds...
I have a substantial amount of embedded systems experience. More than 10 years.
Note that what you are describing is almost always, in fact, faulty system behavior, which is exactly the reason you need better architectures. Many systems shipping today have faulty behavior: it passes acceptance tests but is undefined for the reasons you mention.
Determinism and timing guarantees are almost always correct software behavior. (RNGs are isolated exceptions.)
We could not fix the architecture for many systems for the obvious reason: the cost in labor to rewrite it. That cost disappears if you can ask an AI to do it in a series of well-defined steps.
Just to specify my claim a little harder, I am saying that individual threads/ISRs/processes/microservices: every actor has an initial input state, I.
And then the truth table rule above applies.
If a thread hits a mutex, and then either waits or proceeds, that mutex state is part of the truth table row.
Depending on when it got the lock and got to read a variable, the variable it actually read is still part of the same row.
Any internal variables it has are part of the row.
Ultimately for each actor, it is still a table lookup in practice.
Now, as I mentioned earlier, this is generally bad design. Pure functional programming has become the modern standard at every level, from embedded systems up to hyperscaler systems, and even low-level embedded code should be pure functional. This means a system might hold a reader and writer lock on the system state it is modifying, for example, or use other methods so that the entire state it operates on is atomic and coherent.
For example, I’ve written a 10 kHz motor controller, where it is a functional system of
PWM_outputs = f( phaseA, phaseB, resolver, PID_state, speed_filter[], current_filter[]) and a few other things. My actual implementation wasn’t so clean and the system had bugs.
This above system is atomic, I need all variables to be from the same timestep and if my code loop is too slow to do all the work before the next timestep, I need to release control of the motor (open all the gates) and shut down with an error.
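The pure-functional structure described above can be sketched roughly like this. All names, fields, and gains are illustrative stand-ins, not the real controller: the point is only that the control step is a pure function of one atomic snapshot, and that overrunning the timestep deadline forces a fail-safe shutdown.

```python
import time
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """All inputs captured atomically; every field is from the same timestep."""
    phase_a: float
    phase_b: float
    resolver_angle: float
    pid_integral: float
    timestep: int

def control_step(s: Snapshot, setpoint: float) -> tuple[float, float]:
    """Pure function: snapshot in, PWM duty cycles out. No hidden state."""
    error = setpoint - s.resolver_angle
    command = 0.5 * error + 0.1 * s.pid_integral  # toy gains, not tuned
    # Clamp both duty cycles into [0, 1].
    return (max(0.0, min(1.0, 0.5 + command)),
            max(0.0, min(1.0, 0.5 - command)))

def run_cycle(snapshot, setpoint, deadline_s, now=time.monotonic):
    """If the loop overruns its timestep, release the motor and fail safe."""
    start = now()
    pwm = control_step(snapshot, setpoint)
    if now() - start > deadline_s:
        return None  # open all gates and shut down with an error
    return pwm
```

Because `control_step` reads nothing but its frozen snapshot, its behavior is a table lookup in the sense discussed earlier: same snapshot in, same duty cycles out, every time.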
If I had an AI to do work for me, I would have asked it to do some fairly massive refactors, add more wrapper layers, etc., and then self-review its own code by rubrics to make something clean and human-readable.
All things that GPT-4 can do right now, especially if it gets a finetune on coding.
There are architectural problems with LLMs that I think prevent the future you are describing; they can only output so many tokens, and actual binaries are often up to thousands of times the token size of the actual programming languages, especially when compiled from high level languages. The compilation process is thus a useful compression step, and I don’t expect designed-for-LLM programming languages because the terabyte datasets currently necessary to train LLMs to use them won’t be available. In addition, at least for the next few years, it will be important for humans to be able to inspect and reason directly about the software the LLM has made.
But it’s possible these problems get solved soon.
I can foresee a near future where people can “program” by passing pseudo-code in plain English to LLMs, but I still think that we are nowhere near the point where programming truly becomes an unskilled job. “Writing correct algorithms in plain English” is not a common skill, and you can’t have LLMs writing perfect code 100% of the time. At the very least, you will still need a competent human to find bugs in the machine-generated code… my best guess is that the software industry will hire fewer programmers rather than less competent programmers.
...But why? I understand that LLMs produce code with bugs right now, but, why expect that they will continue to do that to an economically meaningful degree in the future? I don’t expect that.
For the same reason we don’t have self-driving cars yet: you cannot expect those systems to be perfectly reliable 100% of the time (well, actually you can, but I don’t expect such improvements in the near future just from scaling).
Humans are much better drivers than they are programmers.
Indeed it is not! But it is the one easier to automate than “create a requirement spec”. Here are plausible automation steps:
Get a feature request from a human, in plain English. (E.g. a mockup of the UI, or even “the app must take one-click payments from the shopping-cart screen”).
AI converts it into a series of screen flows etc. Repeat from step 1 until happy.
AI generates an internal test set. (Optionally reviewed by a human to see if they match the requirements. Go back to step 1 and adjust until happy.)
AI generates a mockup of all external APIs, matching the existing API specs.
AI generates the product (e.g. an app, or an ecosystem, or an FPGA, or whatever else).
AI validates the product against the test set and adjusts the product until the tests are passed.
I don’t think so; humans are worse at it than AI already. All you need is to close the feedback loop, and from what I see of the GPT-4 demos online, giving an error message back to the AI already prompts it to correct the issue. These are of course syntax errors, not semantic errors, but that is what the test suite is for: to obsolete the distinction between syntax and semantics, which the current LLMs are already pretty good at.
Yes, and they will not be “programmers”, they will be “AI shepherds” or something.
I suspect that we are thinking about different use cases here.
For very standard things without complicated logic like an e-commerce app or showcase site, I can concede that an automated workflow could work without anyone ever looking at the code. This is (sort of) already possible without LLMs: there are several Full Site Editing apps already for building standard websites without looking at the code.
But suppose that your customer needs a program able to solve a complicated scheduling or routing problem tailored to some specific needs. Maybe our non-programmer knows the theoretical structure of routing problems and can direct the LLM to write the correct algorithms, but in this case it is definitely not an unskilled job (I suspect that <1% of the general population would be able to describe a routing problem in formal terms).
If our non-programmer is actually unskilled and has no clue about routing problems… what are we supposed to do? Throw vague specs at the AI and hope for the best?
The person can… ask the AI about routing algorithms and related problems? Already now the bots are pretty good describing the current state of the field. And then come up with a workable approach interactively, before instructing the bot to spawn a specialized router app. That is to say, it will not be an unskilled job, it still requires someone who can learn, understand and make sensible decisions, which is in many ways harder than implementing a given algorithm. They just won’t be doing any “programming” as the term is understood now.
I have a major problem with the framing of your question.
Say we invented construction robots that given a blueprint for a building and an auto generated list of materials deliveries, take the materials off trucks and assemble the building. This means you no longer need ‘shift bosses’, the computers do that. You are essentially down to 5 main roles:
The architects of the building
The site foreman (who oversees construction)
Lawyers to deal with getting permits and to sue the local jurisdiction when it denies them in violation of the local jurisdiction’s or state laws (which can be automated)
Inspectors
Financing people
You may notice that the people replaced (welders, tradesmen, crane operators, etc.) are less skilled than the remaining people. (Not claiming blue-collar work is unskilled, but the time to learn to do it ‘ok’ enough to work independently is a few months of on-the-job training, with some skill gain over the years since.)
This would be true for software also. The remaining people required have to have more skills. The idea of a “pointy haired boss” with no understanding of software designing a whole app that works to production scale reliability is false.
You are imagining a scenario in which computer programmers are completely automated away, rather than one where the intellectual ceiling for becoming a computer programmer is reduced and thus more people migrate to software engineering from other jobs. I don’t find your scenario as plausible as my scenario but I suppose it could happen.
I am saying those less skilled people aren’t adding value, because the remaining tasks are the hardest ones, the ones LLMs can’t do.
It’s all architecture and dealing with coupling and extremely difficult debugging where the error messages lie to you and Google doesn’t have anything on it.
So no, unskilled people won’t migrate in.
I mean did unskilled people flood into farming when tractors were invented, or are the remaining tasks (maintaining and operating heavy equipment and planning farm interventions) more skilled?
I think you are underestimating the level of exception handling required to completely automate the average software engineer’s job, as happened to unskilled farmhands and factory workers. A slightly atypical few hours for a software engineer at the moment, as an example, might be discovering the logging facility stopped working on an important VM, SSHing in and figuring out what went wrong, and then applying a patch to another related piece of software to fix the bug. LLMs could help coach regular people through that process over the shoulder like a senior engineer, but they couldn’t automate the whole process, not because the individual pieces are too intellectually difficult but because it requires too much diverse and unsupervised tool use and investigation. If some AI successor to LLMs could be trusted to do that in the next few years, then we probably only have a short while until something FOOMs.
This is arguing against your point from your last reply above this. You said “more people migrate to software engineering from other jobs”. Your above reply contradicts that.
Hm, did I? I think if an over-the-shoulder senior engineer becomes a rounding error in terms of expenses then the solution is in fact to hire three times more engineers and pay them three times less. What do you think the implications of what I said are?
Because anything the AI cannot figure out on its own from the error, or by logging in, requesting logs, and then opening them up (which can be trivially added to current-gen AI), is not something a “junior” human engineer is likely to figure out.
Like other industries all the other times this happened, I instead expect 1⁄3 the number of engineers (for a given quantity of software) paid 3 times as much.
And because what you actually just described is from faulty architecture. A big reason why current systems are often so hard to debug and so “exception filled” is because they have trash designs. As in, they are so bad that a competent architect could trivially create a better one, but it costs so much money to rebuild a software product from scratch that the architecture becomes locked in, the technical debt permanent.
This all vanishes if AI “senior engineers” can churn out all new code to satisfy a new design, satisfying product level tests, in a few months.
The main issue I see with this prediction is that ‘computer programming becoming easier’ has already happened before, and has not had this effect.
Programming has become easier as new languages sanded the rough edges off and automated a lot of rote Assembly work, and as widespread Internet use allowed you to find solutions fast. It was much harder to code on punch cards, or in Assembly, than to code today with StackOverflow’s help.
However, this didn’t lead to programming becoming less of a career, and I don’t think it led to programmers being less well paid either.
There’s a possibility that ‘the average programmer’ might become less well-paid as the definition of ‘programmer’ expands, but I don’t anticipate a given level of programming skill becoming less valuable until AIs reach a pretty-much-singularity level.
(Disclosure: am a programmer)
As a programmer, I extensively use GPT models in my work currently. It speeds things up. I do things that are anything but easy and repeatable, but I can usually break them into simpler parts that can be written by AI much quicker than I would even review documentation.
Nevertheless, I mostly currently do research-like parts of the project and PoCs. When I sometimes work with legacy code—GPT-3 is not that helpful. Did not yet try GPT-4 for that.
What do I see for the future of my industry? A few things—but these are loose extrapolations based on GPT progress and knowledge of programming, not something very exact:
Speeding up of programmers’ work is already here. It started with GitHub Copilot and GPT-3 even before the ChatGPT boom, and it will get more popular and faster. The consequence is higher programmer productivity: more tasks can be done in a shorter time, so market pressure and the market gap for employees will fall. This means that earnings will either stagnate or fall.
Solutions that could replace a junior developer totally—ones with enough capability to write a program or useful fragment based on business requirements without being babysat by a more professional programmer—are not yet there. I suppose GPT-5 might be it, so I would guess it can get here in 1-3 years from now. Then it is likely that many programmers will lose their jobs. There will still be work for seniors (who would work with AI assistance on the more subtle and complex parts of systems and also review the work of AI).
Solutions that could replace any developer, DevOps, or system admin—I think the current GPT-4 is not even close, but it may be here in a few years. It isn’t something very far away; it feels like 2 or 3 GPT versions away, when they make it more capable and also connect it with other types of models (which is already being done). I would guess a scope of 3-10 years. Then we will likely observe most programmers losing their jobs, and likely observe AI singularity. Someone will surely use AI to iterate on AI and make it refine itself.
Curious for an update now that we have slightly better models. In my brain-dead webdev use-cases, Claude 3.5 has passed some threshold of usability.
What about 3.5 pushes it over the threshold to you that was missing in previous models?
>The consequence is the higher performance of programmers, so more tasks can be done in a shorter time so the market pressure and market gap for employees will fall. This means that earnings will either stagnate or fall.
Mostly agree with your post. Historically, higher productivity has generally led to higher total compensation, but how this affects individuals during the transition period depends on the details (e.g. how much pent-up demand for programming is there?).
You’re not accounting for an increase in demand for software. The tools to automate “basically every job on earth” are on the horizon but they won’t deploy or architect themselves. Plenty of work remaining.
And there are larger jobs you are not even considering. How many people need to supervise and work on a self replicating factory or a nanoforge research facility or a city replacement effort?
There are these big huge immense things we could do that we had nothing even vaguely close to the labor or technical ability to even try. Just because humans are more efficient per hour worked doesn’t mean the work won’t scale up even faster.
I wouldn’t want to be getting into the software engineering business right now. I have been doing this for close to 40 years. Current systems can’t replace a senior-level developer, but ChatGPT is close to a junior developer. I expect that in 2-3 years, we’ll have systems that can do the work of a junior dev.
I expect that my job will become describing specs and test cases, and more of the architectural stuff. I expect to be mostly obsolete in 6 years, but maybe as long as 10 for niche solutions.
Doesn’t this make you more valuable? I can understand feeling like you want to do other things in life but if you were planning to stay don’t these new tools expand what you can do in a day in a way that favors someone at your skill level more than someone junior?
Oh, for sure! I’m going to ride this career out into retirement if I can. I love what I do. And yeah, I’m more valuable than 10 junior devs in a box. I’ll be valuable up until there is a senior dev in a box.
If you were getting involved with programming as a novice today, is there a particular discipline or niche you’d recommend? Certain skills or tools you’d focus on? Also, are there non-technical aptitudes you think are worth training up on to stay relevant in the AI/LLM future?
🤔 So far LLMs don’t seem to be good at the big-picture stuff, so software architecture might be relevant for a while longer. The problem is that most information sources are going to be coming from the perspective of someone that understands code. I’m not really sure what that looks like in a world where the details are just handled.
Maybe Category Theory. Recency bias warning: this could be because I’m currently about 4⁄5 of the way through a lecture series on Category Theory for Programmers. Category Theory is basically all about how abstractions can be transformed, and as working programming is likely to be done at higher levels of abstraction, this seems relevant.
Honestly, my advice is this: follow what piques your interest. Don’t worry what the field will be like in 5-10 years. We’re likely wrong, and you may as well enjoy the ride. Every piece you pick up will build up to a greater understanding anyways. If you are interested in web development, back-end work will give you a better understanding of architecture, but if you just love front-end, go for it. You can always switch later. It’s good to be a generalist.
Thanks for the reply and thoughts. Given that no one really knows, it’s sound advice to follow one’s interests. Additional understanding and skills will also be a net positive. Still, being on the verge of something so transformative makes me wonder if we need to update our typical career questions/audits. Updated to what seems to be the current question though...
The most accurate answer is also the least helpful: none of us really know. Guido van Rossum has an opinion about GitHub Copilot in this interview:
but he’s really just talking about what LLMs can do now, not what they’ll be able to do in five or ten years.
Chris Lattner has an opinion about Software 2.0 here:
but Software 2.0 isn’t really the same thing. More info about Software 2.0 here:
and if you watch Chris Lattner and Lex talk for a little while longer, you’ll see that Chris has no idea how you can tell a computer to build you a webpage with a red button using just text, and admits that it’s out of his area of expertise.
I bring up these examples mostly to illustrate that nobody has any clue. Sam Altman addresses the topic the most out of all the people I’ve linked, in this video:
and the TLDR is that Lex and Sam both think LLMs can make programmers 10x more productive. Sam also thinks that instead of hiring 1/10th the number of programmers, we’ll just have 10x more code. He thinks there’s a “supply problem” of enough software engineers.
One thing I would advise is to make yourself more than just a software engineer. Lex says in his talk with Sam that he’s not worried because he’s an AI guy, not just a programmer. You might want to learn more about how AI works and try to get a job in the space, and ride the wave, or learn about information security in addition to software engineering (that’s what I’m doing, in no small part because of the influence of a one-on-one chat with 80,000 Hours), or maybe you learn a lot about oceanography or data science or something else in addition to software engineering.
Then I’d also just say that we have no idea, and if anyone says they know, they really don’t: look, a bunch of smart people discussed it and they have no clue either.
“Computer programming” describes a pretty wide range of cognitive work. Stack Overflow has already reduced the training and knowledge required for many tasks, and LLMs push this a lot further. At the next level of abstraction (understanding program flow and correctness of results), there hasn’t been much evidence that LLMs can plan or understand things well enough to do the work. And on the elements of understanding user and business needs and deciding what to build in the first place, they haven’t really had a start.
As far as timelines, I suspect it’ll transform some amount of coding into being focused on prompt generation and testing, speeding devs up by a lot, but the work will expand to fill the space rather than putting very many out of work. As we figure out how to bolt on planning, modeling, and strategic modules coordinated with LLMs, it’ll move up the stack, but this is probably decades in the making.
Note that some of this coincides with “when does GPT become self-improving”?
Note that GPT (or any other AI model) can become self-improving while still lacking the skill to write a large and complex software package from spec.
Here’s how I’m tentatively thinking about it:
As it gets cheaper to build more and better features, businesses will face stiffer competition to deliver superior products. This will create work for SWEs to implement those features.
My coding work has already shifted heavily away from writing code first drafts and toward importing, tweaking, debugging, polishing and integrating LLM outputs.
Because I’m more productive, my build-test-refine cycle shortens, so I wind up putting more attention into thinking about business objectives.
Other SWEs may gravitate toward questions like “how do I build a plugin-based pipeline to integrate software A B and C, and hardware X Y and Z, using LLM outputs to control the pipeline?” And there are all kinds of interesting technical issues there for SWEs to solve.
Companies will increasingly focus on continuously importing their data into formats compatible with LLMs—everything from emails, to office layouts, to employee personalities and skill assessments, making it all searchable and interpretable. Despite this, there will always be a bleeding edge of data that hasn’t been integrated into the system (which right now is just the familiar everyday knowledge we have rattling around in our heads, or in our inboxes), and that data-edge will be a thing that individual people define their job roles around. What exactly that is, of course, will be in continuous flux.
I think that in general, people who work in tech will be OK as long as they’re keeping up with the new LLM-based tools and ways of working.
The shift we’re looking at is going from program code that’s very close to a computer’s inner workings to natural human language for specifying systems, but where the specification must still unambiguously describe the business interest the program needs to solve. We already have a profession for unambiguously specifying complex systems with multiple stakeholders and possibly complex interactions between its parts in natural language. It’s called a legislator and it’s very much not an unskilled job.