I very strongly disagree. In my opinion, this argument appears fatally confused about the concept of “software.”
As others have pointed out, this post seems to be getting at a distinction between code and data, but many of the examples of software given by OP contain both code and data, as most software does. Perhaps the title should have been “AI is Not Code,” but since it wasn’t, I think mine is a legitimate rebuttal.
I’m not trying to make an argument by definition. My comment is about properties of software that I think we would likely agree on. I think OP ignores some properties software can have while assuming that all software shares certain other properties, to the detriment of the argument.
I think the post is correct in pointing out that traditional software is not similar to AI in many ways, but that’s where my agreement ends.
1: Software, I/O, and such
Most agree on the following basic definition: software is a set of both instructions and data, hosted on hardware, that governs how input data is transformed to some sort of output. As you point out, inputs and outputs are not software.
For example, photos of a wedding or a vacation aren’t software, even if they are created, edited, and stored using software.
Yes.
Second, when we run the model, it takes the input we give it and performs “inference” with the model. This is certainly run on the computer, but the program isn’t executing code that produces the output, it’s using the complicated probability model which grew, and was stored as a bunch of numbers.
No! It is quite literally executing code to produce the output! The fact that this specific code, together with the data it operates on, specifies a complicated probability model does not mean it is not software.
Every component of the model is software. Even the pseudorandomness of the model outputs is software (torch.randn(), often). There is no part of this inference process that generates outputs that is not software. To run inference is only to run software.
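To make that concrete, here is a minimal sketch of what "running inference" amounts to, assuming a toy one-layer model. The weights, vocabulary, and function names are all made up for illustration; a real model has vastly more numbers, but the shape of the computation is the same: ordinary code reads stored numbers, does arithmetic, and samples from the result.

```python
import math
import random

# Hypothetical "grown" parameters: just numbers, standing in for a checkpoint file.
WEIGHTS = [[0.2, -1.0, 0.5],
           [1.5, 0.3, -0.7]]
VOCAB = ["yes", "no", "maybe"]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def infer(features, rng):
    """Plain code: multiply the input by the weights, then sample an output."""
    logits = [sum(f * w for f, w in zip(features, column))
              for column in zip(*WEIGHTS)]
    probs = softmax(logits)
    return rng.choices(VOCAB, weights=probs, k=1)[0]


# The pseudorandomness is software too: same seed, same answer.
print(infer([1.0, 0.5], random.Random(0)))
```

Nothing in `infer` is anything other than instructions operating on data, which is the definition of software given above.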
2: Stochasticity
The model responds to input by using the probability model to estimate the probability of different responses, in order to output something akin to what the input data did—but it does so in often unexpected or unanticipated ways.
Software is often, but not necessarily, deterministic. Software can have stochastic or pseudorandom outputs. For example, software that generates pseudorandom numbers is still software. The fact that AI generates stochastic outputs humans don’t expect does not make it not software.
Also, software is not necessarily interpretable and outputs are not necessarily expected or expectable.
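As a small illustration, assuming nothing beyond Python's standard `random` module: the function below (a hypothetical example, not taken from any AI system) gives a different answer on every unseeded run and an identical answer on every seeded run. Either way it is the same program.

```python
import random


def noisy_measurement(true_value, rng):
    """Return true_value plus Gaussian noise drawn from rng."""
    return true_value + rng.gauss(0.0, 1.0)


# Same seed: the "random" output is exactly reproducible.
seeded_a = noisy_measurement(10.0, random.Random(123))
seeded_b = noisy_measurement(10.0, random.Random(123))

# No seed: the generator is seeded from system entropy, so this varies per run.
unseeded = noisy_measurement(10.0, random.Random())

print(seeded_a == seeded_b, unseeded)
```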
3: Made on Earth by Humans
First, we can talk about how it is created. Developers choose a model structure and data, and then a mathematical algorithm uses that structure and the training data to “grow” a very complicated probability model of different responses… The AI model itself, the probability model which was grown, is generating output based on a huge set of numbers that no human has directly chosen, or even seen. It’s not instructions written by a human.
Neither a software’s code nor its data is necessarily generated by humans.
4: I have bad news for you about software engineering
Does software work? Not always, but if not, it fails in ways that are entirely determined by the human’s instructions.
This is just not true. Many bugs are caused by specific interactions between inputs and the code + data; some are also caused by interactions among inputs, code, data, and hardware (buffer overflows being the canonical example). You could get an error due to a cosmic-ray bit flip, which has nothing to do with humans or instructions at all! Data corruption… I could go on and on.
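To show how little it takes, here is a rough sketch that simulates a single flipped bit in a stored number. `flip_bit` is a hypothetical helper written for this comment; it assumes the value is held as a standard 64-bit IEEE 754 double, which is how Python stores floats on common platforms.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return value with one bit of its 64-bit IEEE 754 encoding flipped."""
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


print(flip_bit(1.0, 0))   # lowest mantissa bit: 1.0000000000000002
print(flip_bit(1.0, 63))  # sign bit: -1.0
print(flip_bit(1.0, 62))  # top exponent bit: inf
```

No instruction changed and no human made a mistake, yet one flipped bit turns 1.0 into infinity.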
For example, unit tests are written to verify that the software does what it is expected to do in different cases. The set of cases are specified in advance, based on what the programmer expected the software to do.
… or the test is incorrect. Or both the test and the software are incorrect. Of course this assumes you wrote tests, which you probably didn’t. Also, who said you can’t write unit tests for AI? You can, and people do. All you have to do is fix the temperature parameter and random seed. One could argue benchmarks are just stochastic tests...
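Here is a rough sketch of what such a test can look like. `sample_token` and `generate` are toy stand-ins I made up for a model's sampling step, not a real library API. One caveat: on real hardware, sources of nondeterminism other than temperature and seed may remain, so treat this as the idea rather than a recipe.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Toy stand-in for a model's sampling step (hypothetical)."""
    if temperature == 0:
        # Greedy decoding: always pick the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(logits, temperature, seed, n=5):
    """Sample n tokens from one seeded generator."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(n)]


LOGITS = [0.1, 2.0, -1.0]


def test_greedy_picks_top_token():
    # Temperature 0 ignores the generator entirely.
    assert generate(LOGITS, 0, seed=None) == [1] * 5


def test_fixed_seed_reproduces_output():
    assert generate(LOGITS, 1.0, seed=7) == generate(LOGITS, 1.0, seed=7)
```

A test runner such as pytest would pick up the two `test_` functions as ordinary unit tests.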
If it fails a single unit test, the software is incorrect, and should be fixed.
Oh dear. I wish the world worked like this.
Badly written, buggy software is still software. Not all software works, and it isn’t always software’s fault. Not all software is fixable or easy to fix.
5: Implications
What we call AI in 2024 is not software. It’s kind of natural to put it in the same category as other things that run on a computer, but thinking about LLMs, or image generation, or deepfakes as software is misleading, and confuses most of the ethical, political, and technological discussions.
In my experience, thinking of AI as software leads to higher-quality conversations about the issues. Everyone understands at some level that software can break, be misused, or be otherwise suboptimal for any number of reasons.
I have found that when people begin to think AI is not software, they often devolve into dorm room philosophy debates instead of dealing with its many concrete, logical, potentially fixable issues.