I’m not confident in the full answer to this question, but I can give some informed speculation. AI progress seems to rely principally on two driving forces:
1. Scaling hardware, i.e., making training runs larger, increasing model size, and scaling datasets.
2. Software progress, which includes everything from architectural improvements to methods of filtering datasets.
On the hardware scaling side, there’s very little that an AI lab can patent. The hardware itself may be patentable: NVIDIA, for example, holds patents covering the H100. However, the mere ideas of scaling up hardware and training for longer are abstract concepts that generally cannot be patented. This may help explain why NVIDIA currently enjoys a virtual monopoly on producing AI GPUs, while there is essentially no legal barrier to entry for simply using NVIDIA’s GPUs to train a state-of-the-art LLM.
On the software side, things get a little more complicated. US courts have generally held that abstract specifications of algorithms are not patentable, even though specific implementations of those algorithms often are. As one Federal Circuit judge has explained:
In short, [software and business-method patents], although frequently dressed up in the argot of invention, simply describe a problem, announce purely functional steps that purport to solve the problem, and recite standard computer operations to perform some of those steps. The principal flaw in these patents is that they do not contain an “inventive concept” that solves practical problems and ensures that the patent is directed to something “significantly more than” the ineligible abstract idea itself. See CLS Bank, 134 S. Ct. at 2355, 2357; Mayo, 132 S. Ct. at 1294. As such, they represent little more than functional descriptions of objectives, rather than inventive solutions. In addition, because they describe the claimed methods in functional terms, they preempt any subsequent specific solutions to the problem at issue. See CLS Bank, 134 S. Ct. at 2354; Mayo, 132 S. Ct. at 1301-02. It is for those reasons that the Supreme Court has characterized such patents as claiming “abstract ideas” and has held that they are not directed to patentable subject matter.
This generally limits the degree to which an AI lab can patent the concepts underlying LLMs and thereby restrict competition through the legal process.
Note, however, that standard economic models of economies of scale generally predict that there should be a high concentration of firms in capital-intensive industries, which seems to be true for AI as a result of massive hardware scaling. This happens even in the absence of regulatory barriers or government-granted monopolies, and it predicts what we observe fairly well: a small number of large companies at the forefront of AI development.
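To make the economies-of-scale point concrete, here is a standard textbook sketch (my own illustration, not part of the original argument): suppose each frontier training effort carries a large fixed capital cost $F$ (compute clusters, data pipelines) plus a roughly constant marginal cost $c$ per unit of output $q$. Average cost is then

```latex
AC(q) = \frac{F}{q} + c,
\qquad
\frac{d\,AC}{dq} = -\frac{F}{q^{2}} < 0 .
```

Average cost declines monotonically with scale, so when $F$ is very large relative to total market demand, only a handful of firms can operate at efficient scale. High concentration follows from the cost structure alone, with no patents or regulatory barriers required.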