Here’s the most succinct and information-dense thing I can contribute.
Right now, each of these AI systems you describe, if it is using deep learning at all, is using a hand-rolled solution.
You may notice that the general problems these AI systems are trying to solve all have a very similar form: [measurements] → [some desired eventual outcome or desired classification]. You then need to subdivide the problem into separate submodules, and in many problems the submodules are going to be the same as everyone else’s.
For example, you are going to want to classify and segment the images from a video feed into a state space of [identity, locations]. So does everyone else.
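To make the shape of that shared submodule concrete, here is a minimal sketch of the interface such a component tends to converge on: frames in, [identity, location] detections out. The class and field names are placeholders of mine, not any vendor’s actual API.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Detection:
    identity: str                        # e.g. "person", "pallet", "debris"
    confidence: float                    # classifier confidence in [0, 1]
    bbox: Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max) in pixels

class PerceptionModule:
    """Thin wrapper around whatever detector/segmenter the shared component uses."""

    def __init__(self, model):
        self.model = model               # e.g. a compiled NN graph supplied by a platform

    def __call__(self, frame) -> List[Detection]:
        # self.model(frame) stands in for the platform's inference call; the point
        # is only the shape of the interface, which is the same for everyone.
        return [Detection(identity=label, confidence=score, bbox=box)
                for label, score, box in self.model(frame)]
```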
Similarly at a broader level, even if some of your algorithms have a different state space, the form of your algorithm is the same as everyone else’s.
And when you talk about your higher level graph—especially for realtime control—your system architecture is actually going to be identical to everyone else’s realtime system. You have a clock, you have deadlines, you have a directed graph, you have safety requirements. This code in particular is really expensive and difficult to get right—something you want to share with everyone else.
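As a rough illustration of how little of that skeleton is application-specific, here is a toy sketch of one clock tick over a directed task graph with deadlines. A real implementation would live in C or C++ on an RTOS; everything here is a simplified placeholder.

```python
import time

class Task:
    def __init__(self, name, fn, deadline_ms, depends_on=()):
        self.name = name
        self.fn = fn                      # the actual work, e.g. run one NN graph
        self.deadline_ms = deadline_ms    # budget this task must finish within
        self.depends_on = depends_on      # upstream task names in the directed graph

def run_cycle(tasks):
    """Execute one clock tick: run tasks in dependency order and check deadlines."""
    results = {}
    for task in tasks:                    # assumes the list is already topologically sorted
        inputs = [results[name] for name in task.depends_on]
        start = time.monotonic()
        results[task.name] = task.fn(*inputs)
        elapsed_ms = (time.monotonic() - start) * 1000.0
        if elapsed_ms > task.deadline_ms:
            # a real system would trigger a safety fallback here, not just raise
            raise RuntimeError(f"{task.name} missed its {task.deadline_ms} ms deadline")
    return results
```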
So the next major step forward is platforming. There will be some convergence to a few common platforms (and probably a round of platform wars that ultimately ends with 1-3 winners, like every other format and tech war in the past). The platforms will handle:
a. Training and development of common components
b. Payment and cross-licensing agreements
c. Model selection and design
d. Compiling models to target-specific bytecode
e. Systems code for realtime system graphs
f. RTOS, driver components for realtime systems
g. (c & d) will have to be shared in common across a variety of neural network compute platforms. There are about 100 of them now; Google’s “TPUs” were among the earliest. (A sketch of the (c)/(d) handoff follows this list.)
h. Probably housekeeping like DRM, updates, etc. will end up getting platformed as well.
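To sketch the (c)/(d) handoff: model design ends at a standard interchange file, and everything downstream of that file is the platform vendor’s target-specific compilation. This is only an illustration; the torchvision model is a stand-in and the filename is made up.

```python
import torch
import torchvision

# Step (c): model selection and design. A stock architecture here as a stand-in.
model = torchvision.models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)      # one example frame, NCHW layout

# Boundary between (c) and (d): export to an interchange format. From here on,
# the vendor's toolchain compiles perception.onnx to target-specific bytecode.
torch.onnx.export(model, dummy_input, "perception.onnx", opset_version=13)
```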
All this reuse means that larger and larger parts of AI systems will be shared with every other AI system. Moreover, common elements—solving the same problem—will automatically get better over time as the shared parts get updated. This is how you get to a really smart factory robot that doesn’t get fooled by a piece of gum someone dropped—it classifies it as [trash] because it’s sharing that part of the system with other robotic systems.
There is no economic justification to individually make that robot able to ID unexpected pieces of debris, but if it’s licensing a set of shared components that have this feature baked in, it will have that as well.
As a side note, this is why talk of a possible coming “AI winter” is bullshit. We may not reach AI sentience for many more decades, but there is still enormous room for forward progress.
Thanks for your reply! This is interesting, though I’m a little confused by some parts of it.
Is the following a good summary of your main point? A main feature of your model of AI development/deployment is that there will be many shared components of AI systems, perhaps owned by 1-3 companies, that get licensed out to people who want to use them. This is because many problems you want to solve with AI systems can be decomposed into the same kinds of subproblems, so you can reuse components that solve those subproblems many times, and there’s extra incentive to do this because designing those components is really hard. One implication of this is that progress will be faster than in a world where components are separately designed by different companies, because there is more training data per component, so components will be able to generalise more quickly.
I guess I’m confused whether there is so much overlap in subproblems that this is how things will go.
For example, you are going to want to classify and segment the images from a video feed into a state space of [identity, locations]. So does everyone else.
Hmm, it seems this is a subproblem that only a smallish proportion of companies will want to solve (e.g. companies providing police surveillance software, contact tracing software, etc.) - but really, not that many economically relevant tasks involve facial recognition. But maybe I’m missing something?
Similarly at a broader level, even if some of your algorithms have a different state space, the form of your algorithm is the same as everyone else’s.
Hmm, just because the abstract form of your algorithm is the same as everyone else’s, this doesn’t mean you can reuse the same algorithm… In some sense, it’s trivial that the abstract form of all algorithms is the same: [inputs] → [outputs]. But this doesn’t mean the same algorithm can be reused to solve all the problems.
The fact that companies exist seems like good evidence that economically relevant problems can be decomposed into subproblems that individual agents with human-level intelligence can solve. But I’m pretty uncertain whether economically relevant problems can be decomposed into subproblems that narrower systems can solve. Maybe there’s an argument in your answer that I’m missing?
Ok, so please note I do work in the field. This doesn’t mean I know everything, and I could be wrong, but I have some knowledge, much of which is under NDA.
There are many levels of similarity.
From the platform level—the platform is the NN accelerator chips, all the support electronics, the RTOS, the drivers, the interfaces, and a host of other software tools—there is zero difference between AI systems at all. The platform’s role is to take an NN graph, usually defined as a *.onnx file, and to run that graph with deterministic timing, using inputs from many sensors, each of which needs a device driver.
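Concretely, the runtime half of that role looks roughly like this. onnxruntime is just one runtime I’m using for illustration; a real platform would pin the graph to an accelerator and a hard clock, and the driver callback below is hypothetical.

```python
import numpy as np
import onnxruntime as ort

# Load the deployed NN graph (the *.onnx file handed over by the model developer).
session = ort.InferenceSession("perception.onnx")
input_name = session.get_inputs()[0].name

def on_sensor_frame(frame: np.ndarray):
    """Hypothetical driver callback: invoked once per camera frame."""
    batch = frame.astype(np.float32)[None, ...]        # add a batch dimension
    outputs = session.run(None, {input_name: batch})   # list of output arrays
    return outputs
```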
So that’s one part of the platforming—everyone deploying any kind of autonomy system will need to purchase platforms to run it on. (and there will be only a few good enough for real time tasks where safety is a factor)
From the network architecture level, again, there are many similarities. In addition, networks that solve problems in the same class can often share the same architecture. For example, 2 networks that just identify images from a dataset can be very similar in architecture even if the datasets have totally different members.
There are technical reasons why you want to use an existing, ‘known to work’ architecture, a main one being that novel architectures will take a lot more work to run in real time on your target accelerator platform.
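For instance, both of the networks below are the identical, known-to-work architecture and take the same compile path to the accelerator; only the output width and, later, the training data differ. The class counts are made up for illustration.

```python
import torchvision

# Same architecture, different problem: only num_classes and the dataset change.
defect_detector = torchvision.models.resnet50(weights=None, num_classes=12)  # factory parts
produce_grader  = torchvision.models.resnet50(weights=None, num_classes=7)   # fruit grades
```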
For different tasks that involve physical manipulation of objects in the real world, I expect there will be many similarities even if robots are doing different tasks.
Just a few: perception networks need to be similar, segmentation networks need to be similar, and so do networks that predict how real-world objects will move, that predict damage, that predict what humans may do, that predict where an optimal path might be found, and so on.
I expect there will be far more similarities than differences.
In addition, even when the network weights are totally different, using the same software and network and platform architecture means that you can share code and you merely have to repeat training on a different dataset. Example: GPT-3 trained on a different language.
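In sketch form, the shared part is the whole build-and-train pipeline and the per-licensee part is just the data. Both function arguments below are hypothetical placeholders for the shared, platformed code.

```python
def retrain(build_model, train_one_epoch, dataset, epochs=10):
    """Reuse the shared model code unchanged; only the dataset differs per licensee."""
    model = build_model()                    # identical architecture everywhere
    for _ in range(epochs):
        train_one_epoch(model, dataset)      # only this data is customer-specific
    return model

# english_model = retrain(build_model, train_one_epoch, english_corpus)
# german_model  = retrain(build_model, train_one_epoch, german_corpus)
```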
Hmm, just because the abstract form of your algorithm is the same as everyone else’s, this doesn’t mean you can reuse the same algorithm… In some sense, it’s trivial that the abstract form of all algorithms is the same: [inputs] → [outputs]. But this doesn’t mean the same algorithm can be reused to solve all the problems.
This is incorrect. You’re also not thinking abstractly enough—you’re thinking of what we see today, where AI systems are not platformed and are just a mess of Python code defining some experimental algorithm (e.g. OpenAI’s examples). That code isn’t production grade or reusable, and it has to be, or it will not be economical to use.