More than 5 minutes, because it’s fun, but:
For convenience I shall gloss this “12 orders of magnitude” thing as “suddenly impossibly fast”.
Are there energy and thermal implications here? If we did 12 orders of magnitude more computation at today’s energy cost per operation, we could probably only do it underwater at a hydroelectric dam. Things, both our devices and eventually the rest of the atmosphere, would get much warmer.
Disk and memory are now our bottlenecks for everything that used to be compute-bound. We’d probably set algorithms to designing their own future iterations, buildable on existing machinery, and end up with things that work great but are even further into incomprehensibility than the current ones. There’s a write-up out there (Adrian Thompson’s evolved FPGA tone discriminator) where an evolutionary algorithm designed a circuit and ended up with an isolated bit of “inoperative” circuitry that the design nonetheless failed to work without; expect a whole lot more of this kind of magic.
Socially, RIP all encryption from before the compute fairy arrived, and most from shortly afterwards. It’ll be nice to be able to easily break back into old diaries that I encrypted and lost the keys to, but that’s about the most ethical possible application and it’s still not great.
Socio-politically, imagine if deep fakes had just suddenly appeared in, like, 2000. Their reception would have been very different from today’s, because today we have a sort of memetic herd immunity: there exists a “we” for “we know things at or below a certain quality or plausibility could be fakes”. “We” train our collective System 1 to react to probable-fakes in a certain way, even though the fakes can absolutely fool someone without those trained patterns. Letting a technology grow up under pressure both from those who want to use it for “evil” and from those who want to prevent that shapes the tech; a sudden massive influx of computing power would not leave time for that shaping to happen.
Programming small things by writing tests and then having your compiler search for a program that passes all the tests gets a lot more feasible. Thus, “small” computers (phone, watch, microwave, newish refrigerator, recent washing machine) can be tweaked into “doing more things”, which in practice will probably mean observing us and trying to do something useful with that data. Useful to whom? Well, that depends. Computers still do what humans ask them to, and I am unconvinced that we can articulate what we collectively mean by “general intelligence” well enough to ask even an impossibly fast machine for it. We could hook it up to some biofeedback and say “make the system which makes me the most excited/frightened/interested”, but isn’t that the faster version of what video games already are?
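As a concrete (and heavily hedged) illustration of that test-driven search idea, here is a minimal Python sketch: enumerate tiny arithmetic expressions and keep the first one that passes every test. The `synthesize` helper, the expression grammar, and the `target_tests` examples are all invented for this sketch; real program-synthesis tools prune the search far more cleverly, but 12 OOMs of spare compute would let even naive enumeration go surprisingly far.

```python
# Naive test-driven program synthesis: enumerate expressions until one
# satisfies every (input, output) test case. Purely illustrative.
import itertools

def synthesize(tests, max_depth=2):
    """Return the first expression in x that passes all tests, or None."""
    def expressions(depth):
        # Leaves: the input variable and a few small constants.
        yield from ["x", "1", "2", "3"]
        if depth > 0:
            for op in ["+", "-", "*"]:
                for left, right in itertools.product(expressions(depth - 1), repeat=2):
                    yield f"({left} {op} {right})"

    for expr in expressions(max_depth):
        fn = eval(f"lambda x: {expr}")
        if all(fn(inp) == out for inp, out in tests):
            return expr
    return None

# The "tests" describe the program we want, e.g. f(x) = x*x + 1.
target_tests = [(0, 1), (1, 2), (2, 5), (3, 10)]
print(synthesize(target_tests))
```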
I forgot about memory! I guess I should have just said “magically 12 OOMs cheaper” instead of “magically 12 OOMs faster”, though that’s a bit weirder to imagine.
Why limit ourselves to our planet? 12 OOMs is well within reason if we were a type 2 civilization with access to all the energy of our sun (our planet as a whole intercepts only a few parts in 10^10 of its output).
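Some rough numbers behind that (these are my own ballpark figures, not from the comment): the Sun radiates about 3.8×10^26 W, Earth intercepts only ~1.7×10^17 W of that, and civilization currently runs on roughly 2×10^13 W, so a full type 2 energy budget really does sit about 13 orders of magnitude above today’s consumption.

```python
# Back-of-the-envelope check on the type 2 civilization claim.
# All figures are rough public estimates, not values from the comment.
import math

solar_luminosity_w = 3.8e26   # total radiated power of the Sun
earth_intercept_w = 1.7e17    # sunlight hitting Earth's cross-section
civilization_w = 2e13         # rough average power use of civilization today

print(f"fraction of the Sun's output Earth receives: "
      f"{earth_intercept_w / solar_luminosity_w:.1e}")            # ~4.5e-10
print(f"OOMs from today's energy use up to the whole Sun: "
      f"{math.log10(solar_luminosity_w / civilization_w):.1f}")   # ~13.3
```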
Encryption wouldn’t really be an issue; we can simply tune our algorithms to rest on slightly harder assumptions (or just larger parameters). After all, one can pick a problem whose attack cost scales as O(10^(6n)), where n could for example be the secret key length: with 12 orders of magnitude more compute against you, adding just 2 to n (let alone doubling the key) restores the original security margin, and you still have your cryptography.
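To sanity-check that arithmetic (my numbers, not part of the original comment): brute-forcing a standard symmetric key doubles in cost with every added bit, and 10^12 is just under 2^40, so about 40 extra key bits absorb the entire speedup; under the 10^(6n) scaling above, adding 2 to n does the same job.

```python
# How much longer do keys need to get to absorb a 10^12 speedup?
import math

speedup = 1e12

# Standard symmetric-key picture: attack cost ~ 2^(key bits).
print(f"extra key bits needed: {math.log2(speedup):.1f}")        # ~39.9

# The 10^(6n) scaling from the comment: attack cost ~ 10^(6*n).
print(f"extra key length n needed: {math.log10(speedup) / 6}")   # 2.0
```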
The thought of how small computers (phones etc.) would scale also occurred to me. Basically, with 12 OOMs every phone becomes a computer powerful enough to train something as complicated as GPT-3. Everyone could carry their own personalized GPT-3 model with them. Actually, this is another way to improve AI performance: reduce the number of things the model needs to cover. Training a personalized model specific to one problem would be cheaper and would need fewer parameters/layers to get useful results.
In other words, we would be able to put a “small” specialized model with GPT-3-like power on every microcontroller.
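To make the “small specialized model” point concrete, here is a minimal, hedged PyTorch sketch; the data, feature sizes, and task are all invented. It trains a tiny classifier only on one user’s examples for one narrow task, and the point is simply that such a task can be covered by a few thousand parameters rather than billions.

```python
# Tiny "personal" model sketch (PyTorch): a small classifier trained only on
# one user's data for one narrow task. Dataset and sizes are invented.
import torch
from torch import nn

torch.manual_seed(0)

# Pretend these are featurized examples from a single user's device
# (e.g. sensor readings -> "notify me" yes/no): 64 features, 500 samples.
x = torch.randn(500, 64)
y = (x[:, :8].sum(dim=1) > 0).long()   # synthetic stand-in for personal labels

model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(200):                # trivially cheap; 12 OOMs is overkill here
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()

print("final loss:", loss.item())
print("parameters:", sum(p.numel() for p in model.parameters()))  # a few thousand
```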
You mentioned deep fakes. But with this much compute, why not “deepfake” brand new networks from scratch? Doing such experiments right now is expensive, since this “second order” training mode roughly quadruples the computational resources required.
Theoretically, there’s nothing preventing one from constructing a network that can assemble new networks from a set of training parameters. This meta-network could draw on the structure it “learned” while training smaller models for other applications in order to generate new models with an order of magnitude fewer parameters.
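This idea exists today under the name “hypernetworks”: one network emits the weights of another. Below is a minimal, hedged PyTorch sketch (all sizes and the “task embedding” are invented) of a meta-network that generates the weights of a small target network from a task-description vector. With 12 OOMs of headroom, such a generator could plausibly be trained across many prior tasks, which is the “learned from training smaller models” part of the idea above.

```python
# Minimal hypernetwork sketch (PyTorch): a meta-network maps a task embedding
# to the weight matrix and bias of a small target network, which is then
# applied functionally. Sizes and the "task embedding" are invented.
import torch
from torch import nn
import torch.nn.functional as F

IN_DIM, OUT_DIM, TASK_DIM = 16, 4, 8

class HyperNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Emits OUT_DIM*IN_DIM weights plus OUT_DIM biases for the target net.
        self.generator = nn.Sequential(
            nn.Linear(TASK_DIM, 64), nn.ReLU(),
            nn.Linear(64, OUT_DIM * IN_DIM + OUT_DIM),
        )

    def forward(self, task_embedding, x):
        params = self.generator(task_embedding)
        weight = params[: OUT_DIM * IN_DIM].view(OUT_DIM, IN_DIM)
        bias = params[OUT_DIM * IN_DIM:]
        # The target "network" here is a single generated linear layer.
        return F.linear(x, weight, bias)

hyper = HyperNet()
task = torch.randn(TASK_DIM)       # stand-in for a learned task description
x = torch.randn(32, IN_DIM)        # a batch of inputs for that task
print(hyper(task, x).shape)        # torch.Size([32, 4])
```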
As an analogy, compare the amount of data a neural network needs to learn to tell cats from dogs with the amount a human needs for the same task. A human only needs a couple of examples, while a neural network needs many hundreds of examples just to pick up the underlying concepts of shape and topology.