2. “Don’t worry about developing calculus, questioning the geocentric model of the solar system, etc.” is the wrong decision in the fictional example Eliezer provided. You suggest, “once you start getting spaceplanes into orbit and notice that heading right for the moon isn’t making progress, you could probably get together some mathematicians and scrum together a rough model of orbital mechanics in time for the next launch”. I don’t think this is a realistic model of how basic research works. Possibly this is a crux between our models?
The theoretical framework behind current AI research essentially amounts to: “here is what we are regressing between, inputs X and targets Y; or here is some input data X, the outputs Y produced in response, and a reward R.” The objective is the highest percentage correct or the biggest R. And for more complex reasons that I’m going to compress here, you also care about the distribution of the responses.
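A minimal sketch of what I mean by that framing. The function names here are mine and purely illustrative, not any particular library’s API; assume `predict`, `policy`, and `environment` are whatever callables you have on hand.

```python
import numpy as np

def supervised_objective(predict, X, Y):
    """Supervised regression framing: drive down the error between
    predicted outputs and the true targets Y (mean squared error here)."""
    return float(np.mean((predict(X) - Y) ** 2))

def rl_objective(policy, environment, X):
    """Reinforcement learning framing: maximize the average reward R the
    environment hands back for the outputs the policy emits."""
    rewards = [environment(x, policy(x)) for x in X]
    return float(np.mean(rewards))
```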
This is something we can run with. We can iteratively deploy an overall framework, a massive AI platform backed by a consortium of companies and offering the best and most consistent performance, that supports ever more sophisticated agent architectures. At first, the supported architectures handle problems where the feedback is immediate and the environment the system operates in is nearly Markovian and low on noise; later we will be able to solve more abstract problems.
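One way to picture the “immediate feedback, nearly Markovian” starting point is an interface along these lines. This is a hypothetical sketch of mine, not part of the platform described above:

```python
from typing import Any, Protocol, Tuple

class ImmediateFeedbackTask(Protocol):
    """Hypothetical interface for the easy end of the spectrum: the next
    state depends only on the current state and action (Markov property),
    and the reward arrives immediately after each step."""

    def reset(self) -> Any:
        """Start a fresh episode and return the initial state."""
        ...

    def step(self, action: Any) -> Tuple[Any, float, bool]:
        """Apply one action; return (next_state, immediate_reward, done)."""
        ...
```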
With this basic idea we can replace most current jobs on Earth and develop fully autonomous manufacturing, resource gathering, construction, and so on.
Automating scientific research: there’s a way to extend this kind of platform to design experiments autonomously. Essentially you build on a lower-level predictive model by predicting the outcomes of composite experiments that exercise multiple phenomena at once, and you run more experiments wherever the predictive variance is high. It’s difficult to explain and I don’t have it fully mapped out, but I think developing a systematic model of how macroscale mechanical systems work could be done autonomously. Then the same idea scales to how low-level subatomic systems work, to iteratively engineering nanotechnology, and maybe to working through cell biology the same way.
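To make “run more experiments where the variance is high” concrete, here is a rough sketch of the selection step, assuming an ensemble of predictive models with a `.predict` method and candidate experiments encoded as a NumPy array. All of this is my own illustration of the idea, not an existing system:

```python
import numpy as np

def select_experiments(candidate_experiments, models, batch_size=10):
    """Pick the next experiments where the ensemble disagrees the most."""
    # Each model predicts an outcome for every candidate experiment.
    predictions = np.stack([m.predict(candidate_experiments) for m in models])
    # Disagreement across the ensemble approximates predictive uncertainty.
    variance = predictions.var(axis=0)
    # Run the experiments the models are least sure about.
    most_uncertain = np.argsort(variance)[-batch_size:]
    return candidate_experiments[most_uncertain]
```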
Umm, maybe the big picture will explain it better: you have hundred-story-plus megaliths of robotic test cells, where the cells themselves were made in an automated factory. For cracking problems like nanotechnology or cell biology, each test cell runs an experiment at some level of integration to shore up the unreliable parts. For example, if you have nanoscale gears and motors working well but not switches, each test cell searches possible variants of a switch, not by sweeping the entire grid of possibilities, but by using search trees to guess where a successful switch design might be, until that piece works.
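A toy sketch of that “search tree instead of the entire grid” idea, under my own assumptions: `expand` generates nearby design variants, `predicted_score` is a cheap model’s guess at how promising a variant is, and `test_in_cell` stands in for running the physical experiment in one test cell.

```python
import heapq

def guided_design_search(root_design, expand, predicted_score, test_in_cell,
                         budget=1000):
    """Best-first search over a design space: expand candidate designs in
    order of predicted promise, physically testing only the best branches."""
    # heapq is a min-heap, so negate the score; the counter breaks ties.
    frontier = [(-predicted_score(root_design), 0, root_design)]
    counter, tested = 1, 0
    while frontier and tested < budget:
        _, _, design = heapq.heappop(frontier)
        if test_in_cell(design):          # physical experiment in a test cell
            return design                 # found a working variant
        tested += 1
        for child in expand(design):      # nearby variants of this design
            heapq.heappush(frontier, (-predicted_score(child), counter, child))
            counter += 1
    return None                           # budget exhausted without success
```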
And you have a simulator, a system using both learnable weights and some hand-built structure, that predicts which switch designs won’t work. You feed back into the simulator the error between what it predicted would happen and what the robotic test waldos actually find in reality. Each such update makes the simulation model, and therefore the overall effort to design the next piece of the long road to nanoscale self-replicating factories, more likely to succeed.
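The update step might look something like this. I’m using PyTorch purely for illustration and assuming the learnable part of the simulator is a differentiable module with designs and measured outcomes encoded as tensors; none of these names come from an actual system.

```python
import torch

def update_simulator(simulator, optimizer, designs, measured_outcomes):
    """One corrective step: nudge the simulator by the gap between its
    prediction and what the robotic test cells actually measured."""
    predicted = simulator(designs)                       # simulator's guess
    error = torch.nn.functional.mse_loss(predicted, measured_outcomes)
    optimizer.zero_grad()
    error.backward()                                     # push the error back
    optimizer.step()                                     # now tracks reality a bit better
    return error.item()
```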
And a mix of human scientists/engineers and scripts that call machine learning models decides what to do next once a particular piece of the problem is reliably solved.
There are humans involved; it would not be a hands-off system. The robotic system operating in each test cell uses a well-known, rigidly designed architecture that can be understood, even if you don’t know how the details of each module function, since the modules are weighted combinations of multiple machine learning algorithms, some of which were in turn developed by other algorithms.
I have a pet theory that even if you could build a self-improving AI, you would need to give it access to such megaliths (a cube of modular rooms as wide on each side as it is tall, where each room was made in a factory, trucked onto the site, and installed by robots) to generate the clean information needed to do the kinds of magical things we think superintelligent AIs could do.
Robotic systems are the way to get that information because each step they perform is replicable. You subtract what happens without the robotic arm’s intervention from what happens with it, which gives you clean data containing only the intervention’s effect, plus whatever variance the system you are analyzing has inherently. I have a theory that things like nanotechnology, or the kind of real medicine that could reverse human biological age and turn off all possible tumors, or all the other things we know the laws of physics permit but we cannot yet do, can’t be found in a vacuum. If you could build an AI “deity”, it couldn’t come up with these solutions from just what humans have published (whether that is every scientific journal ever written or every written word and recorded image), because far too much uncertainty would remain. Even with all that information analyzed, you still wouldn’t know exactly what a given arrangement of nanoscale gears will do in a vacuum chamber, or what the optimal drug regimen to prevent Ms. Smith from having another myocardial infarction would have been. You could probably get closer than humans ever have, but you would still need to manipulate the environment to find out what you needed to know.
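The “subtract the no-intervention run” step is simple enough to write down. A minimal sketch, assuming repeated control runs and intervention runs recorded as arrays of measurements:

```python
import numpy as np

def isolate_intervention_effect(control_runs, intervention_runs):
    """Estimate the effect of the robotic intervention by differencing
    against the no-intervention baseline; the spread of the control runs
    estimates the system's inherent variance."""
    control = np.asarray(control_runs)
    intervention = np.asarray(intervention_runs)
    effect = intervention.mean(axis=0) - control.mean(axis=0)
    inherent_noise = control.std(axis=0)
    return effect, inherent_noise
```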
This is the concrete reason for my assessment that out-of-control AGI is probably not as big a risk as we think. If such machines can’t find the information needed to kill us all without systematically investigating it with a large amount of physical infrastructure, and the host hardware for such a system is specialized rather than freely available on unsecured systems across the internet, and we haven’t actually designed these systems with anything like self-reflection, much less awareness, then the scenario seems pretty implausible.
But I could be wrong. Having a detailed model of how I think such things would really work, based on my previous work with present-day AI, doesn’t necessarily make me correct. But I certainly feel more correct.