Even though I come from a somewhat different viewpoint, I was fairly impressed with the case you are presenting. Nonetheless...
Creating a powerful AI system without understanding why it works is dangerous.
A large portion of the risk from machine superintelligence comes from the possibility of people building systems that they do not fully understand. Currently, this is commonplace in practice: many modern AI researchers are pushing the capabilities of deep neural networks in the absence of theoretical foundations that describe why they’re working so well or a solid idea of what goes on beneath the hood.
This is not quite accurate. Yes, anybody can download a powerful optimizer and use it to train a network that they don’t understand. But those are not the people you need to worry about; that is not where the danger lies.
The concern that ML has no solid theoretical foundations reflects the old computer science worldview, which is all based on finding bit exact solutions to problems within vague asymptotic resource constraints.
Old computer science gave us things like convex optimization, which is not very interesting (it only works well for simple, uninteresting problems). Modern AI/ML is much more like computer graphics or simulation, where everything is always an approximation and traditional computer science techniques are mostly useless. There is no ‘right answer’; there is just an endless sea of approximations that have varying utility/cost tradeoffs.
A good ML researcher absolutely needs a good idea of what is going on under the hood—at least at a sufficient level of abstraction. The optimization engine does most of the nitty-gritty work—but it is equivalent to the researcher employing an army of engineers and dividing the responsibility up so that each engineer works on a tiny portion of the circuit. To manage the optimizer, the researcher needs a good high-level understanding of the process, although not necessarily the details.
Also, we do have some theoretical foundations for DL—Bayesian inference, for one. Using gradient descent on the joint log PDF is a powerful approximate inference strategy.
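As a minimal sketch of what I mean (a toy conjugate Gaussian model, with everything chosen for simplicity): gradient descent on the negative joint log PDF is just MAP estimation, the crudest useful form of approximate Bayesian inference.

```python
import numpy as np

# Toy model: observations y ~ N(theta, 1), prior theta ~ N(0, 1).
# The joint log PDF is log p(y | theta) + log p(theta); gradient descent
# on its negative (i.e. gradient ascent on the joint log PDF) gives the
# MAP estimate -- a simple form of approximate Bayesian inference.

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=50)   # made-up data

def grad_joint_log_pdf(theta):
    # d/dtheta [ sum_i log N(y_i | theta, 1) + log N(theta | 0, 1) ]
    return np.sum(y - theta) - theta

theta, lr = 0.0, 0.01
for _ in range(1000):
    theta += lr * grad_joint_log_pdf(theta)

# Closed-form MAP for this conjugate model, for comparison.
print(theta, np.sum(y) / (len(y) + 1))
```

Richer approximations (variational inference, MCMC) go beyond this point estimate, but the gradient-on-the-joint-log-PDF core is the same.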
When designing a superintelligent agent, by contrast, we will want an unusually high level of confidence in its safety before we begin online testing: trial and error alone won’t cut it, in that domain.
It appears you are making the problem unnecessarily difficult.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child-like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
This is an approach researchers have used successfully in the past.
Our main open-problem generator — “what would we be unable to solve even if the problem were easier?” — is actually a fairly common one used across mathematics and computer science. It’s easier to recognize if we rephrase it slightly: “can we reduce the problem of building a beneficial AI to some other, simpler problem?”
This again reflects the old ‘hard’ computer science worldview, and obsession with exact solutions.
If it seems really really really impossibly hard to solve a problem even with the ‘simplification’ of lots of computing power, perhaps the underlying assumptions are wrong. For example—perhaps using lots and lots of computing power makes the problem harder instead of easier.
How could that be? Because with lots and lots of compute power, you are naturally trying to extrapolate the world model far far into the future, where it branches enormously and grows in complexity exponentially. Then when you try to define a reasonable utility/value function over the future world model, it becomes almost impossible because the future world model has exploded exponentially in complexity.
So it may actually be easier to drop the traditional computer science approach completely. Start with a smaller, more limited model that doesn’t explode, and then approximately extrapolate both the world model and the utility/value function together.
This must be possible in principle, because human children learn that way. Realistically there isn’t room in the DNA for a full adult utility/value function, and it wouldn’t work in an infant brain anyway without the world model. But evolution solved this problem approximately, and we can learn from it and make do.
Thanks for the reply, Jacob! You make some good points.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child-like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
I endorse eli_sennesh’s response to this part :-)
This again reflects the old ‘hard’ computer science worldview, and obsession with exact solutions.
I am not under the impression that there are “exact solutions” available, here. For example, in the case of “building world-models,” you can’t even get “exact” solutions using AIXI (which does Bayesian inference using a simplicity prior in order to guess what the environment looks like; and can never figure it out exactly). And this is in the simplified setting where AIXI is large enough to contain all possible environments! We, by contrast, need to understand algorithms which allow you to build a world model of the world that you’re inside of; exact solutions are clearly off the table (and, as eli_sennesh notes, huge amounts of statistical modeling are on it instead).
I would readily accept a statistical-modeling-heavy answer to the question of “but how do you build multi-level world-models from percepts, in principle?”; and indeed, I’d be astonished if you avoided it.
Perhaps you read “we need to know how to do X in principle before we do it in practice” as “we need a perfect algorithm that gives you bit-exact solutions to X”? That’s an understandable reading; my apologies. Let me assure you again that we’re not under the illusion you can get bit-exact solutions to most of the problems we’re working on.
For example—perhaps using lots and lots of computing power makes the problem harder instead of easier. How could that be? Because with lots and lots of compute power, you are naturally trying to extrapolate the world model far far into the future, where it branches enormously [...]
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can’t just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can’t be that.
Perhaps the point where we disagree is that you think these hurdles suggest that figuring out how to do things we can’t yet do in principle is hopeless, whereas I’m under the impression that these shortcomings highlight places where we’re still confused?
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier.
Additional computing power might not make the problem literally harder, but the assumption of limitless computing power might direct your attention towards wrong parts of the search space.
For example, I suspect that the whole question about multilevel world-models might be something that arises from conceptualizing intelligence as something like AIXI, which implicitly assumes that there’s only one true model of the world. It can do this because it has infinite computing power and can just replace its high-level representation of the world with one where all high-level predictions are derived from the basic atom-level interactions, something that would be intractable for any real-world system to do. Instead real-world systems will need to flexibly switch between different kinds of models depending on the needs of the situation, and use lower-level models in situations where the extra precision is worth the expense of extra computing time. Furthermore, those lower-level models will have been defined in terms of what furthers the system’s goals, as defined on the higher-levels: it will pay preferential attention to those features of the lower-level model that allow it to further its higher-level goals.
In the AIXI framing, the question of multilevel world-models is “what happens when the AI realizes that the true world model doesn’t contain carbon atoms as an ontological primitive”. In the resource-limited framing, that whole question isn’t even coherent, because the system has no such thing as a single true world-model. Instead the resource-limited version of how to get multilevel world-models to work is something like “how to reliably ensure that the AI will create a set of world models in which the appropriate configuration of subatomic objects in the subatomic model gets mapped to the concept of carbon atoms in the higher-level model, while the AI’s utility function continues to evaluate outcomes in terms of this concept regardless of whether it’s using the lower- or higher-level representation of it”.
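A toy illustration of that reframing (the representations and the mapping rule here are entirely made up): the utility function is written once against the high-level concept, and a mapping from the low-level state produces that concept, so the same utility can be evaluated from either level.

```python
# Toy two-level world model (entirely hypothetical representation).
# Low level: a list of "atoms" with element labels and bond counts.
# High level: a count of "diamond" structures.
# The utility function is defined once, over the high-level concept;
# a mapping from the low-level state produces that concept, so the
# same utility can be evaluated from either representation.

from dataclasses import dataclass

@dataclass
class Atom:
    element: str
    covalent_bonds: int

def low_to_high(atoms):
    """Map a low-level state to the high-level concept 'diamond count'.

    Stand-in rule: every 8 four-bonded carbon atoms count as one
    'diamond' unit cell (a deliberately crude abstraction).
    """
    carbons = sum(1 for a in atoms if a.element == "C" and a.covalent_bonds == 4)
    return {"diamonds": carbons // 8}

def utility(high_level_state):
    # Defined over the high-level concept only.
    return high_level_state["diamonds"]

low_level_state = [Atom("C", 4)] * 24 + [Atom("O", 2)] * 5
print(utility(low_to_high(low_level_state)))   # -> 3
```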
As an aside, this reframed version seems like the kind of question that you would need to solve in order to have any kind of AGI in the first place, and one which experimental machine learning work would seem the best suited for, so I’d assume it to get naturally solved by AGI researchers even if they weren’t directly concerned with AI risk.
Ohoho! Well, actually Nate, I personally subscribe to the bounded-rationality school of thinking, and I do think this has implications for AI safety. Specifically: as the agent acquires more resources (speed and memory), it can handle larger problems and enlarge its impact on the world, so to make a bounded-rational agent safe we should, hypothetically, be able to state safety properties explicitly in terms of how much cognitive stuff (philosophically, it all adds up as different ingredients of that magic word “intelligence”) the agent has.
With some kind of framework like that, we’d be able to state and prove safety theorems in the form of, “This design will grow increasingly uncertain about its value function as it grows its cognitive resources, and act more cautiously until receiving more training, and we have some analytic bound telling us exactly how fast this fall-off will happen.” I can even imagine it running along the simple lines of, “As the agent’s model of the world grows more complicated, the entropy/Kolmogorov complexity of that model penalizes hypotheses about the learned value function, thus causing the agent to grow increasingly passive and wait for value training as it learns and grows.”
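A crude sketch of that simple rule (every name and functional form here is an invented placeholder, not a worked-out proposal):

```python
import math

def value_confidence(model_complexity_bits, training_examples, k=0.01):
    # Hypothetical rule: confidence in the learned value function decays
    # as the world model's description length grows relative to the
    # amount of value training received.
    return math.exp(-k * model_complexity_bits / max(training_examples, 1))

def choose_action(candidate_actions, estimated_value, model_complexity_bits,
                  training_examples, confidence_threshold=0.5):
    conf = value_confidence(model_complexity_bits, training_examples)
    if conf < confidence_threshold:
        return "wait_for_more_value_training"   # cautious default
    return max(candidate_actions, key=estimated_value)

# As the model grows (more bits) without new value training, the agent
# goes passive; more training restores activity.
print(choose_action(["a", "b"], {"a": 1, "b": 2}.get, 100, 50))     # acts
print(choose_action(["a", "b"], {"a": 1, "b": 2}.get, 10000, 50))   # waits
```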
This requires a framework for normative uncertainty that formalizes acting cautiously when under value-uncertainty, but didn’t someone publish a thesis on that at Oxford a year or two ago?
I would readily accept a statistical-modeling-heavy answer to the question of “but how do you build multi-level world-models from percepts, in principle?”; and indeed, I’d be astonished if you avoided it.
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can’t just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can’t be that.
Well, as jacob_cannell pointed out, feeding more compute power to a bounded-rational agent ought to make it enlarge its models in terms of theory-depth, theory-preorder-connectedness, variance-explanation, and time-horizon. In short: the branching factors and the hypothesis class get larger, making it harder to learn (if we’re thinking about statistical learning theory).
There’s also the specific issue of assuming Turing-machine-level compute power, assuming that “available compute steps” and “available memory” are unbounded but finite natural numbers. Since you’ve not bounded the number, it’s effectively infinite, which of course means that two agents, each of which is “programmed” as a Turing machine with Turing-machine resources rather than strictly finite resources, can’t reason about each other: either one would need ordinal numbers to think about what the other (or itself) can do, but actually using ordinal numbers in that analysis would be necessarily wrong (in that neither actually possesses a Turing Oracle, which is equivalent to having ω_0 steps of computation).
So you get a bunch of paradox theorems making your job a lot harder.
In contrast, starting from the assumption of having strictly finite computing power is like when E.T. Jaynes starts from the assumption of having finite sample data, finite log-odds, countable hypotheses, etc.: we assume what must necessarily be true in reality to start with, and then analyze the infinite case as passing to the limit of some finite number. Pascal’s Mugging is solvable this way using normal computational Bayesian statistical techniques, for instance, if we assume that we can sample outcomes from our hypothesis distribution.
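A toy version of that last claim (the setup is mine): estimate expected utility from a finite number of outcome samples drawn from the hypothesis distribution; a hypothesis whose probability is far below 1/N simply never shows up in the sample, so its astronomical payoff never dominates the estimate.

```python
import random

random.seed(0)

# Hypothesis distribution over outcomes: a mundane outcome with modest
# utility, and a "mugger" outcome with astronomical utility but
# astronomically small probability.
outcomes = [
    {"p": 1.0 - 1e-12, "utility": 1.0},
    {"p": 1e-12,       "utility": 3 ** 100},   # the mugger's promise
]

def sample_outcome():
    r = random.random()
    acc = 0.0
    for o in outcomes:
        acc += o["p"]
        if r <= acc:
            return o
    return outcomes[-1]

N = 100_000
estimate = sum(sample_outcome()["utility"] for _ in range(N)) / N
# With any feasible N far below 1/p, the mugger outcome is never sampled
# and the estimated expected utility stays near 1.0.
print(estimate)
```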
Let me assure you again that we’re not under the illusion you can get bit-exact solutions to most of the problems we’re working on.
Ok—then you are moving into the world of heuristics and approximations. Once one acknowledges that the bit exact ‘best’ solution either does not exist or cannot be found, then there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
Consider the algorithms employed in computer graphics and simulation—which is naturally quite related to the world modelling problems in your maximize-diamond example. The best algorithms and techniques employ some reasonably simple principles—such as hierarchical bounded approximations over octrees, or bidirectional path tracing—but a full system is built from a sea of special-case approximations customized to particular types of spatio-temporal patterns. Nobody bothers trying to prove that new techniques are better than old ones, nobody bothers using formal tools to analyze the techniques, because the algorithmic approximation tradeoff surface is far too complex.
In an approximation driven field, new techniques are arrived at through intuitive natural reasoning and are evaluated experimentally. Modern machine learning seems confusing and ad-hoc to mathematicians and traditional computer scientists because it is also an approximation field.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child-like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
I endorse eli_sennesh’s response to this part :-)
Ok, eli said:
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence. When working in an approximation framework where formal analysis does not work and everything must be proven experimentally—this is simply the best that we can do.
If we could somehow ‘guarantee’ safety that would be nice, but can we guarantee safety of future human populations?
And now we get into that other issue—if you focus entirely on solving problems with unlimited computation, you avoid thinking about what the final practical resource efficient solutions look like, and you avoid the key question of how resource efficient the brain is. If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
So if AGI is broad enough to include artificial brains or ems—then a friendly AI theory which can provide safety guarantees for AGI in general should be able to provide guarantees for artificial brains—correct? Or is it your view that the theory will be more narrow and will only cover particular types of AGI? If so—what types?
I think those scope questions are key, but I don’t want to come off as a hopeless negative critic—we can’t really experiment with AGI just yet, and we may have limited time for experimentation. So to the extent that theory could lead practice—that would be useful if at all possible.
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder
I hope the context indicated that I was referring to conceptual hardness/difficulty in finding the right algorithm. For example, consider the problem of simulating an infinite universe. If you think about the problem first in the case of lots of compute power, that abundance may actually become a red herring. The true solution will involve something like an output-sensitive algorithm (asymptotic complexity does not depend at all on the world size), as in some games, and thus having lots of compute is irrelevant.
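A sketch of what I mean by output sensitive, in the spirit of procedurally generated game worlds (the details are invented): the world is a pure function of coordinates, so query cost depends only on the region observed, never on the world size.

```python
import hashlib

def cell_contents(x, y, z):
    """Deterministic, lazily evaluated 'infinite world': the contents of a
    cell are a pure function of its coordinates, so nothing is computed or
    stored for regions nobody observes."""
    h = hashlib.sha256(f"{x},{y},{z}".encode()).digest()
    return "rock" if h[0] < 128 else "empty"

def observe_region(x0, y0, z0, size):
    # Cost scales with the observed volume (the "output"), never with the
    # total world size, which is effectively unbounded.
    return {(x, y, z): cell_contents(x, y, z)
            for x in range(x0, x0 + size)
            for y in range(y0, y0 + size)
            for z in range(z0, z0 + size)}

# Observing a 4x4x4 region a trillion cells away costs the same as one
# at the origin.
print(len(observe_region(10**12, 0, 0, 4)))   # -> 64
```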
I suspect that your maximize diamond across the universe problem is FAI-complete. The hard part is specifying the ‘diamond utility function’, because diamonds are a pattern in the mind that depends on the world model in the mind. The researcher needs to transfer a significant fraction of their world model or mind program into the machine—and if you go to all that trouble then you might as well use a better goal. The simplest solution probably involves uploading.
Thanks again, Jacob. I don’t have time to reply to all of this, but let me reply to one part:
Once one acknowledges that the bit exact ‘best’ solution either does not exist or cannot be found, then there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
I don’t buy this. Consider the “expert systems” of the seventies, which used curated databases of logical sentences and reasoned from those using a whole lot of ad-hoc rules. They could just as easily have said “Well we need to build systems that deal with lots of special cases, and you can never be certain about the world. We cannot get exact solutions, and so we are doomed to the zone of heuristics and tradeoffs where the only interesting solutions are too complex to analyze formally.” But they would have been wrong. There were tools and concepts and data structures that they were missing. Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
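For concreteness, a toy example of the sort of tool I mean (a three-node Bayesian network with inference by brute-force enumeration; all numbers are invented):

```python
# A three-node Bayesian network: Disease -> Symptom, Disease -> TestResult.
# All numbers are invented for illustration.
P_disease = {True: 0.01, False: 0.99}
P_symptom = {True: {True: 0.9, False: 0.1},    # P(symptom | disease)
             False: {True: 0.2, False: 0.8}}
P_test = {True: {True: 0.95, False: 0.05},     # P(test positive | disease)
          False: {True: 0.1, False: 0.9}}

def joint(disease, symptom, test):
    return P_disease[disease] * P_symptom[disease][symptom] * P_test[disease][test]

def posterior_disease(symptom, test):
    # P(disease | evidence) by enumerating the joint distribution --
    # a principled replacement for hand-written rules of the form
    # "if symptom and positive test then probably disease".
    num = joint(True, symptom, test)
    den = sum(joint(d, symptom, test) for d in (True, False))
    return num / den

print(posterior_disease(symptom=True, test=True))   # ~0.30
```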
So I don’t buy that “we can’t get exact solutions” implies “we’re consigned to complex heuristics.” People were using complicated ad-hoc rules to approximate logic, and then later they were using complex heuristics to approximate Bayesian inference, and this was progress.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Imagine encountering a programmer from the future who knows how to program an AGI and asking them “How do you do that whole multi-level world-modeling thing? Can you show me the algorithm?” I strongly expect that they’d say something along the lines of “oh, well, you set up a system like this and then have it take percepts like that, and then you can see how if we run this for a while on lots of data it starts building multi-level descriptions of the universe. Here, let me walk you through what it looks like for the system to discover general relativity.”
Since I don’t know of a way to set up a system such that it would knowably and reliably start modeling the universe in this sense, I suspect that we’re missing some tools.
I’m not sure whether your view is of the form “actually the programmer of the future would say ‘I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time’”, or whether it’s of the form “actually we do know how to set up that system already”, or whether it’s something else entirely. But if it’s the second one, then please tell! :-)
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
I thought you hired Jessica for exactly that. I have these slides and everything that I was so sad I wouldn’t get to show you because you’d know all about probabilistic programming after hiring Jessica.
Thanks for the clarifications—I’ll make this short.
Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
Probabilistic graphical models were definitely a key theoretical development, but they hardly swept the field of expert systems. From what I remember, in terms of practical applications, they immediately replaced or supplemented expert systems in only a few domains—such as medical diagnostic systems. Complex ad hoc expert systems continued to dominate unchallenged in most fields for decades: robotics, computer vision, speech recognition, game AI, fighter jets, and so on: basically everything important. As far as I am aware the current ANN revolution is truly unique in that it is finally replacing expert systems across most of the board—although there are still holdouts (as far as I know most robotic controllers are still expert systems, as are fighter jets, and most Go AI systems).
The ANN solutions are more complex than the manually crafted expert systems they replace—but the complexity is automatically generated. The code the developers actually need to implement and manage is vastly simpler—this is the great power and promise of machine learning.
Here is a simple general truth—the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there is an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Well, that would be interesting.
I’m not sure whether your view is of the form “actually the programmer of the future would say ‘I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time’”, or whether it’s of the form “actually we do know how to set up that system [multi-level model] already”, or whether it’s something else entirely. But if it’s the second one, then by all means, please tell :-)
Anyone who has spent serious time working in graphics has also spent serious time thinking about how to create the matrix—if given enough computing power. If you took, say, a thousand of the brightest engineers in different simulation-related fields, from physics to graphics, and got them all working on a large mega-project with huge funds, it could probably be implemented today. You’d start with a hierarchical/multi-resolution modelling graph—using, say, octrees or kd-trees over voxel cells, and a general set of hierarchical bidirectional inference operators for tracing paths and interactions.
To make it efficient, you need a huge army of local approximation models for different phenomena at different scales—low-level quantum codes just in case, particle-level codes, molecular-bio codes, fluid dynamics, rigid body, etc. It’s a sea of codes, with decision-tree-like logic to decide which models to use where and when.
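A bare-bones sketch of that structure (the node layout and the scale thresholds are arbitrary): an octree over voxel cells in which each node picks a local approximation model by scale and refines into children only where more precision is requested.

```python
class OctreeNode:
    def __init__(self, center, half_size):
        self.center = center          # (x, y, z)
        self.half_size = half_size    # cell extends +/- half_size per axis
        self.children = None          # lazily created 8 children

    def refine(self):
        # Split into 8 children, each covering one octant.
        h = self.half_size / 2
        cx, cy, cz = self.center
        self.children = [
            OctreeNode((cx + dx * h, cy + dy * h, cz + dz * h), h)
            for dx in (-1, 1) for dy in (-1, 1) for dz in (-1, 1)
        ]

    def local_model(self):
        # Decision-tree-like selection of which approximation to run in
        # this cell, keyed on spatial scale (thresholds are arbitrary).
        if self.half_size > 1.0:
            return "rigid_body"
        if self.half_size > 1e-3:
            return "fluid_dynamics"
        if self.half_size > 1e-9:
            return "molecular"
        return "quantum"

    def query(self, required_resolution):
        # Refine only where more precision is requested (output-sensitive).
        if self.half_size <= required_resolution:
            return [(self.center, self.half_size, self.local_model())]
        if self.children is None:
            self.refine()
        return [leaf for c in self.children for leaf in c.query(required_resolution)]

root = OctreeNode((0.0, 0.0, 0.0), 8.0)
# -> 64 leaf cells at half_size 2.0, each choosing the "rigid_body" model.
print(len(root.query(2.0)))
```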
Of course with machine learning we could automatically learn most of those codes—which suddenly makes it more tractable. And then you could use that big engine as your predictive world model, once it was trained.
The problem is that to plan anything worthwhile you need to simulate human minds reasonably well, which means that to be useful the sim engine would basically need to infer copies of everyone’s minds...
And if you can do that, then you already have brain based AGI!
So I expect that the programmer from the future will say—yes, at the low level we use various brain-like neural nets, and various non-brain-like neural nets or learned virtual circuits, some operating over explicit space-time graphs. In all cases we have pretty detailed knowledge of what the circuits are doing—here, take a look at that last goal update that just propagated in your left anterior prefrontal cortex...
While the methods for finding a solution to a well-formed problem currently used in Machine Learning are relatively well understood, the solutions found are not.
And that is what really matters from a safety perspective. We can and do make some headway in understanding the solutions, as well, but the trend is towards more autonomy for the learning algorithm, and correspondingly more opaqueness.
As you mentioned, the solutions found are extremely complex. So I don’t think it makes sense to view them only in terms of approximations to some conceptually simple (but expensive) ideal solution.
If we want to understand their behaviour, which is what actually matters for safety, we will have to grapple with this complexity somehow.
Personally, I’m not optimistic about experimentation (as it is currently practiced in the ML community) being a good enough solution. There is, at least, the problem of the treacherous turn. If we’re lucky, the AI jumps the gun, and society wakes up to the possibility of an AI trying to take over. If we’re unlucky, we don’t get any warning, and the AI only behaves for long enough to gain our trust and discover a nearly fail-proof strategy. VR could help here, but I think it’s rather far from a complete solution.
1) The divide between your so-called “old CS” and “new CS” is more of a divide (or perhaps a continuum) between engineers and theorists. The former is concerned with on-the-ground systems, where quadratic-time algorithms are costly and statistics is the better weapon for dealing with real-world complexities. The latter is concerned with abstracted models where polynomial time is good enough and logical deduction is the only tool. These models will probably never be applied literally by engineers, but they provide human understanding of engineering problems, and because of their generality, they will last longer. The idea of a Turing machine will last centuries if not millennia, but a Pascal programmer might not find a job today and a Python programmer might not find a job in 20 years. Machine learning techniques constantly come in and out of vogue, but something like the PAC model will be here to stay for a long time. But of course at the end of the day it’s engineers who realize new inventions and technologies.
Theorists’ ideas can transform an entire engineering field, and engineering problems inspire new theories. We need both types of people (or rather, people across the spectrum from engineers to theorists).
2) With neural networks increasing in complexity, making the learning converge is no longer as simple as just running gradient descent. In particular, something like a K-12 curriculum will probably emerge to guide the AGI past local optima. For example, the recent paper on neural Turing machines already employed curriculum learning, as the authors couldn’t get good performance otherwise (see the schematic sketch at the end of this comment). So there is a nontrivial maintenance cost (in designing a curriculum) to a neural network so that it adapts to a changing environment, which will not lessen if we don’t better our understanding of it.
Of course expert systems also have maintenance costs, of a different type. But my point is that neural networks are not free lunches.
3) What caused the AI winter was that AI researchers didn’t realize how difficult it was to do what seems so natural to us—motion, language, vision, etc. They were overly optimistic because they succeeded at what was difficult for humans—chess, math, etc. I think it’s fair to say that ANNs have “swept the board” in the former category, the category of lower-level functions (machine translation, machine vision, etc), but the high-level stuff is still predominantly logical systems (formal verification, operations research, knowledge representation, etc). It’s unfortunate that the neural camp and the logical camp don’t interact too much, but I think it is a major objective to combine the flexibility of neural systems with the power and precision of logical systems.
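To make point 2) concrete, here is the rough shape of that maintenance cost, as a schematic curriculum loop; the copy task, the toy “model”, and the thresholds are stand-ins of my own, not the actual neural-Turing-machine training code:

```python
import random

random.seed(0)

# Schematic curriculum loop: the learner only graduates to longer copy
# sequences once it performs well on the current length. The "model" and
# "accuracy" functions are stubs standing in for a real network and its
# evaluation.

def make_batch(seq_len, batch_size=32):
    return [[random.randint(0, 1) for _ in range(seq_len)] for _ in range(batch_size)]

def train_step(model, batch):
    # Stand-in for a real gradient step; here the "model" just records
    # how much practice it has had at each sequence length.
    for seq in batch:
        model[len(seq)] = model.get(len(seq), 0) + 1

def accuracy(model, seq_len):
    # Stand-in evaluation: more practice at a length -> higher accuracy.
    return min(1.0, model.get(seq_len, 0) / 100.0)

model = {}
curriculum = [2, 4, 8, 16, 32]          # copy-task sequence lengths
for seq_len in curriculum:
    while accuracy(model, seq_len) < 0.95:
        train_step(model, make_batch(seq_len))
    # Only now move on to the harder lesson -- this schedule is itself a
    # design artifact someone has to build and maintain.
print(model)
```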
Here is a simple general truth—the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there is an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
Schmidhuber invented something called the speed prior that weighs an algorithm according to how fast it generates the observation, rather than how simple it is. He makes some ridiculous claims about our (physical) universe assuming the speed prior. Ostensibly one can also weigh in accuracy of approximation in there to produce another variant of prior. (But of course all of these will lose the universality enjoyed by the Occam prior)
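To contrast the two weightings, a cartoon (this is only a caricature; Schmidhuber’s speed prior is actually defined via the FAST algorithm, not by this formula):

```python
# Toy program hypotheses: (description length in bits, runtime in steps
# to reproduce the observation). Both columns are invented.
programs = {
    "short_but_slow": (10, 10**9),
    "longer_but_fast": (20, 10**3),
}

def occam_weight(length_bits, _runtime):
    return 2.0 ** -length_bits

def speedprior_like_weight(length_bits, runtime):
    # Cartoon of a runtime-discounted prior: penalize slow programs.
    return 2.0 ** -length_bits / runtime

for name, (l, t) in programs.items():
    print(name, occam_weight(l, t), speedprior_like_weight(l, t))
# Under Occam, the short program dominates; under the runtime-discounted
# weighting, the fast program does.
```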
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence.
There’s a big difference between the hopelessly empirical school of machine learning, in which things are shown in experiments and then accepted as true, and real empirical science, in which we show things in small-scale experiments to build theories of how the systems in question behave in the large scale.
You can’t actually get away without any theorizing, on the basis of “Oh well, it seems to work. Ship it.” That’s actually bad engineering, although it’s more commonly accepted in engineering than in science. In a real science, you look for the laws that underlie your experimental results, or at least causally robust trends.
If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes. You experiment, you theorize, you find out why it’s efficient, and then you strip that of its evolutionarily coincidental trappings and make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes.
The Wright brothers copied both wings for lift and wing-warping for 3D control from birds. Only the forward propulsion was different.
make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
We already have that—it’s called a computer. AGI is much more specific and anthropocentric because it is relative to our specific society/culture/economy. It requires predicting and modelling human minds—and the structure of efficient software that can predict a human mind is itself a human mind.
“the structure of efficient software that can predict a human mind is itself a human mind.”—I doubt that. Why do you think this is the case? I think there are already many examples where simple statistical models (e.g. linear regression) can do a better job of predicting some things about a human than an expert human can.
“Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
So, arguably that should include environments with humans in them. But to succeed, an AI would not necessarily have to predict or model human minds; it could instead, e.g. kill all humans, and/or create safeguards that would prevent its own destruction by any existing technology.
A computer is a bicycle for the mind. Logic is purified thought, computers are logic engines. General intelligence can be implemented by a computer, but it is much more anthrospecific.
With respect, no, it’s just thought with all the interesting bits cut away to leave something so stripped-down it’s completely deterministic.
computers are logic engines
Sorta-kinda. They’re also arithmetic engines, floating-point engines, recording engines. They can be made into probability engines, which is the beginnings of how you implement intelligence on a computer.
It appears you are making the problem unnecessarily difficult.
No, not really. In fact, I expect that given the right way of modelling, formal verification of learning systems up to epsilon-delta bounds (in the style of PAC-learning, for instance) should be quite doable. Why? Because, as mentioned regarding PAC learning, it’s the existing foundation for machine learning.
I do agree that this post reflects an “Old Computer Science” worldview, but to be fair, that’s not Nate’s personal fault, or MIRI’s organizational fault. It’s the fault of the entire subfield of AGI that still has not bloody learned the basic lessons of statistical machine learning: that real cognition just is about probably approximately correct statistical modelling.
So as you mention, for instance, there’s an immense amount of foundational theory behind modern neural networks. Hell, if I could find the paper showing that deep networks form a “funnel” in the model’s free-energy landscape—where local minima are concentrated in that funnel and all yield more-or-less as-good test error, while the global minimum reliably overfits—I’d be posting the link myself.
The problem with deep neural networks is not that they lack theoretical foundations. It’s that most of the people going “WOW SO COOL” at deep neural networks can’t be bothered to understand the theoretical foundations. The “deep learning cabal” of researchers (out of Toronto, IIRC), and the Switzerland Cabal of Schmidhuber-Hutter-and-Legg fame, all know damn well what they are doing on an analytical level.
(And to cheer for my favorite approach, the probabilistic programming cabal has even more analytical backing, since they can throw Bayesian statistics, traditional machine learning, and programming-languages theory at their problems.)
Sure, it does all require an unusual breadth of background knowledge, but hey, this is how real science proceeds, people: shut up and read the textbooks and literature. Sorry, but if we (as in, this community) go around claiming that important problems can be tackled without background knowledge and active literature, or with as little as the “AGI” field seems to generate, then we are not being instrumentally rational. Period. Shut up and PhD.
Why not test safety long before the system is superintelligent?
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
This again reflects the old ‘hard’ computer science worldview, and obsession with exact solutions.
If it seems really really really impossibly hard to solve a problem even with the ‘simplification’ of lots of computing power, perhaps the underlying assumptions are wrong. For example—perhaps using lots and lots of computing power makes the problem harder instead of easier.
You’re not really being fair to Nate here, but let’s be charitable to you: this is fundamentally a dispute between the heuristics-and-biases school of thought about cognition and the bounded/resource-rational school of thought.
In the heuristics-and-biases school of thought, the human mind uses heuristics or biases when it believes it doesn’t have the computing power on hand to use generally intelligent inference, or sometimes the general intelligence is even construed as an emergent computational behavior of an array of heuristics and biases that happened to get thrown together by evolution in the right way. Computationally, this is saying, “When we have enough resources that only asymptotic complexity matters, we use the Old Computer Science way of just running the damn algorithm that implements optimal behavior and optimal asymptotic complexity.” Trying to extend this approach into statistical inference gets you basic Bayesianism and AIXI, which appear to have nice “optimality” guarantees, but are computationally intractable and are only optimal up to the training data you give them.
In terms of LW posts, the former alternative is embodied in Eliezer’s Sequences, and the latter in jacob_cannell’s post on The Brain as a Universal Learning Machine and my book review of Plato’s Camera.
The kinds of steps needed to get both “AI” as such, and “Friendliness” as such, are substantively different in the “possible worlds” where the two different schools of thought apply. Or, perhaps, both are true in certain ways, and what we’re really talking about is just two different ways of building minds. Personally, I think the one true distinction is that Calude’s work on measuring nonhalting computations gives us a definitive way to deal with the kinds of self-reference scenarios that Old AGI’s “any finite computation” approach generates paradoxes in.
But time will tell and I am not a PhD, so everything I say should be taken with substantial sprinklings of salt. On the other hand, to wit, while you shouldn’t think for a second that I am one of them, I am certainly on the side of the PhDs.
(Nate: sorry for squabbling on your post. All these sorts of qualms with the research program were things I was going to bring up in person, in a much more constructive way. Still looking forward to meeting you in September!)
The problem with deep neural networks is not that they lack theoretical foundations. It’s that most of the people going “WOW SO COOL” at deep neural networks can’t be bothered to understand the theoretical foundations. The “deep learning cabal” of researchers (out of Toronto, IIRC), and the Switzerland Cabal of Schmidhuber-Hutter-and-Legg fame, all know damn well what they are doing on an analytical level.
This isn’t really a problem, because—as you point out—the formidable researchers all “know damn well what they are doing on an analytical level”.
Thus the argument that there are people using DL without understanding it—and moreover that this is dangerous—is specious and weak because these people are not the ones actually likely to develop AGI let alone superintelligence.
Why not test safety long before the system is superintelligent?
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
Ah—the use of guarantees betrays the viewpoint problem. Instead of thinking of ‘safety’ or ‘alignment’ as some absolute binary property we can guarantee, it is more profitable to think of a complex distribution over the relative amounts of ‘safety’ or ‘alignment’ in an AI population (and any realistic AI project will necessarily involve a population due to scaling constraints). Strong guarantees may be impossible, but we can at least influence or steer the distribution by selecting for agent types that are more safe/altruistic. We can develop a scaling theory of if, how, and when these desirable properties change as agents grow in capability.
In other words—these issues are so incredibly complex that we can’t really develop any good kind of theory without a lot of experimental data to back it up.
Also—I should point out that one potential likely result of ANN based AGI is the creation of partial uploads through imitation and reverse reinforcement learning—agents which are intentionally close in mindspace to their human ‘parent’ or ‘model’.
Thus the argument that there are people using DL without understanding it—and moreover that this is dangerous—is specious and weak because these people are not the ones actually likely to develop AGI let alone superintelligence.
Yes, but I don’t think that’s an argument anyone has actually made. Nobody, to my knowledge, sincerely believes that we are right around the corner from superintelligent, self-improving AGI built out of deep neural networks, such that any old machine-learning professor experimenting with how to get a lower error rate in classification tasks is going to suddenly get the Earth covered in paper-clips.
Actually, no, I can think of one person who believed that: a radically underinformed layperson on reddit who, for some strange reason, believed that LessWrong is the only site with people doing “real AI” and that “[machine-learning researchers] build optimizers! They’ll destroy us all!”
Hopefully he was messing with me. Nobody else has ever made such ridiculous claims.
Sorry, wait, I’m forgetting to count sensationalistic journalists as people again. But that’s normal.
Instead of thinking of ‘safety’ or ‘alignment’ as some absolute binary property we can guarantee, it is more profitable to think of a complex distribution over the relative amounts of ‘safety’ or ‘alignment’ in an AI population
No, “guarantees” in this context meant PAC-style guarantees: “We guarantee that, with probability at least 1−δ, the system will ‘go wrong’ relative to what its sample data taught it at most an ε fraction of the time.” You then need to plug in the epsilons and deltas you want and solve for how much sample data you need to feed the learner. The links for intro PAC lectures in the other comment given to you were quite good, by the way, although I do recommend taking a rigorous introductory machine learning class (new grad-student level should be enough to inflict the PAC foundations on you).
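For reference, the textbook finite-hypothesis-class version of that guarantee says that m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice for a consistent learner to have error at most ε with probability at least 1−δ:

```python
import math

def pac_sample_size(hypothesis_space_size, epsilon, delta):
    """Classic PAC bound for a finite hypothesis class and a consistent
    learner: m >= (1/epsilon) * (ln|H| + ln(1/delta)) examples suffice for
    error <= epsilon with probability >= 1 - delta."""
    return math.ceil((math.log(hypothesis_space_size) + math.log(1 / delta)) / epsilon)

# E.g. a hypothesis class of 2^30 boolean circuits, 1% error, 99.9% confidence:
print(pac_sample_size(2 ** 30, epsilon=0.01, delta=0.001))   # ~2,771 examples
```

For infinite hypothesis classes the ln|H| term gets replaced by a VC-dimension term, but the shape of the guarantee is the same.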
we can at least influence or steer the distribution by selecting for agent types that are more safe/altruistic
“Altruistic” is already a social behavior, requiring the agent to have a theory of mind and care about the minds it believes it observes in its environment. It also assumes that we can build in some way to learn what the hypothesized minds want, learn how they (ie: human beings) think, and separate the map (of other minds) from the territory (of actual people).
Note that “don’t disturb this system over there (eg: a human being) because you need to receive data from it untainted by your own causal intervention in any way” is a constraint that at least I, personally, do not know how to state in computational terms.
I think you are overhyping the PAC model. It surely is an important foundation for probabilistic guarantees in machine learning, but there are some serious limitations when you want to use it to constrain something like an AGI:
It only deals with supervised learning
Simple things like finite automata are not efficiently learnable in the PAC model, but in practice it seems like humans pick them up fairly easily.
It doesn’t deal with temporal aspects of learning.
However, there are some modifications of the PAC model that can ameliorate these problems, like learning with membership queries (item 2).
It’s also perhaps a bit optimistic to say that PAC-style bounds on a possibly very complex system like an AGI would be “quite doable”. We don’t even know, for example, whether DNF is learnable in polynomial time under the distribution free assumption.
I would definitely call it an open research problem to provide PAC-style bounds for more complicated hypothesis spaces and learning settings. But that doesn’t mean it’s impossible or un-doable, just that it’s an open research problem. I want a limitary theorem proved before I go calling things impossible.
In fact, I expect that given the right way of modelling, formal verification of learning systems up to epsilon-delta bounds (in the style of PAC-learning, for instance) should be quite doable. Why?
Dropping the ‘formal verification’ part and replacing it with approximate error bound/variance reduction, this is potentially interesting—although it also seems to be a general technique that would—if it worked well—be useful for practical training, safety aside.
Why? Because, as mentioned regarding PAC learning, it’s the existing foundation for machine learning.
Machine learning is an eclectic field with many mostly independent ‘foundations’ - Bayesian statistics of course, optimization methods (Hessian-free, natural gradient, etc.), geometric methods and NLDR, statistical physics…
That being said—I’m not very familiar with the PAC learning literature yet—do you have a link to a good intro/summary/review?
Hell, if I could find the paper showing that deep networks form a “funnel” in the model’s free-energy landscape—where local minima are concentrated in that funnel and all yield more-or-less as-good test error, while the global minimum reliably overfits—I’d be posting the link myself.
That sounds kind of like the saddle point paper. It’s easy to show that in complex networks there are a large number of equivalent minima due to various symmetries and redundancies. Thus finding the actual technical ‘global optimum’ quickly becomes suboptimal when you discount for resource costs.
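A concrete instance of those symmetries (a toy two-layer net in numpy): permuting the hidden units, together with the matching permutation of the output weights, leaves the network’s function, and hence its loss, exactly unchanged, so every minimum comes with a whole orbit of equivalent ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network: y = W2 @ tanh(W1 @ x)
W1 = rng.normal(size=(5, 3))   # hidden x input
W2 = rng.normal(size=(1, 5))   # output x hidden
x = rng.normal(size=(3,))

def forward(W1, W2, x):
    return W2 @ np.tanh(W1 @ x)

# Permute the hidden units and permute the output weights to match.
perm = rng.permutation(5)
W1_p = W1[perm, :]
W2_p = W2[:, perm]

# Same function, therefore the same training loss: every set of weights
# sits in an orbit of equivalent "minima" generated by such symmetries.
print(np.allclose(forward(W1, W2, x), forward(W1_p, W2_p, x)))   # True
```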
If it seems really really really impossibly hard to solve a problem even with the ‘simplification’ of lots of computing power, perhaps the underlying assumptions are wrong. For example—perhaps using lots and lots of computing power makes the problem harder instead of easier.
You’re not really being fair to Nate here, but let’s be charitable to you: this is fundamentally a dispute between the heuristics-and-biases school of thought about cognition and the bounded/resource-rational school of thought.
Yes that is the source of disagreement, but how am I not being fair? I said ‘perhaps’ - as in have you considered this? Not ‘here is why you are certainly wrong’.
Computationally, this is saying, “When we have enough resources that only asymptotic complexity matters, we use the Old Computer Science way of just running the damn algorithm that implements optimal behavior and optimal asymptotic complexity.” Trying to extend this approach into statistical inference gets you basic Bayesianism and AIXI, which appear to have nice “optimality” guarantees, but are computationally intractable and are only optimal up to the training data you give them.
Solomonoff/AIXI and more generally ‘full Bayesianism’ is useful as a thought model, but is perhaps overvalued on this site compared to the machine learning field. Compare the number of references/hits to AIXI on this site (tons) to the number on r/MachineLearning (1!). Compare the number of references for AIXI papers (~100) to other ML papers and you will see that the ML community sees AIXI and related work as minor.
The important question is: what does the optimal practical approximation of Solomonoff/Bayesian inference look like? And how different is that from what the brain does? By optimal I of course mean optimal in terms of all that really matters, which is intelligence per unit of resources.
Human intelligence, including that of Turing or Einstein, only requires around 10 watts of energy and, more surprisingly, only around 10^14 switches/second or less—which is basically miraculous. A modern GPU uses more than 10^18 switches/second. You’d have to go back to a Pentium or something to get down to 10^14 switches per second. Of course the difference is that switch events in an ANN are much more powerful because they are more like memory ops, but still.
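The arithmetic behind those numbers, using the same rough order-of-magnitude estimates as above (synapse count, average firing rate, and the GPU figures are all coarse assumptions):

```python
# Rough order-of-magnitude estimates, as used in the text above.
synapses = 1e14                 # ~10^14 synapses
avg_firing_rate_hz = 1          # ~1 Hz average spike rate
brain_switch_events = synapses * avg_firing_rate_hz          # ~1e14 /s
brain_power_w = 10              # ~10 W

gpu_transistors = 1e9           # order of 10^9 switching transistors
gpu_clock_hz = 1e9              # order of 1 GHz
gpu_switch_events = gpu_transistors * gpu_clock_hz           # ~1e18 /s

print(f"brain: ~{brain_switch_events:.0e} events/s on ~{brain_power_w} W")
print(f"GPU:   ~{gpu_switch_events:.0e} events/s")
print(f"ratio: ~{gpu_switch_events / brain_switch_events:.0e}x")
```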
It is really really hard to make any sort of case that actual computer tech is going to become significantly more efficient than the brain anytime in the near future (at least in terms of switch events/second). There is a very strong case that all the H&B stuff is just what actual practical intelligence looks like. There is no such thing as intelligence that is not resource efficient—or alternatively we could say that any useful definition of intelligence must be resource normalized (ie utility/cost).
I’m not sure what you’re looking for in terms of the PAC-learning summary, but for a quick intro, there’s this set of slides or these two lecture notes from Scott Aaronson. For a more detailed review of the literature across the whole field up until the mid-1990s, there’s this paper by David Haussler, though given its length you might as well read Kearns and Vazirani’s 1994 textbook on the subject. I haven’t been able to find a more recent review of the literature though—if anyone had a link that’d be great.
Human intelligence, including that of Turing or Einstein, only requires around 10 watts of energy and, more surprisingly, only around 10^14 switches/second or less—which is basically miraculous. A modern GPU uses more than 10^18 switches/second. You’d have to go back to a Pentium or something to get down to 10^14 switches per second. Of course the difference is that switch events in an ANN are much more powerful because they are more like memory ops, but still.
It’s not that amazing when you understand PAC-learning or Markov processes well. A natively probabilistic (analogously: “natively neuromorphic”) computer can actually afford to sacrifice precision “cheaply”, in the sense that sizeable sacrifices of hardware precision actually entail fairly small injections of entropy into the distribution being modelled. Since what costs all that energy in modern computers is precision, that is, exactitude, a machine that simply expects to get things a little wrong all the time can still actually perform well, provided it is performing a fundamentally statistical task in the first place—which a mind is!
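A quick numerical check of that claim (the distribution and the bit width are arbitrary): rounding each probability to 8 bits of mantissa perturbs the modelled distribution by only a tiny KL divergence.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random categorical distribution over 1000 outcomes.
p = rng.random(1000)
p /= p.sum()

def quantize(p, bits=8):
    """Keep only `bits` bits of mantissa per probability -- a crude model
    of a low-precision, 'natively probabilistic' hardware representation."""
    m, e = np.frexp(p)
    m = np.round(m * 2 ** bits) / 2 ** bits
    q = np.ldexp(m, e)
    return q / q.sum()

q = quantize(p, bits=8)
kl = np.sum(p * np.log(p / q))
# Prints a tiny number (roughly on the order of 1e-6 nats): a large cut in
# precision injects only a small amount of entropy into the modelled
# distribution.
print(kl)
```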
Eli, this doesn’t make sense—the fact that digital logic switches are higher precision and more powerful, and thus have a higher minimum energy cost, makes the brain/mind more impressive, not less.
The energy efficiency per op in the brain is rather poor in one sense—perhaps 10^5 times larger than the minimum imposed by physics for a low-SNR analog op, but essentially all of this cost is wire energy.
The miraculous thing is how much intelligence the brain/mind achieves for such a tiny amount of computation in terms of low level equivalent bit ops/second. It suggests that brain-like ANNs will absolutely dominate the long term future of AI.
Eli, this doesn’t make sense—the fact that digital logic switches are higher precision and more powerful, and thus have a higher minimum energy cost, makes the brain/mind more impressive, not less.
Nuh-uh :-p. The issue is that the brain’s calculations are probabilistic. When doing probabilistic calculations, you can either use very, very precise representations of computable real numbers to represent the probabilities, or you can use various lower-precision but natively stochastic representations, whose distribution over computation outcomes is the distribution being inferred.
Hence why the brain is, on the one hand, very impressive for extracting inferential power from energy and mass, but on the other hand, “not that amazing” in the sense that it, too, begins to add up to normality once you learn a little about how it works.
When doing probabilistic calculations, you can either use very, very precise representations of computable real numbers to represent the probabilities, or you can use various lower-precision but natively stochastic representations, whose distribution over computation outcomes is the distribution being inferred.
Of course—and using say a flop to implement a low precision synaptic op is inefficient by six orders of magnitude or so—but this just strengthens my point. Neuromorphic brain-like AGI thus has huge potential performance improvement to look forward to, even without Moore’s Law.
Neuromorphic brain-like AGI thus has huge potential performance improvement to look forward to, even without Moore’s Law.
Yes, if you could but dissolve your concept of “brain-like”/”neuromorphic” into actual principles about what calculations different neural nets embody.
Human intelligence, including that of Turing or Einstein, only requires around 10 watts of energy and, more surprisingly, only around 10^14 switches/second or less—which is basically miraculous. A modern GPU uses more than 10^18 switches/second.
I don’t think that “switches” per second is a relevant metric here. The computation performed by a single neuron in a single firing cycle is much more complex than the computation performed by a logic gate in a single switching cycle.
The amount of computational power required to simulate a human brain in real time is estimated in the petaflops range. Only the largest supercomputers operate in that range, certainly not common GPUs.
You misunderstood me—the biological switch events I was referring to are synaptic ops, and they are comparable to transistor/gate switch ops in terms of minimum fundamental energy cost in a Landauer analysis.
The amount of computational power required to simulate a human brain in real time is estimated in the petaflops range.
That is a tad too high, the more accurate figure is 10^14 ops/second (10^14 synapses * avg 1 hz spike rate). The minimal computation required to simulate a single GPU in real time is 10,000 times higher.
That is a tad too high, the more accurate figure is 10^14 ops/second (10^14 synapses * avg 1 hz spike rate).
I’ve seen various people give estimates in the order of 10^16 flops by considering the maximum firing rate of a typical neuron (~10^2 Hz) rather than the average firing rate, as you do.
On one hand, a neuron must do some computation whether it fires or not, and a “naive” simulation would necessarily use a cycle frequency of the order of 10^2 Hz or more; on the other hand, if the result of a computation is almost always “do not fire”, then as a random variable the result has little information entropy, and this may perhaps be exploited to optimize the computation. I don’t have a strong intuition about this.
The minimal computation required to simulate a single GPU in real time is 10,000 times higher.
On a traditional CPU perhaps, on another GPU I don’t think so.
This approach yields increasingly fruitful analyses of such cognitive activities as causal learning, concept learning, and planning in arbitrary environments as probabilistic inference enriched with causal/logical structure.
It’s not obvious to me that the Church programming language and execution model is based on bounded rationality theory.
I mean, the idea of using MCMC to sample the executions of probabilistic programs is certainly neat, and you can trade off bias with computing time by varying the burn-in and samples lag parameters, but this trade-off is not provably optimal.
If I understand correctly, provably optimal bounded rationality is marred by unsolved theoretical questions such as the one-way functions conjecture and P != NP. Even assuming that these conjectures are true, the fact that we can’t prove them implies that we can’t often prove anything interesting about the optimality of many AI algorithms.
It’s not obvious to me that the Church programming language and execution model is based on bounded rationality theory.
That’s because it’s not. The probabilistic models of cognition (title drop!) implemented using Church tend to deal with what the authors call the resource-rational school of thought about cognition.
If I understand correctly, provably optimal bounded rationality is marred by unsolved theoretical questions such as the one-way functions conjecture and P != NP.
The paper about it that I read was actually using statistical thermodynamics to form its theory of bounded-optimal inference. These conjectures are irrelevant, in that we would be building reasoning systems that would make use of their own knowledge about these facts, such as it might be.
The concern that ML has no solid theoretical foundations reflects the old computer science worldview, which is all based on finding bit exact solutions to problems within vague asymptotic resource constraints.
It is an error to confuse the “exact / approximate” axis with the “theoretical / empirical” axis. There is plenty of theoretical work in complexity theory on approximation algorithms.
A good ML researcher absolutely needs a good idea of what is going on under the hood—at least at a sufficient level of abstraction.
There is a difference between “having an idea” and “solid theoretical foundations”. Chemists before quantum mechanics had lots of ideas. But they didn’t have a solid theoretical foundation.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
Because this process is not guaranteed to yield good results. Evolution did the exact same thing to create humans, optimizing for genetic fitness. And humans still went and invented condoms.
So it may actually be easier to drop the traditional computer science approach completely.
When the entire future of mankind is at stake, you don’t drop approaches because it may be easier. You try every goddamn approach you have (unless “trying” is dangerous in itself of course).
There is a difference between “having an idea” and “solid theoretical foundations”. Chemists before quantum mechanics had lots of ideas. But they didn’t have a solid theoretical foundation.
That’s a bad example. You are essentially asking researchers to predict what they will discover 50 years down the road. A more appropriate example is a person thinking he has medical expertise after reading bodybuilding and nutrition blogs on the internet, vs a person who has gone through medical school and is an MD.
I’m not asking researchers to predict what they will discover. There are different mindsets of research. One mindset is looking for heuristics that maximize short term progress on problems of direct practical relevance. Another mindset is looking for a rigorously defined overarching theory. MIRI is using the latter mindset while most other AI researchers are much closer to the former mindset.
Evolution did the exact same thing to create humans, optimizing for genetic fitness. And humans still went and invented condoms.
Though humans are the most populous species of large animal on the planet.
Condoms were invented because evolution, being a blind watchmaker, forgot to make sex drive tunable with child mortality, hence humans found a loophole. But whatever function humans are collectively optimizing, it still closely resembles genetic fitness.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
One problem is what Bostrom would call “the treacherous turn.” When the AGI is dependent on us, satisfying us is a very good idea for it—if it’s unsatisfactory it will be deleted. Behaving nicely is so good an idea that many different goal systems will independently choose this strategy. And so the fact that an AGI appears nice is only weak statistical evidence that it would be nice if it wasn’t dependent on us, and further trials are not independent and so don’t accumulate well. This type of problem appears when the AGI develops good enough long-term planning, and has information about its creators.
Another problem is the problem of expanding action spaces. Consider an AGI that wants to gather lots of stamps (example shamelessly stolen from a Computerphile video). When the AGI is childlike, its effective action space only looks like spending money to purchase stamps. As it becomes as smart as a human its actions expand—now it might perform a job to make money to buy stamps, or try to steal money to buy stamps, or purchase a printing press to make its own stamps, or all the sorts of things you might do if you really wanted stamps. Then, as it becomes superintelligent, the stamp-gathering robot will proceed to take over the world and try to terraform the entire earth into stamps. This is a problem for using experimental evidence because as the set of actions expands, so do the possible preferences over actions. Which means that there are many possible sets of preferences that might lead to altruistic behavior among weak AIs; there’s some un-eliminable error when trying to predict “many-options” behavior just from “few-options” behavior.
One problem is what Bostrom would call “the treacherous turn.” When the AGI is dependent on us, satisfying us is a very good idea for it . . . This type of problem appears when the AGI develops good enough long-term planning, and has information about its creators.
Right—and I think you are enough of an old-timer to know one of my proposals for that particular problem: sim sandboxes where we test AGIs in an oblivious sim. Ideally the AGI is not only unaware of its creators, but is actually an atheist and thus believes there is no creator. This can solve the problem at the fundamental level.
When I proposed this long ago the knee-jerk reaction was—but super magic woo Bayesian SI will automagically hack its way out! Which of course is ridiculous—we control the AI’s knowledge.
Today we also have early experimental confirmation of sorts in the form of the DeepMind Atari agent, which grows up in an Atari world and never becomes aware of its true existential status. Scaling up those techniques into the future, I fully expect sandbox sim testing to remain the norm.
Another problem is the problem of expanding action spaces.
I agree this is a problem in theory, but it is surmountable in practice. You need to test an action space that provides sufficient coverage for the expected lifetime and impact of the agent. This can all be accomplished in comprehensive, well-designed virtual reality environments. These environments are needed anyway for high-speed training, and all successful DL systems already use this in simple form. You can’t time-accelerate the real world.
As a more real-world-relevant example (why is it that people here always use weird examples with staples or paperclips—what’s with the office supplies?), consider a self-driving car agent. The most advanced current open-world games already have highly realistic graphics and physics—you wouldn’t need much more in that department except for more realistic traffic, pedestrian, and police modelling, etc. Agents can learn to drive safely in the environment—many in parallel, and it can all run much faster than real time.
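To make the “many agents, much faster than real time” point concrete, here is a minimal sketch of the kind of training loop I mean, with a toy stand-in for the simulator and a trivially simple policy (none of these names refer to a real library):

```python
import numpy as np

class ToyDrivingSim:
    """Stand-in for a real driving simulator: per-agent state is (position, speed)."""
    def __init__(self, n_agents, seed=0):
        self.rng = np.random.default_rng(seed)
        self.state = self.rng.normal(size=(n_agents, 2))

    def step(self, actions):
        # Trivial dynamics: the action nudges speed, and position integrates speed.
        self.state[:, 1] += 0.1 * actions
        self.state[:, 0] += self.state[:, 1]
        reward = -np.abs(self.state[:, 1] - 1.0)   # penalize deviating from a target speed
        return self.state.copy(), reward

sim = ToyDrivingSim(n_agents=128)    # many agents learn in parallel in the same sim
weights = np.zeros(2)                # a placeholder linear "policy"
obs = sim.state.copy()
for t in range(1000):                # 1000 simulated steps, limited only by compute
    actions = obs @ weights
    obs, rewards = sim.step(actions)
    # ... update `weights` from (obs, rewards) with your favorite RL method
```

The point is only that wall-clock time per simulated second is set by the hardware, not by real-world physics, and that the same rollout machinery batches naturally over many agents.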
Such a sandbox seems fine for self-driving cars, but not so great for superintelligent agents. The sandbox will have limited resources that real-world agents might quickly exceed by acquiring more hardware. It would have to be much, much more realistic than a driving sim if you wanted to use it for general training of an AI that will interact with humans in very diverse ways, research physics, cause large economic disruption, etc. And if the AI itself has no plausible origin in the world, or if you leave other flaws, then sure, it might even figure out that it’s in a simulation, contaminating the experiment.
Sandboxing seems more useful for testing ideas that are well-understood enough to be inspected for success or failure, or tested without needing very good simulation of the real world. Like if you have an AI that is supposed to learn human values by doing futuristic unsupervised discovery of how the world works, and then assigns preference scores to local events by some futuristic procedure involving marked human feedback. This seems totally testable in simulation—you’ll get the wrong preferences, but might test the preference-learning method.
EDIT: It was brought to my attention that a similar sandbox sim testing idea was proposed by Chalmers in 2009 - he calls it a “Leakproof Singularity”.
I’ve trodden this ground enough that I should just do a new writeup with my responses to standard objections.
Such a sandbox seems fine for self-driving cars, but not so great for superintelligent agents.
Practical AGI will first appear as sub-human, animal-level intelligence and then human-level intelligence. Practical AGI will necessarily take the form of populations of agents due to computational constraints (costly memory transactions need to be amortized, parallel scaling constraints, etc). This is true today, where you need to run at least around 100 AI instances on a single GPU at once to get good performance. This will remain true into the future—it’s a hard constraint from the physics of fast hardware.
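The amortization point is really just batching: the network weights get fetched from memory once and reused across every agent instance. A toy sketch with made-up sizes:

```python
import numpy as np

# One shared set of weights (expensive to move), many agent observations (cheap).
obs_dim, hidden, n_agents = 512, 4096, 100
W = np.random.randn(obs_dim, hidden).astype(np.float32)              # read from memory once...
observations = np.random.randn(n_agents, obs_dim).astype(np.float32)

# ...then reused across all 100 agents in a single batched matrix multiply,
# instead of re-reading W separately for each agent.
activations = observations @ W    # shape: (n_agents, hidden)
```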
Superintelligence can imply size (a big civilization), speed, or quality. All of this is under our control. We can test a smaller population in the sandbox, we can run them at manageable speed, and we control their knowledge. As far as we know the Greeks had brains just as powerful as ours, but a population of a million AGIs with 2,000-year-old knowledge is not that dangerous.
Obviously you don’t contain an entire superintelligent AGI civilization in the sandbox (and that would be a waste of resources regardless)! You use the sandbox to test new AGI architectures on smaller populations.
Sandboxing seems more useful for testing ideas that are well-understood enough to be inspected for success or failure, or tested without needing very good simulation of the real world
Computer graphics are advancing rapidly and will be completely revolutionized by machine learning in the decade ahead. Agents that grow up in a matrix will not be able to discern their status as easily as an agent that grew up in our world.
Sandboxing will test entire agent architectures—equivalent to DNA brain blueprints for humans—to determine if samples from those architectures have highly desirable mental properties such as altruism.
We can engineer entire world histories and scenarios to test the AGIs, and this could tie into the future of entertainment.
Remember AGI is going to be more similar to brain emulations than not—think Hansonian scenario but without the need for brain scanning.
Practical AGI will necessarily take the form of populations of agents due to computational constraints (costly memory transactions need to be amortized, parallel scaling constraints, etc). This is true today, where you need to run at least around 100 AI instances on a single GPU at once to get good performance. This will remain true into the future—it’s a hard constraint from the physics of fast hardware.
I don’t know about this, but would be happy to hear more.
Superintelligence can imply size (a big civilization), speed, or quality. All of this is under our control. We can test a smaller population in the sandbox, we can run them at manageable speed, and we control their knowledge.
I don’t think the point is “controlling” these properties, I think the point is drawing conclusions about what an AI will do in the real world. Reduced speed might allow us to run “fast AIs” in simulation and draw conclusions about what they’ll do. Reduced speed might also let us run AI civilizations of large size (though it’s not obvious to me why you’d want such a thing) and draw conclusions about what they’ll do. Reducing the AI’s knowledge seems like a way to make a simulation more computationally tractable and therefore get better predictions about what the AI will do—but it seems like a risky way that can introduce bias into a simulation.
Sandboxing will test entire agent architectures—equivalent to DNA brain blueprints for humans—to determine if samples from those architectures have highly desirable mental properties such as altruism.
My real problem is that I don’t think just testing for altruism (which I assume means altruistic behavior) is remotely good enough. If we could simulate our world out past an AI becoming more powerful than the human race, and select for altruism then, I’d be happy. But I am pretty confident that there will be big problems generalizing from a simulation to reality, if that simulation has both differences and restrictions on possible actions and possible values.
If we’re just testing a self-driving car, we can make a simulation that captures the available actions (both literal outputs and “effective actions” permitted by the dynamics) and has basically the right value function built in from the start. Additionally, self-driving cars generalize well from the model to reality. Suppose you have something unrealistic in the model (say, you have other cars follow set training trajectories rather than reacting to the actions of the car). A realistic self-driving car that does well in the simulation might be bad at some skills like negotiating for space on the road, but it won’t suddenly, say, try to use its tire tracks to spell out letters if you put it into reality with humans.
To put what I think concretely: when exposed to a difference between training and reality, a “dumb, parametric AI” projects reality onto the space it learned in training and just keeps on plugging, making it somewhat insensitive to reality being complicated, and giving us a better idea about how it will generalize. But a “smart AI” doesn’t seem to have this property; it will learn the complications of reality that were omitted in testing, and can act very differently as a result. This goes back to the problem of expanding sets of effective actions.
Even though I come from a somewhat different viewpoint, I was fairly impressed with the case you are presenting. Nonetheless . ..
This is not quite fully accurate. Yes anybody can download a powerful optimizer and use it to train a network that they don’t understand. But those are not the people you need to worry about, that is not where the danger lies.
The concern that ML has no solid theoretical foundations reflects the old computer science worldview, which is all based on finding bit exact solutions to problems within vague asymptotic resource constraints.
Old computer science gave us things like convex optimization, which is nothing interesting at all (it only works well for simple uninteresting problems). Modern AI/ML is much more like computer graphics or simulation, where everything is always an approximation and traditional computer science techniques are mostly useless. There is no ‘right answer’, there are just an endless sea of approximations that have varying utility/cost tradeoffs.
A good ML researcher absolutely needs a good idea of what is going on under the hood—at least at a sufficient level of abstraction. The optimization engine does most of the nitty gritty work—but it is equivalent to the researcher employing an army of engineers and dividing the responsibility up so that each engineer works on a tiny portion of the circuit. To manage the optimizer, the researcher needs a good high level understanding of the process, although not necessarily the details.
Also - we do have some theoretical foundations for DL—bayesian inference for one. Using gradient descent on the joint log PDF is a powerful approximate inference strategy.
It appears you are making the problem unnecessarily difficult.
Why not test safety long before the system is superintelligent? - say when it is a population of 100 child like AGIs. As the population grows larger and more intelligent, the safest designs are propagated and made safer.
This again reflects the old ‘hard’ computer science worldview, and obsession with exact solutions.
If it seems really really really impossibly hard to solve a problem even with the ‘simplification’ of lots of computing power, perhaps the underlying assumptions are wrong. For example—perhaps using lots and lots of computing power makes the problem harder instead of easier.
How could that be? Because with lots and lots of compute power, you are naturally trying to extrapolate the world model far far into the future, where it branches enormously and grows in complexity exponentially. Then when you try to define a reasonable utility/value function over the future world model, it becomes almost impossible because the future world model has exploded exponentially in complexity.
So it may actually be easier to drop the traditional computer science approach completely. Start with a smaller, more limited model that doesn’t explode, and then approximately extrapolate both the world model and the utility/value function together.
This must be possible in principle, because human children learn that way. Realistically there isn’t room in the DNA for a full adult utility/value function, and it wouldn’t work in an infant brain anyway without the world model. But evolution solved this problem approximately, and we can learn from it and make do.
Thanks for the reply, Jacob! You make some good points.
I endorse eli_sennesh’s response to this part :-)
I am not under the impression that there are “exact solutions” available, here. For example, in the case of “building world-models,” you can’t even get “exact” solutions using AIXI (which does Bayesian inference using a simplicity prior in order to guess what the environment looks like; and can never figure it out exactly). And this is in the simplified setting where AIXI is large enough to contain all possible environments! We, by contrast, need to understand algorithms which allow you to build a world model of the world that you’re inside of; exact solutions are clearly off the table (and, as eli_sennesh notes, huge amounts of statistical modeling are on it instead).
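(For readers who haven’t seen it: the “simplicity prior” here is Solomonoff’s. Roughly, glossing over prefix/monotone-machine details, every program p whose output begins with the observed string x contributes weight according to its length:)

```latex
% Solomonoff prior over observation strings x, for a universal machine U;
% "U(p) = x*" means the output of program p begins with x.
M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\ell(p)}
```

AIXI then plans against (a mixture like) this, and even that “exact” object is only approximable from below, never computable outright.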
I would readily accept a statistical-modeling-heavy answer to the question of “but how do you build multi-level world-models from percepts, in principle?”; and indeed, I’d be astonished if you avoided it.
Perhaps you read “we need to know how to do X in principle before we do it in practice” as “we need a perfect algorithm that gives you bit-exact solutions to X”? That’s an understandable reading; my apologies. Let me assure you again that we’re not under the illusion you can get bit-exact solutions to most of the problems we’re working on.
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can’t just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can’t be that.
Perhaps the point where we disagree is that you think these hurdles suggest that figuring out how to do things we can’t yet do in principle is hopeless, whereas I’m under the impression that these shortcomings highlight places where we’re still confused?
Additional computing power might not make the problem literally harder, but the assumption of limitless computing power might direct your attention towards wrong parts of the search space.
For example, I suspect that the whole question about multilevel world-models might be something that arises from conceptualizing intelligence as something like AIXI, which implicitly assumes that there’s only one true model of the world. It can do this because it has infinite computing power and can just replace its high-level representation of the world with one where all high-level predictions are derived from the basic atom-level interactions, something that would be intractable for any real-world system to do. Instead real-world systems will need to flexibly switch between different kinds of models depending on the needs of the situation, and use lower-level models in situations where the extra precision is worth the expense of extra computing time. Furthermore, those lower-level models will have been defined in terms of what furthers the system’s goals, as defined on the higher-levels: it will pay preferential attention to those features of the lower-level model that allow it to further its higher-level goals.
In the AIXI framing, the question of multilevel world-models is “what happens when the AI realizes that the true world model doesn’t contain carbon atoms as an ontological primitive”. In the resource-limited framing, that whole question isn’t even coherent, because the system has no such thing as a single true world-model. Instead the resource-limited version of how to get multilevel world-models to work is something like “how to reliably ensure that the AI will create a set of world models in which the appropriate configuration of subatomic objects in the subatomic model gets mapped to the concept of carbon atoms in the higher-level model, while the AI’s utility function continues to evaluate outcomes in terms of this concept regardless of whether it’s using the lower- or higher-level representation of it”.
As an aside, this reframed version seems like the kind of question that you would need to solve in order to have any kind of AGI in the first place, and one which experimental machine learning work would seem the best suited for, so I’d assume it to get naturally solved by AGI researchers even if they weren’t directly concerned with AI risk.
+1
Ohoho! Well, actually Nate, I personally subscribe to the bounded-rationality school of thinking, and I do actually think this has implications for AI safety. Specifically: as the agent acquires more resources (speed and memory), it can handle larger problems and enlarge its impact on the world, so to make a bounded-rational agent safe, we should, hypothetically, be able to state safety properties explicitly in terms of how much cognitive stuff (philosophically, it all adds up to different ingredients to that magic word “intelligence”) the agent has.
With some kind of framework like that, we’d be able to state and prove safety theorems in the form of, “This design will grow increasingly uncertain about its value function as it grows its cognitive resources, and act more cautiously until receiving more training, and we have some analytic bound telling us exactly how fast this fall-off will happen.” I can even imagine it running along the simple lines of, “As the agent’s model of the world grows more complicated, the entropy/Kolmogorov complexity of that model penalizes hypotheses about the learned value function, thus causing the agent to grow increasingly passive and wait for value training as it learns and grows.”
This requires a framework for normative uncertainty that formalizes acting cautiously when under value-uncertainty, but didn’t someone publish a thesis on that at Oxford a year or two ago?
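To illustrate just the “act more cautiously under value uncertainty” part, here is a toy construction of my own (not the Oxford framework, and not anyone’s actual proposal): keep several candidate value functions and score actions by their worst case, so that greater value uncertainty makes low-impact actions look relatively better.

```python
import numpy as np

def cautious_choice(actions, value_hypotheses):
    """Pick the action with the best worst-case value over candidate value functions."""
    scores = [min(v(a) for v in value_hypotheses) for a in actions]
    return actions[int(np.argmax(scores))]

actions = ["do_nothing", "small_intervention", "large_intervention"]
# Toy candidate value functions: they agree about inaction, disagree about big interventions.
hypotheses = [
    lambda a: {"do_nothing": 0.0, "small_intervention": 1.0, "large_intervention": 5.0}[a],
    lambda a: {"do_nothing": 0.0, "small_intervention": 0.5, "large_intervention": -50.0}[a],
]
print(cautious_choice(actions, hypotheses))
# -> "small_intervention": the large intervention is ruled out by its worst case.
```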
Can I laugh maniacally at least a little bit now?
Well, as jacob_cannell pointed out, feeding more compute-power to a bounded-rational agent ought to make it enlarge its models in terms of theory-depth, theory-preorder-connectedness, variance-explanation, and time-horizon. In short: the branching factors and the hypothesis class get larger, making it harder to learn (if we’re thinking about statistical learning theory).
There’s also the specific issue of assuming Turing-machine-level compute power, assuming that “available compute steps” and “available memory” are unbounded but finite natural numbers. Since you’ve not bounded the number, it’s effectively infinite, which of course means that two agents, each of which is “programmed” as a Turing machine with Turing-machine resources rather than strictly finite resources, can’t reason about each other: either one would need ordinal numbers to think about what the other (or itself) can do, but actually using ordinal numbers in that analysis would be necessarily wrong (in that neither actually possesses a Turing Oracle, which is equivalent to having ω_0 steps of computation).
So you get a bunch of paradox theorems making your job a lot harder.
In contrast, starting from the assumption of having strictly finite computing power is like when E.T. Jaynes starts from the assumption of having finite sample data, finite log-odds, countable hypotheses, etc.: we assume what must necessarily be true in reality to start with, and then analyze the infinite case as passing to the limit of some finite number. Pascal’s Mugging is solvable this way using normal computational Bayesian statistical techniques, for instance, if we assume that we can sample outcomes from our hypothesis distribution.
Ok—then you are moving into the world of heuristics and approximations. Once one acknowledges that the bit-exact ‘best’ solution either does not exist or cannot be found, then there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
Consider the algorithms employed in computer graphics and simulation—which is naturally quite related to the world modelling problems in your maximize diamond example. The best algorithms and techniques employ some reasonably simple principles—such as hierarchical bounded approximations over octrees, or bidirectional path tracing—but a full system is built from a sea of special case approximations customized to particular types of spatio-temporal patterns. Nobody bothers trying to prove that new techniques are better than old, nobody bothers using formal tools to analyze the techniques, because the algorithmic approximation tradeoff surface is far too complex.
In an approximation driven field, new techniques are arrived at through intuitive natural reasoning and are evaluated experimentally. Modern machine learning seems confusing and ad-hoc to mathematicians and traditional computer scientists because it is also an approximation field.
Ok, eli said:
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence. When working in an approximation framework where formal analysis does not work and everything must be proven experimentally—this is simply the best that we can do.
If we could somehow ‘guarantee’ safety that would be nice, but can we guarantee the safety of future human populations?
And now we get into that other issue—if you focus entirely on solving problems with unlimited computation, you avoid thinking about what the final practical resource efficient solutions look like, and you avoid the key question of how resource efficient the brain is. If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
So if AGI is broad enough to include artificial brains or ems—then a friendly AI theory which can provide safety guarantees for AGI in general should be able to provide guarantees for artificial brains—correct? Or is it your view that the theory will be more narrow and will only cover particular types of AGI? If so—what types?
I think those scope questions are key, but I don’t want to come off as a hopeless negative critic—we can’t really experiment with AGI just yet, and we may have limited time for experimentation. So to the extent that theory could lead practice—that would be useful if at all possible.
I hope the context indicated that I was referring to conceptual hardness/difficulty in finding the right algorithm. For example consider the problem of simulating an infinite universe. If you think about the problem first in the case of lots of compute power, it may actually become a red herring. The true solution will involve something like an output sensitive algorithm (asymptotic complexity does not depend at all on the world size) - as in some games—and thus having lots of compute is irrelevant.
I suspect that your maximize diamond across the universe problem is FAI-complete. The hard part is specifying the ‘diamond utility function’, because diamonds are a pattern in the mind that depends on the world model in the mind. The researcher needs to transfer a significant fraction of their world model or mind program into the machine—and if you go to all that trouble then you might as well use a better goal. The simplest solution probably involves uploading.
Thanks again, Jacob. I don’t have time to reply to all of this, but let me reply to one part:
I don’t buy this. Consider the “expert systems” of the seventies, which used curated databases of logical sentences and reasoned from those using a whole lot of ad-hoc rules. They could just as easily have said “Well we need to build systems that deal with lots of special cases, and you can never be certain about the world. We cannot get exact solutions, and so we are doomed to the zone of heuristics and tradeoffs where the only interesting solutions are too complex to analyze formally.” But they would have been wrong. There were tools and concepts and data structures that they were missing. Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
So I don’t buy that “we can’t get exact solutions” implies “we’re consigned to complex heuristics.” People were using complicated ad-hoc rules to approximate logic, and then later they were using complex heuristics to approximate Bayesian inference, and this was progress.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Imagine encountering a programmer from the future who knows how to program an AGI and asking them “How do you do that whole multi-level world-modeling thing? Can you show me the algorithm?” I strongly expect that they’d say something along the lines of “oh, well, you set up a system like this and then have it take percepts like that, and then you can see how if we run this for a while on lots of data it starts building multi-level descriptions of the universe. Here, let me walk you through what it looks like for the system to discover general relativity.”
Since I don’t know of a way to set up a system such that it would knowably and reliably start modeling the universe in this sense, I suspect that we’re missing some tools.
I’m not sure whether your view is of the form “actually the programmer of the future would say ‘I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time’”, or whether it’s of the form “actually we do know how to set up that system already”, or whether it’s something else entirely. But if it’s the second one, then please tell! :-)
I thought you hired Jessica for exactly that. I have these slides and everything that I was so sad I wouldn’t get to show you because you’d know all about probabilistic programming after hiring Jessica.
Thanks for the clarifications—I’ll make this short.
Probabilistic graphical models were definitely a key theoretical development, but they hardly swept the field of expert systems. From what I remember, in terms of practical applications, they immediately replaced or supplemented expert systems in only a few domains—such as medical diagnostic systems. Complex ad hoc expert systems continued to dominate unchallenged in most fields for decades: in robotics, computer vision, speech recognition, game AI, fighter jets, etc etc basically everything important. As far as I am aware the current ANN revolution is truly unique in that it is finally replacing expert systems across most of the board—although there are still holdouts (as far as I know most robotic controllers are still expert systems, as are fighter jets, and most Go AI systems).
The ANN solutions are more complex than the manually crafted expert systems they replace—but the complexity is automatically generated. The code the developers actually need to implement and manage is vastly simpler—this is the great power and promise of machine learning.
Here is a simple general truth—the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there is an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
Well, that would be interesting.
Anyone who has spent serious time working in graphics has also spent serious time thinking about how to create the matrix—if given enough computing power. If you got, say, a thousand of the brightest engineers in different simulation-related fields, from physics to graphics, and got them all working on a large mega-project with huge funds, it could probably be implemented today. You’d start with a hierarchical/multi-resolution modelling graph—using, say, octrees or kd-trees over voxel cells, and a general set of hierarchical bidirectional inference operators for tracing paths and interactions.
To make it efficient, you need a huge army of local approximation models for different phenomena at different scales—low-level quantum codes just in case, particle-level codes, molecular bio codes, fluid dynamics, rigid body, etc. It’s a sea of codes with decision-tree-like code to decide which models to use where and when.
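A cartoon of the kind of dispatch logic I mean, with invented model names and thresholds (a sketch of the idea, not of any real engine):

```python
def pick_model(scale_m, error_budget):
    """Choose the cheapest local approximation that meets the error budget at this scale.
    The thresholds and model names here are made up for illustration."""
    if scale_m < 1e-9 and error_budget < 1e-6:
        return "quantum_code"        # only when nothing coarser is accurate enough
    if scale_m < 1e-6:
        return "molecular_code"
    if error_budget < 1e-3:
        return "particle_code"
    return "rigid_body_code"         # cheapest; good enough for most regions

# Each octree cell gets whichever code its contents and error budget demand.
for cell in [(1e-10, 1e-7), (1e-7, 1e-4), (1.0, 1e-2)]:
    print(cell, "->", pick_model(*cell))
```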
Of course with machine learning we could automatically learn most of those codes—which suddenly makes it more tractable. And then you could use that big engine as your predictive world model, once it was trained.
The problem is that to plan anything worthwhile you need to simulate human minds reasonably well, which means that to be useful the sim engine would basically need to infer copies of everyone’s minds...
And if you can do that, then you already have brain based AGI!
So I expect that the programmer from the future will say—yes, at the low level we use various brain-like neural nets, and various non-brain-like neural nets or learned virtual circuits, some operating over explicit space-time graphs. In all cases we have pretty detailed knowledge of what the circuits are doing—here, take a look at that last goal update that just propagated in your left anterior prefrontal cortex...
While the methods for finding a solution to a well-formed problem currently used in Machine Learning are relatively well understood, the solutions found are not.
And that is what really matters from a safety perspective. We can and do make some headway in understanding the solutions, as well, but the trend is towards more autonomy for the learning algorithm, and correspondingly more opaqueness.
As you mentioned, the solutions found are extremely complex. So I don’t think it makes sense to view them only in terms of approximations to some conceptually simple (but expensive) ideal solution.
If we want to understand their behaviour, which is what actually matters for safety, we will have to grapple with this complexity somehow.
Personally, I’m not optimistic about experimentation (as it is currently practiced in the ML community) being a good enough solution. There is, at least, the problem of the treacherous turn. If we’re lucky, the AI jumps the gun, and society wakes up to the possibility of an AI trying to take over. If we’re unlucky, we don’t get any warning, and the AI only behaves for long enough to gain our trust and discover a nearly fail-proof strategy. VR could help here, but I think it’s rather far from a complete solution.
BTW, SOTA for Computer Go uses ConvNets (before that, it was Monte-Carlo Tree Search, IIRC): http://machinelearning.wustl.edu/mlpapers/paper_files/icml2015_clark15.pdf ;)
I just want to point out some nuances.
1) The divide between your so-called “old CS” and “new CS” is more of a divide (or perhaps a continuum) between engineers and theorists. The former is concerned with on-the-ground systems, where quadratic-time algorithms are costly and statistics is the better weapon for dealing with real-world complexities. The latter is concerned with abstracted models where polynomial time is good enough and logical deduction is the only tool. These models will probably never be applied literally by engineers, but they provide human understanding of engineering problems, and because of their generality, they will last longer. The idea of a Turing machine will last centuries if not millennia, but a Pascal programmer might not find a job today and a Python programmer might not find a job in 20 years. Machine learning techniques constantly come in and out of vogue, but something like the PAC model will be here to stay for a long time. But of course at the end of the day it’s engineers who realize new inventions and technologies.
Theorists’ ideas can transform an entire engineering field, and engineering problems inspire new theories. We need both types of people (or rather, people across the spectrum from engineers to theorists).
2) With neural networks increasing in complexity, making the learning converge is no longer as simple as just running gradient descent. In particular, something like a K-12 curriculum will probably emerge to guide the AGI past local optima. For example, the recent paper on neural Turing machines already employed curriculum learning, as the authors couldn’t get good performance otherwise. So there is a nontrivial maintenance cost (in designing a curriculum) to keeping a neural network adapted to a changing environment, and it will not lessen unless we better our understanding of it.
Of course expert systems also have maintenance costs, of a different type. But my point is that neural networks are not free lunches.
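To be concrete about what “designing a curriculum” amounts to: order the training tasks by difficulty and only advance when the learner clears the current stage. A minimal sketch, where the trainer and the task generator are placeholders for whatever model and task you actually use:

```python
def curriculum_train(train_one_epoch, make_tasks, stages=(1, 2, 4, 8), threshold=0.9):
    """Train on progressively harder task distributions (e.g. longer sequences).

    `train_one_epoch(tasks) -> accuracy` and `make_tasks(difficulty)` are placeholders
    for the actual model update and task sampler.
    """
    for difficulty in stages:
        tasks = make_tasks(difficulty)
        accuracy = 0.0
        while accuracy < threshold:            # stay at this stage until it is mastered,
            accuracy = train_one_epoch(tasks)  # which helps avoid bad local optima early on
    return "done"
```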
3) What caused the AI winter was that AI researchers didn’t realize how difficult it was to do what seems so natural to us—motion, language, vision, etc. They were overly optimistic because they succeeded at what was difficult for humans—chess, math, etc. I think it’s fair to say that ANNs have “swept the board” in the former category, the category of lower-level functions (machine translation, machine vision, etc), but the high-level stuff is still predominantly logical systems (formal verification, operations research, knowledge representation, etc). It’s unfortunate that the neural camp and the logical camp don’t interact too much, but I think it is a major objective to combine the flexibility of neural systems with the power and precision of logical systems.
Schmidhuber invented something called the speed prior, which weighs an algorithm according to how fast it generates the observation, rather than only how simple it is. He makes some ridiculous claims about our (physical) universe assuming the speed prior. Ostensibly one can also weigh in the accuracy of approximation to produce yet another variant of the prior. (But of course all of these will lose the universality enjoyed by the Occam prior.)
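If I recall the idea correctly (and I may be glossing over the details of Schmidhuber’s actual construction), the contrast with the Occam prior is roughly that each program gets discounted by the time it needs to produce the observation as well as by its length, in the spirit of Levin’s Kt complexity:

```latex
% Occam weight of a program p:         2^{-\ell(p)}
% Time-discounted weight (roughly):    2^{-\ell(p) - \log_2 t(p)} \;=\; 2^{-\ell(p)} / t(p)
```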
There’s a big difference between the hopelessly empirical school of machine learning, in which things are shown in experiments and then accepted as true, and real empirical science, in which we show things in small-scale experiments to build theories of how the systems in question behave in the large scale.
You can’t actually get away without any theorizing, on the basis of “Oh well, it seems to work. Ship it.” That’s actually bad engineering, although it’s more commonly accepted in engineering than in science. In a real science, you look for the laws that underlie your experimental results, or at least causally robust trends.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes. You experiment, you theorize, you find out why it’s efficient, and then you strip that of its evolutionarily coincidental trappings and make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
The Wright brothers copied both wings for lift and wing warping for 3D control from birds. Only the forward propulsion was different.
We already have that—it’s called a computer. AGI is much more specific and anthropocentric because it is relative to our specific society/culture/economy. It requires predicting and modelling human minds—and the structure of efficient software that can predict a human mind is itself a human mind.
“the structure of efficient software that can predict a human mind is itself a human mind.”—I doubt that. Why do you think this is the case? I think there are already many examples where simple statistical models (e.g. linear regression) can do a better job of predicting some things about a human than an expert human can.
Also, although I don’t think there is “one true definition” of AGI, I think there is a meaningful one which is not particularly anthropocentric, see Chapter 1 of Shane Legg’s thesis: http://www.vetta.org/documents/Machine_Super_Intelligence.pdf.
“Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
So, arguably that should include environments with humans in them. But to succeed, an AI would not necessarily have to predict or model human minds; it could instead, e.g. kill all humans, and/or create safeguards that would prevent its own destruction by any existing technology.
What? No.
A computer is a bicycle for the mind. Logic is purified thought, computers are logic engines. General intelligence can be implemented by a computer, but it is much more anthrospecific.
With respect, no, it’s just thought with all the interesting bits cut away to leave something so stripped-down it’s completely deterministic.
Sorta-kinda. They’re also arithmetic engines, floating-point engines, recording engines. They can be made into probability engines, which is the beginnings of how you implement intelligence on a computer.
No, not really. In fact, I expect that given the right way of modelling, formal verification of learning systems up to epsilon-delta bounds (in the style of PAC-learning, for instance) should be quite doable. Why? Because, as mentioned regarding PAC learning, it’s the existing foundation for machine learning.
I do agree that this post reflects an “Old Computer Science” worldview, but to be fair, that’s not Nate’s personal fault, or MIRI’s organizational fault. It’s the fault of the entire subfield of AGI that still has not bloody learned the basic lessons of statistical machine learning: that real cognition just is about probably approximately correct statistical modelling.
So as you mention, for instance, there’s an immense amount of foundational theory behind modern neural networks. Hell, if I could find the paper showing that deep networks form a “funnel” in the model’s free-energy landscape—where local minima are concentrated in that funnel and all yield more-or-less as-good test error, while the global minimum reliably overfits—I’d be posting the link myself.
The problem with deep neural networks is not that they lack theoretical foundations. It’s that most of the people going “WOW SO COOL” at deep neural networks can’t be bothered to understand the theoretical foundations. The “deep learning cabal” of researchers (out of Toronto, IIRC), and the Switzerland Cabal of Schmidhuber-Hutter-and-Legg fame, all know damn well what they are doing on an analytical level.
(And to cheer for my favorite approach, the probabilistic programming cabal has even more analytical backing, since they can throw Bayesian statistics, traditional machine learning, and programming-languages theory at their problems.)
Sure, it does all require an unusual breadth of background knowledge, but hey, this is how real science proceeds, people: shut up and read the textbooks and literature. Sorry, but if we (as in, this community) go around claiming that important problems can be tackled without background knowledge and an active literature, or with as little as the “AGI” field seems to generate, then we are not being instrumentally rational. Period. Shut up and PhD.
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
You’re not really being fair to Nate here, but let’s be charitable to you: this is fundamentally a dispute between the heuristics-and-biases school of thought about cognition and the bounded/resource-rational school of thought.
In the heuristics-and-biases school of thought, the human mind uses heuristics or biases when it believes it doesn’t have the computing power on hand to use generally intelligent inference, or sometimes the general intelligence is even construed as an emergent computational behavior of an array of heuristics and biases that happened to get thrown together by evolution in the right way. Computationally, this is saying, “When we have enough resources that only asymptotic complexity matters, we use the Old Computer Science way of just running the damn algorithm that implements optimal behavior and optimal asymptotic complexity.” Trying to extend this approach into statistical inference gets you basic Bayesianism and AIXI, which appear to have nice “optimality” guarantees, but are computationally intractable and are only optimal up to the training data you give them.
In the bounded-rationality school of thought, computing power is considered a strictly (not asymptotically) finite resource, which must be exploited in an optimal way. I’ve seen a very nice paper on how thermodynamics actually yields a formal theory for how to do this. Cognition is then analyzed as algorithmic ways to tractably build and evaluate models that deal well with the data. This approach yields increasingly fruitful analyses of such cognitive activities as causal learning, concept learning, and planning in arbitrary environments as probabilistic inference enriched with causal/logical structure.
In terms of LW posts, the former alternative is embodied in Eliezer’s Sequences, and the latter in jacob_cannell’s post on The Brain as a Universal Learning Machine and my book review of Plato’s Camera.
The kinds of steps needed to get both “AI” as such, and “Friendliness” as such, are substantively different in the “possible worlds” where the two different schools of thought apply. Or, perhaps, both are true in certain ways, and what we’re really talking about is just two different ways of building minds. Personally, I think the one true distinction is that Calude’s work on measuring nonhalting computations gives us a definitive way to deal with the kinds of self-reference scenarios that Old AGI’s “any finite computation” approach generates paradoxes in.
But time will tell and I am not a PhD, so everything I say should be taken with substantial sprinklings of salt. On the other hand, to wit, while you shouldn’t think for a second that I am one of them, I am certainly on the side of the PhDs.
(Nate: sorry for squabbling on your post. All these sorts of qualms with the research program were things I was going to bring up in person, in a much more constructive way. Still looking forward to meeting you in September!)
This isn’t really a problem, because—as you point out—the formidable researchers all “know damn well what they are doing on an analytical level”.
Thus the argument that there are people using DL without understanding it—and moreover that this is dangerous—is specious and weak because these people are not the ones actually likely to develop AGI let alone superintelligence.
Ah—the use of guarantees betrays the viewpoint problem. Instead of thinking of ‘safety’ or ‘alignment’ as some absolute binary property we can guarantee, it is more profitable to think of a complex distribution over the relative amounts of ‘safety’ or ‘alignment’ in an AI population (and any realistic AI project will necessarily involve a population due to scaling constraints). Strong guarantees may be impossible, but we can at least influence or steer the distribution by selecting for agent types that are more safe/altruistic. We can develop a scaling theory of if, how, and when these desirable properties change as agents grow in capability.
In other words—these issues are so incredibly complex that we can’t really develop any good kind of theory without a lot of experimental data to back it up.
Also—I should point out that one likely result of ANN-based AGI is the creation of partial uploads through imitation and inverse reinforcement learning—agents which are intentionally close in mindspace to their human ‘parent’ or ‘model’.
Yes, but I don’t think that’s an argument anyone has actually made. Nobody, to my knowledge, sincerely believes that we are right around the corner from superintelligent, self-improving AGI built out of deep neural networks, such that any old machine-learning professor experimenting with how to get a lower error rate in classification tasks is going to suddenly get the Earth covered in paper-clips.
Actually, no, I can think of one person who believed that: a radically underinformed layperson on reddit who, for some strange reason, believed that LessWrong is the only site with people doing “real AI” and that “[machine-learning researchers] build optimizers! They’ll destroy us all!”
Hopefully he was messing with me. Nobody else has ever made such ridiculous claims.
Sorry, wait, I’m forgetting to count sensationalistic journalists as people again. But that’s normal.
No, “guarantees” in this context meant PAC-style guarantees: “We guarantee that, with probability 1 − δ, the system will only ‘go wrong’ relative to what its sample data taught it on at most an ε fraction of cases.” You then need to plug in the epsilons and deltas you want and solve for how much sample data you need to feed the learner. The links for intro PAC lectures in the other comment given to you were quite good, by the way, although I do recommend taking a rigorous introductory machine learning class (new-grad-student level should be enough to inflict the PAC foundations on you).
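For concreteness, the textbook form of such a guarantee for a finite hypothesis class H in the realizable case: to get error at most ε with probability at least 1 − δ, it suffices for a consistent learner to see

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```

i.i.d. samples. You pick the ε and δ you care about and read off the required amount of training data.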
“Altruistic” is already a social behavior, requiring the agent to have a theory of mind and care about the minds it believes it observes in its environment. It also assumes that we can build in some way to learn what the hypothesized minds want, learn how they (ie: human beings) think, and separate the map (of other minds) from the territory (of actual people).
Note that “don’t disturb this system over there (eg: a human being) because you need to receive data from it untainted by your own causal intervention in any way” is a constraint that at least I, personally, do not know how to state in computational terms.
I think you are overhyping the PAC model. It surely is an important foundation for probabilistic guarantees in machine learning, but there are some serious limitations when you want to use it to constrain something like an AGI:
1) It only deals with supervised learning.
2) Simple things like finite automata are not learnable, but in practice it seems like humans pick them up fairly easily.
3) It doesn’t deal with temporal aspects of learning.
However, there are some modifications of the PAC model that can ameliorate these problems, like learning with membership queries (for item 2).
It’s also perhaps a bit optimistic to say that PAC-style bounds on a possibly very complex system like an AGI would be “quite doable”. We don’t even know, for example, whether DNF is learnable in polynomial time under the distribution free assumption.
I would definitely call it an open research problem to provide PAC-style bounds for more complicated hypothesis spaces and learning settings. But that doesn’t mean it’s impossible or un-doable, just that it’s an open research problem. I want a limitary theorem proved before I go calling things impossible.
Dropping the ‘formal verification’ part and replacing it with approximate error bound variance reduction, this is potentially interesting—although it also seems to be a general technique that would—if it worked well—be useful for practical training, safety aside.
Machine learning is an eclectic field with many mostly independent ‘foundations’: Bayesian statistics of course, optimization methods (Hessian-free, natural gradient, etc.), geometric methods and NLDR, statistical physics…
That being said—I’m not very familiar with the PAC learning literature yet—do you have a link to a good intro/summary/review?
That sounds kind of like the saddle point paper. It’s easy to show that in complex networks there are a large number of equivalent minima due to various symmetries and redundancies. Thus finding the actual technical ‘global optimum’ quickly becomes suboptimal when you discount for resource costs.
Yes that is the source of disagreement, but how am I not being fair? I said ‘perhaps’ - as in have you considered this? Not ‘here is why you are certainly wrong’.
Solomonoff/AIXI and more generally ‘full Bayesianism’ is useful as a thought model, but is perhaps overvalued on this site relative to the machine learning field. Compare the number of references/hits to AIXI on this site (tons) to the number on r/MachineLearning (1!). Compare the number of references for AIXI papers (~100) to other ML papers and you will see that the ML community sees AIXI and related work as minor.
The important question is: what does the optimal practical approximation of Solomonoff/Bayesian inference look like? And how different is that from what the brain does? By optimal I of course mean optimal in terms of all that really matters, which is intelligence per unit of resources.
Human intelligence, including that of Turing or Einstein, only requires about 10 watts of power and, more surprisingly, only around 10^14 switches/second or less—which is basically miraculous. A modern GPU uses more than 10^18 switches/second. You’d have to go back to a Pentium or something to get down to 10^14 switches per second. Of course the difference is that switch events in an ANN are much more powerful because they are more like memory ops, but still.
It is really really hard to make any sort of case that actual computer tech is going to become significantly more efficient than the brain anytime in the near future (at least in terms of switch events/second). There is a very strong case that all the H&B stuff is just what actual practical intelligence looks like. There is no such thing as intelligence that is not resource efficient—or alternatively we could say that any useful definition of intelligence must be resource normalized (ie utility/cost).
I’m not sure what you’re looking for in terms of the PAC-learning summary, but for a quick intro, there’s this set of slides or these two lecture notes from Scott Aaronson. For a more detailed review of the literature in the field up until the mid-1990s, there’s this paper by David Haussler, though given its length you might as well read Kearns and Vazirani’s 1994 textbook on the subject. I haven’t been able to find a more recent review of the literature though—if anyone has a link that’d be great.
It’s not that amazing when you understand PAC-learning or Markov processes well. A natively probabilistic (analogously: “natively neuromorphic”) computer can actually afford to sacrifice precision “cheaply”, in the sense that sizeable sacrifices of hardware precision actually entail fairly small injections of entropy into the distribution being modelled. Since what costs all that energy in modern computers is precision, that is, exactitude, a machine that simply expects to get things a little wrong all the time can still actually perform well, provided it is performing a fundamentally statistical task in the first place—which a mind is!
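As a quick toy check of that claim (my own illustration, not a benchmark of any real hardware): round a distribution’s probabilities to a few bits of precision and see how little it moves in KL terms relative to its entropy.

```python
import numpy as np

def kl_bits(p, q):
    """KL divergence in bits between distributions p and q."""
    return float(np.sum(p * np.log2(p / q)))

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(10))    # a random 10-outcome distribution

for bits in (12, 8, 6, 4):
    scale = 2.0 ** bits
    q = np.maximum(np.round(p * scale), 0.5) / scale   # crude low-precision representation
    q = q / q.sum()                                    # renormalize
    print(f"{bits:2d}-bit probabilities: KL(p||q) = {kl_bits(p, q):.5f} bits "
          f"(entropy of p is {-np.sum(p * np.log2(p)):.2f} bits)")
```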
Eli, this doesn’t make sense—the fact that digital logic switches are higher precision and more powerful, and thus have a higher minimum energy cost, makes the brain/mind more impressive, not less.
The energy efficiency per op in the brain is rather poor in one sense—perhaps 10^5 times larger than the minimum imposed by physics for a low-SNR analog op, but essentially all of this cost is wire energy.
The miraculous thing is how much intelligence the brain/mind achieves for such a tiny amount of computation in terms of low level equivalent bit ops/second. It suggests that brain-like ANNs will absolutely dominate the long term future of AI.
Nuh-uh :-p. The issue is that the brain’s calculations are probabilistic. When doing probabilistic calculations, you can either use very, very precise representations of computable real numbers to represent the probabilities, or you can use various lower-precision but natively stochastic representations, whose distribution over computation outcomes is the distribution being inferred.
Hence why the brain is, on the one hand, very impressive for extracting inferential power from energy and mass, but on the other hand, “not that amazing” in the sense that it, too, begins to add up to normality once you learn a little about how it works.
Of course—and using say a flop to implement a low precision synaptic op is inefficient by six orders of magnitude or so—but this just strengthens my point. Neuromorphic brain-like AGI thus has huge potential performance improvement to look forward to, even without Moore’s Law.
Yes, if you could but dissolve your concept of “brain-like”/”neuromorphic” into actual principles about what calculations different neural nets embody.
I don’t think that “switches” per second is a relevant metric here. The computation performed by a single neuron in a single firing cycle is much more complex than the computation performed by a logic gate in a single switching cycle.
The amount of computational power required to simulate a human brain in real time is estimated to be in the petaflops range. Only the largest supercomputers operate in that range, certainly not common GPUs.
You misunderstood me: the biological switch events I was referring to are synaptic ops, and they are comparable to transistor/gate switch ops in terms of minimum fundamental energy cost under a Landauer analysis.
That is a tad too high; the more accurate figure is 10^14 ops/second (10^14 synapses * an average 1 Hz spike rate). The minimal computation required to simulate a single GPU in real time is 10,000 times higher.
I’ve seen various people give estimates on the order of 10^16 flops by considering the maximum firing rate of a typical neuron (~10^2 Hz) rather than the average firing rate, as you do.
On one hand, a neuron must do some computation whether it fires or not, and a “naive” simulation would necessarily use a cycle frequency on the order of 10^2 Hz or more. On the other hand, if the result of a computation is almost always “do not fire”, then as a random variable the result has little information entropy, and this might be exploited to optimize the computation. I don’t have a strong intuition about this.
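For what it’s worth, the back-of-the-envelope arithmetic behind the two estimates being debated looks like this (using only the rough figures already quoted in this thread, not measurements):

```python
# Back-of-the-envelope arithmetic only; all figures are the rough ones
# quoted in the comments above.
synapses = 1e14          # synapse count
avg_rate = 1.0           # Hz, average spike rate (the parent comment's assumption)
max_rate = 1e2           # Hz, typical maximum firing rate
gpu_switches = 1e18      # switch events/second quoted for a modern GPU

print("avg-rate estimate:", synapses * avg_rate, "synaptic ops/sec")   # ~1e14
print("max-rate estimate:", synapses * max_rate, "synaptic ops/sec")   # ~1e16
print("GPU switch events per avg-rate synaptic op:", gpu_switches / (synapses * avg_rate))
```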
On a traditional CPU perhaps, on another GPU I don’t think so.
It’s not obvious to me that the Church programming language and execution model are based on bounded rationality theory.
I mean, the idea of using MCMC to sample the executions of probabilistic programs is certainly neat, and you can trade off bias against computing time by varying the burn-in and sample-lag parameters, but this trade-off is not provably optimal.
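For concreteness, here is a minimal Metropolis-Hastings sketch in plain Python (not Church itself; the toy target density and all parameter values are my own) showing where the burn-in and sample-lag knobs sit and why increasing them trades compute for lower bias and autocorrelation:

```python
# Minimal Metropolis-Hastings sketch. Larger burn_in discards more of the
# chain's biased early portion; larger lag (thinning) reduces autocorrelation
# between kept samples. Both cost extra compute per kept sample.
import math
import random

def log_target(x):
    # unnormalized log density of a toy posterior: mixture of two unit Gaussians
    return math.log(math.exp(-0.5 * (x - 2.0) ** 2) + math.exp(-0.5 * (x + 2.0) ** 2))

def mh_samples(n_samples, burn_in=500, lag=5, step=1.0, seed=0):
    rng = random.Random(seed)
    x, kept = 0.0, []
    for i in range(burn_in + n_samples * lag):
        proposal = x + rng.gauss(0.0, step)
        if rng.random() < math.exp(min(0.0, log_target(proposal) - log_target(x))):
            x = proposal
        if i >= burn_in and (i - burn_in) % lag == 0:
            kept.append(x)
    return kept

samples = mh_samples(1000, burn_in=2000, lag=10)
print(sum(samples) / len(samples))   # near 0 for this symmetric target
```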
If I understand correctly, provably optimal bounded rationality is marred by unsolved theoretical questions such as the one-way functions conjecture and P != NP. Even assuming that these conjectures are true, the fact that we can’t prove them implies that we often can’t prove anything interesting about the optimality of many AI algorithms.
That’s because it’s not. The probabilistic models of cognition (title drop!) implemented using Church tend to deal with what the authors call the resource-rational school of thought about cognition.
The paper about it that I read was actually using statistical thermodynamics to form its theory of bounded-optimal inference. These conjectures are irrelevant, in that we would be building reasoning systems that would make use of their own knowledge about these facts, such as it might be.
Sounds interesting, do you have a reference?
Sure. If you know statistical mechanics/thermodynamics, I’d be happy to hear your view on the paper, since I don’t know those fields.
Thanks, I’ll read it, though I’m not an expert in statistical mechanics and thermodynamics.
It is an error to confuse the “exact / approximate” axis with the “theoretical / empirical” axis. There is plenty of theoretical work in complexity theory on approximation algorithms.
There is a difference between “having an idea” and “solid theoretical foundations”. Chemists before quantum mechanics had lots of ideas. But they didn’t have a solid theoretical foundation.
Because this process is not guaranteed to yield good results. Evolution did the exact same thing to create humans, optimizing for genetic fitness. And humans still went and invented condoms.
When the entire future of mankind is at stake, you don’t drop an approach just because another one may be easier. You try every goddamn approach you have (unless “trying” is dangerous in itself, of course).
That’s a bad example. You are essentially asking researchers to predict what they will discover 50 years down the road. A more appropriate example is a person thinking he has medical expertise after reading bodybuilding and nutrition blogs on the internet, vs a person who has gone through medical school and is an MD.
I’m not asking researchers to predict what they will discover. There are different mindsets of research. One mindset is looking for heuristics that maximize short term progress on problems of direct practical relevance. Another mindset is looking for a rigorously defined overarching theory. MIRI is using the latter mindset while most other AI researchers are much closer to the former mindset.
Though humans are the most populous species of large animal on the planet.
Condoms were invented because evolution, being a blind watchmaker, forgot to make sex drive tunable with child mortality, so humans found a loophole. But whatever function humans are collectively optimizing, it still closely resembles genetic fitness.
Looking at Japan, that’s not self-evident to me :-/
Google “waifu”. No wait, don’t. :D
I’m familiar with the term :-)
So, you were talking about what humans are optimizing for..? X-)
One problem is what Bostrom would call “the treacherous turn.” When the AGI is dependent on us, satisfying us is a very good idea for it—if it’s unsatisfactory it will be deleted. Behaving nicely is so good an idea that many different goal systems will independently choose this strategy. And so the fact that an AGI appears nice is only weak statistical evidence that it would be nice if it wasn’t dependent on us, and further trials are not independent and so don’t accumulate well. This type of problem appears when the AGI develops good enough long-term planning, and has information about its creators.
Another problem is the problem of expanding action spaces. Consider an AGI that wants to gather lots of stamps (an example shamelessly stolen from a Computerphile video). When the AGI is childlike, its effective action space only looks like spending money to purchase stamps. As it becomes as smart as a human its actions expand: now it might perform a job to make money to buy stamps, or try to steal money to buy stamps, or purchase a printing press to make its own stamps, or all the sorts of things you might do if you really wanted stamps. Then, as it becomes superintelligent, the stamp-gathering robot will proceed to take over the world and try to terraform the entire earth into stamps. This is a problem for using experimental evidence, because as the set of actions expands, so do the possible preferences over actions. Which means that there are many possible sets of preferences that might lead to altruistic behavior among weak AIs but come apart once more options open up; there’s some ineliminable error when trying to predict “many-options” behavior just from “few-options” behavior.
Right, and I think you are enough of an old-timer to know one of my proposals for that particular problem: sim sandboxes where we test AGIs in an oblivious sim. Ideally the AGI is not only unaware of its creators, but is actually an atheist and thus believes there is no creator. This can solve the problem at the fundamental level.
When I proposed this long ago, the knee-jerk reaction was: but the super magic woo Bayesian SI will automagically hack its way out! Which of course is ridiculous; we control the AI’s knowledge.
Today we also have early experimental confirmation of sorts in the form of the DeepMind Atari agent, which grows up in an Atari world and never becomes aware of its true existential status. Scaling up those techniques into the future, I fully expect sandbox sim testing to remain the norm.
I agree this is a problem in theory, but it is surmountable in practice. You need to test an action space that provides sufficient coverage for the expected lifetime and impact of the agent. This can all be accomplished in comprehensive, well-designed virtual reality environments. These environments are needed anyway for high-speed training, and all successful DL systems already use them in simple form. You can’t time-accelerate the real world.
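As a toy illustration of the time-acceleration point (the 1-D “driving” world, the policy, and the numbers are all made up for the sketch), a whole population of simple agents can be stepped through many hours of simulated driving in a fraction of a second of wall-clock time:

```python
# Toy sketch only: a trivial 1-D "driving" world stepped for a whole
# population of agents at once, to illustrate faster-than-real-time,
# many-agents-in-parallel training environments. Nothing here corresponds
# to a real simulator or a real agent architecture.
import numpy as np

N_AGENTS = 256       # many agent instances share one vectorized world
DT = 0.1             # simulated seconds per step
STEPS = 10_000       # 1,000 simulated seconds per agent

rng = np.random.default_rng(0)
speed = rng.uniform(5.0, 15.0, N_AGENTS)      # m/s
position = np.zeros(N_AGENTS)                 # m
params = rng.normal(0.0, 1.0, N_AGENTS)       # one made-up "policy" parameter each

def policy(speed, params):
    # each agent nudges its speed toward a 10 m/s target, biased by its parameter
    return np.clip(params - 0.1 * (speed - 10.0), -3.0, 3.0)

for _ in range(STEPS):
    accel = policy(speed, params)
    speed = np.clip(speed + accel * DT, 0.0, 30.0)
    position += speed * DT

# 256 agents * 1,000 simulated seconds each, in a loop that finishes in well
# under a second of wall-clock time on an ordinary CPU
print("simulated agent-hours:", N_AGENTS * STEPS * DT / 3600.0)
```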
As a more real-world-relevant example (why is it that people here always use weird examples with staples or paperclips; what’s with the office supplies?), consider a self-driving car agent. The most advanced current open-world games already have highly realistic graphics and physics; you wouldn’t need much more in that department except for more realistic traffic, pedestrian, and police modelling, etc. Agents can learn to drive safely in the environment, many in parallel, and it can all run much faster than real time.
Such a sandbox seems fine for self-driving cars, but not so great for superintelligent agents. The sandbox will have limited resources that real-world agents might quickly exceed by acquiring more hardware. It would have to be much, much more realistic than a driving sim if you wanted to use it for general training of an AI that will interact with humans in very diverse ways, research physics, cause large economic disruption, etc. And if the AI itself has no plausible origin in the world, or if you leave other flaws, then sure, it might even figure out that it’s in a simulation, contaminating the experiment.
Sandboxing seems more useful for testing ideas that are well-understood enough to be inspected for success or failure, or tested without needing very good simulation of the real world. Like if you have an AI that is supposed to learn human values by doing futuristic unsupervised discovery of how the world works, and then assigns preference scores to local events by some futuristic procedure involving marked human feedback. This seems totally testable in simulation—you’ll get the wrong preferences, but might test the preference-learning method.
EDIT: It was brought to my attention that a similar sandbox sim testing idea was proposed by Chalmers in 2009 - he calls it a “Leakproof Singularity”.
I’ve trodden this ground enough that I should just do a new writeup with my responses to the standard objections.
Practical AGI will first appear as sub-human, animal-level intelligence and then human-level intelligence. Practical AGI will necessarily take the form of populations of agents due to computational constraints (costly memory transactions need to be amortized, parallel scaling constraints, etc.). This is true today, where you need to run at least around 100 AI instances on a single GPU at once to get good performance. This will remain true into the future; it’s a hard constraint from the physics of fast hardware.
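For a rough sense of why the amortization matters (the network size, agent count, and timings below are all made up; the point is just that one batched evaluation replaces a hundred separate passes over the same weights):

```python
# Rough sketch: evaluating one shared set of "policy" weights for many agent
# instances at once amortizes the cost of reading those weights, compared to
# evaluating each agent separately. Sizes are arbitrary.
import time
import numpy as np

N_AGENTS = 100
D = 2048
W = np.random.randn(D, D).astype(np.float32)            # shared weights
obs = np.random.randn(N_AGENTS, D).astype(np.float32)   # one observation per agent

t0 = time.perf_counter()
for i in range(N_AGENTS):        # one agent at a time: the weights are traversed 100 times
    _ = obs[i] @ W
t1 = time.perf_counter()
_ = obs @ W                      # all agents in one batched matmul: weights traversed once
t2 = time.perf_counter()

print(f"per-agent loop: {t1 - t0:.4f}s   batched: {t2 - t1:.4f}s")
```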
Superintelligence can imply size (a big civilization), speed, or quality. All of this is under our control. We can test a smaller population in the sandbox, we can run them at a manageable speed, and we control their knowledge. As far as we know the Greeks had brains just as powerful as ours, but a population of a million AGIs with 2,000-year-old knowledge is not that dangerous.
Obviously you don’t contain an entire superintelligent AGI civilization in the sandbox (and that would be a waste of resources regardless)! You use the sandbox to test new AGI architectures on smaller populations.
Computer graphics are advancing rapidly and will be completely revolutionized by machine learning in the decade ahead. Agents that grow up in a matrix will not be able to discern their status as easily as an agent that grew up in our world.
Sandboxing will test entire agent architectures—equivalent to DNA brain blueprints for humans—to determine if samples from those architectures have highly desirable mental properties such as altruism.
We can engineer entire world histories and scenarios to test the AGIs, and this could tie into the future of entertainment.
Remember, AGI is going to be more similar to brain emulations than not; think of the Hansonian scenario, but without the need for brain scanning.
I don’t know about this, but would be happy to hear more.
I don’t think the point is “controlling” these properties; I think the point is drawing conclusions about what an AI will do in the real world. Reduced speed might allow us to run “fast AIs” in simulation and draw conclusions about what they’ll do. Reduced speed might also let us run AI civilizations of large size (though it’s not obvious to me why you’d want such a thing) and draw conclusions about what they’ll do. Reducing the AI’s knowledge seems like a way to make a simulation more computationally tractable and therefore get better predictions about what the AI will do, but it seems like a risky way that can introduce bias into a simulation.
My real problem is that I don’t think just testing for altruism (which I assume means altruistic behavior) is remotely good enough. If we could simulate our world out past an AI becoming more powerful than the human race, and select for altruism then, I’d be happy. But I am pretty confident that there will be big problems generalizing from a simulation to reality, if that simulation has both differences and restrictions on possible actions and possible values.
If we’re just testing a self-driving car, we can make a simulation that captures the available actions (both literal outputs and the “effective actions” permitted by the dynamics) and has basically the right value function built in from the start. Additionally, self-driving cars generalize well from the model to reality. Suppose you have something unrealistic in the model (say, you have other cars follow set training trajectories rather than reacting to the actions of the car). A realistic self-driving car that does well in the simulation might be bad at some skills like negotiating for space on the road, but it won’t suddenly, say, try to use its tire tracks to spell out letters if you put it into reality with humans.
To put what I think concretely: when exposed to a difference between training and reality, a “dumb, parametric AI” projects reality onto the space it learned in training and just keeps on plugging, making it somewhat insensitive to reality being complicated and giving us a better idea of how it will generalize. But a “smart AI” doesn’t seem to have this property: it will learn the complications of reality that were omitted in testing, and can act very differently as a result. This goes back to the problem of expanding sets of effective actions.