Thanks for the reply, Jacob! You make some good points.
Why not test safety long before the system is superintelligent, say when it is a population of 100 childlike AGIs? As the population grows larger and more intelligent, the safest designs are propagated and made safer.
I endorse eli_sennesh’s response to this part :-)
This again reflects the old ‘hard’ computer science worldview, and obsession with exact solutions.
I am not under the impression that there are “exact solutions” available, here. For example, in the case of “building world-models,” you can’t even get “exact” solutions using AIXI (which does Bayesian inference using a simplicity prior in order to guess what the environment looks like; and can never figure it out exactly). And this is in the simplified setting where AIXI is large enough to contain all possible environments! We, by contrast, need to understand algorithms which allow you to build a world model of the world that you’re inside of; exact solutions are clearly off the table (and, as eli_sennesh notes, huge amounts of statistical modeling are on it instead).
I would readily accept a statistical-modeling-heavy answer to the question of “but how do you build multi-level world-models from percepts, in principle?”; and indeed, I’d be astonished if you avoided it.
Perhaps you read “we need to know how to do X in principle before we do it in practice” as “we need a perfect algorithm that gives you bit-exact solutions to X”? That’s an understandable reading; my apologies. Let me assure you again that we’re not under the illusion you can get bit-exact solutions to most of the problems we’re working on.
For example—perhaps using lots and lots of computing power makes the problem harder instead of easier. How could that be? Because with lots and lots of compute power, you are naturally trying to extrapolate the world model far far into the future, where it branches enormously [...]
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can’t just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can’t be that.
Perhaps the point where we disagree is that you think these hurdles suggest that figuring out how to do things we can’t yet do in principle is hopeless, whereas I’m under the impression that these shortcomings highlight places where we’re still confused?
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier.
Additional computing power might not make the problem literally harder, but the assumption of limitless computing power might direct your attention towards wrong parts of the search space.
For example, I suspect that the whole question about multilevel world-models might be something that arises from conceptualizing intelligence as something like AIXI, which implicitly assumes that there’s only one true model of the world. It can do this because it has infinite computing power and can just replace its high-level representation of the world with one where all high-level predictions are derived from the basic atom-level interactions, something that would be intractable for any real-world system to do. Instead real-world systems will need to flexibly switch between different kinds of models depending on the needs of the situation, and use lower-level models in situations where the extra precision is worth the expense of extra computing time. Furthermore, those lower-level models will have been defined in terms of what furthers the system’s goals, as defined on the higher-levels: it will pay preferential attention to those features of the lower-level model that allow it to further its higher-level goals.
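To make the "switch between models depending on the situation" idea concrete, here is a minimal sketch of a resource-limited model selector. Everything in it is an assumption made for illustration: the `Model` record, the `pick_model` helper, and the error and cost figures are hypothetical and are not taken from any existing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    error: float  # hypothetical expected prediction error (lower is better)
    cost: float   # hypothetical compute cost per prediction


def pick_model(models, max_error, budget):
    """Return the cheapest model that meets the precision target within budget.

    If no affordable model is precise enough, fall back to the most
    accurate model that still fits the budget.
    """
    affordable = [m for m in models if m.cost <= budget]
    if not affordable:
        raise ValueError("no model fits the compute budget")
    good_enough = [m for m in affordable if m.error <= max_error]
    if good_enough:
        return min(good_enough, key=lambda m: m.cost)
    return min(affordable, key=lambda m: m.error)


# Illustrative numbers only: three levels of description of the same world.
MODELS = [
    Model("rigid-body", error=0.10, cost=1.0),
    Model("molecular", error=0.01, cost=1e3),
    Model("subatomic", error=0.0001, cost=1e9),
]

choice = pick_model(MODELS, max_error=0.05, budget=1e4)
print(choice.name)
```

The point of the sketch is only that the lowest-level model is never "the" model: it is picked when the extra precision is worth the extra compute, and otherwise a coarser one is used.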
In the AIXI framing, the question of multilevel world-models is “what happens when the AI realizes that the true world model doesn’t contain carbon atoms as an ontological primitive”. In the resource-limited framing, that whole question isn’t even coherent, because the system has no such thing as a single true world-model. Instead the resource-limited version of how to get multilevel world-models to work is something like “how to reliably ensure that the AI will create a set of world models in which the appropriate configuration of subatomic objects in the subatomic model gets mapped to the concept of carbon atoms in the higher-level model, while the AI’s utility function continues to evaluate outcomes in terms of this concept regardless of whether it’s using the lower- or higher-level representation of it”.
As an aside, this reframed version seems like the kind of question that you would need to solve in order to have any kind of AGI in the first place, and one which experimental machine learning work would seem the best suited for, so I’d assume it to get naturally solved by AGI researchers even if they weren’t directly concerned with AI risk.
Ohoho! Well, actually, Nate, I personally subscribe to the bounded-rationality school of thinking, and I do actually think this has implications for AI safety. Specifically: as the agent acquires more resources (speed and memory), it can handle larger problems and enlarge its impact on the world, so to make a bounded-rational agent safe, we should, hypothetically, be able to state safety properties explicitly in terms of how much cognitive stuff (philosophically, it all adds up to different ingredients to that magic word “intelligence”) the agent has.
With some kind of framework like that, we’d be able to state and prove safety theorems in the form of, “This design will grow increasingly uncertain about its value function as it grows its cognitive resources, and act more cautiously until receiving more training, and we have some analytic bound telling us exactly how fast this fall-off will happen.” I can even imagine it running along the simple lines of, “As the agent’s model of the world grows more complicated, the entropy/Kolmogorov complexity of that model penalizes hypotheses about the learned value function, thus causing the agent to grow increasingly passive and wait for value training as it learns and grows.”
This requires a framework for normative uncertainty that formalizes acting cautiously when under value-uncertainty, but didn’t someone publish a thesis on that at Oxford a year or two ago?
I would readily accept a statistical-modeling-heavy answer to the question of “but how do you build multi-level world-models from percepts, in principle?”; and indeed, I’d be astonished if you avoided it.
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder—at worst, it can make the problem no easier. I agree, though, that algorithms for modeling the world from the inside can’t just extrapolate arbitrarily, on pain of exponential complexity; so whatever it takes to build and use multi-level world-models, it can’t be that.
Well, as jacob_cannell pointed out, feeding more compute-power to a bounded-rational agent ought to make it enlarge its models in terms of theory-depth, theory-preorder-connectedness, variance-explanation, and time-horizon. In very short: the branching factors and the hypothesis class get larger, making it harder to learn (if we’re thinking about statistical learning theory).
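One way to see the "larger hypothesis class is harder to learn" point is the standard sample-complexity bound for a finite hypothesis class in the realizable PAC setting, where m ≥ (ln|H| + ln(1/δ)) / ε samples suffice for a consistent learner. Treating that particular bound as the relevant one here is my assumption, and the function name `pac_sample_bound` is hypothetical.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Samples sufficient for a consistent learner over a finite hypothesis
    class (realizable case): m >= (ln|H| + ln(1/delta)) / epsilon."""
    if hypothesis_count < 1:
        raise ValueError("hypothesis_count must be at least 1")
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)


# Doubling the description length of hypotheses squares the class size,
# and the number of samples needed grows with ln|H|.
for bits in (10, 20, 40):
    print(bits, pac_sample_bound(2 ** bits, epsilon=0.1, delta=0.05))
```

The growth is only logarithmic in |H|, but |H| itself grows exponentially in how many bits the agent can spend on a model, so the required data grows roughly linearly with model size.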
There’s also the specific issue of assuming Turing-machine-level compute power, that is, assuming that “available compute steps” and “available memory” are unbounded but finite natural numbers. Since you’ve not bounded the numbers, they are effectively infinite, which of course means that two agents, each of which is “programmed” as a Turing machine with Turing-machine resources rather than strictly finite resources, can’t reason about each other: either one would need ordinal numbers to think about what the other (or itself) can do, but actually using ordinal numbers in that analysis would be necessarily wrong (in that neither actually possesses a Turing Oracle, which is equivalent to having ω_0 steps of computation).
So you get a bunch of paradox theorems making your job a lot harder.
In contrast, starting from the assumption of having strictly finite computing power is like when E.T. Jaynes starts from the assumption of having finite sample data, finite log-odds, countable hypotheses, etc.: we assume what must necessarily be true in reality to start with, and then analyze the infinite case as passing to the limit of some finite number. Pascal’s Mugging is solvable this way using normal computational Bayesian statistical techniques, for instance, if we assume that we can sample outcomes from our hypothesis distribution.
Let me assure you again that we’re not under the illusion you can get bit-exact solutions to most of the problems we’re working on.
Ok, then you are moving into the world of heuristics and approximations. Once one acknowledges that the bit-exact ‘best’ solution either does not exist or cannot be found, then there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
Consider the algorithms employed in computer graphics and simulation—which is naturally quite related to the world modelling problems in your maximize diamond example. The best algorithms and techniques employ some reasonably simple principles—such as hierarchical bounded approximations over octrees, or bidirectional path tracing—but a full system is built from a sea of special case approximations customized to particular types of spatio-temporal patterns. Nobody bothers trying to prove that new techniques are better than old, nobody bothers using formal tools to analyze the techniques, because the algorithmic approximation tradeoff surface is far too complex.
In an approximation driven field, new techniques are arrived at through intuitive natural reasoning and are evaluated experimentally. Modern machine learning seems confusing and ad-hoc to mathematicians and traditional computer scientists because it is also an approximation field.
Why not test safety long before the system is superintelligent, say when it is a population of 100 childlike AGIs? As the population grows larger and more intelligent, the safest designs are propagated and made safer.
I endorse eli_sennesh’s response to this part :-)
Ok, eli said:
Because that requires a way to state and demonstrate safety properties such that safety guarantees obtained with small amounts of resources remain strong when the system gets more resources. More on that below.
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence. When working in an approximation framework where formal analysis does not work and everything must be proven experimentally—this is simply the best that we can do.
If we could somehow ‘guarantee’ safety that would be nice, but can we guarantee safety of future human populations?
And now we get into that other issue—if you focus entirely on solving problems with unlimited computation, you avoid thinking about what the final practical resource efficient solutions look like, and you avoid the key question of how resource efficient the brain is. If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
So if AGI is broad enough to include artificial brains or ems—then a friendly AI theory which can provide safety guarantees for AGI in general should be able to provide guarantees for artificial brains—correct? Or is it your view that the theory will be more narrow and will only cover particular types of AGI? If so—what types?
I think those scope questions are key, but I don’t want to come off as a hopeless negative critic—we can’t really experiment with AGI just yet, and we may have limited time for experimentation. So to the extent that theory could lead practice—that would be useful if at all possible.
Hmm. If you have lots and lots of computing power, you can always just… not use it. It’s not clear to me how additional computing power can make the problem harder
I hope the context indicated that I was referring to conceptual hardness/difficulty in finding the right algorithm. For example consider the problem of simulating an infinite universe. If you think about the problem first in the case of lots of compute power, it may actually become a red herring. The true solution will involve something like an output sensitive algorithm (asymptotic complexity does not depend at all on the world size) - as in some games—and thus having lots of compute is irrelevant.
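As a rough illustration of what I mean by output sensitive, here is a toy lazily generated world of the kind some games use. The class `LazyWorld`, its hash-based cell generator, and the four "terrain types" are all assumptions invented for this sketch. The work done depends on how many cells are observed, not on how large the world is.

```python
import hashlib


class LazyWorld:
    """Conceptually unbounded 2D world whose cells are generated only when observed."""

    def __init__(self, seed):
        self.seed = seed
        self.generated = {}  # cache of the cells that have actually been observed

    def cell(self, x, y):
        """Deterministically derive a cell's content from the seed and its coordinates."""
        key = (x, y)
        if key not in self.generated:
            digest = hashlib.sha256(f"{self.seed}:{x}:{y}".encode()).digest()
            self.generated[key] = digest[0] % 4  # hypothetical terrain type 0..3
        return self.generated[key]

    def observe(self, cx, cy, radius):
        """Generate only the cells inside the observer's square view window."""
        return [
            [self.cell(x, y) for x in range(cx - radius, cx + radius + 1)]
            for y in range(cy - radius, cy + radius + 1)
        ]


world = LazyWorld(seed=42)
view = world.observe(10 ** 12, -(10 ** 12), radius=2)
print(len(world.generated))  # number of cells materialized so far
```

Nothing here gets easier or harder with more compute: the cost is set by the size of the observer's window, which is the sense in which thinking about the unlimited-compute case first can be a distraction.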
I suspect that your maximize diamond across the universe problem is FAI-complete. The hard part is specifying the ‘diamond utility function’, because diamonds are a pattern in the mind that depends on the world model in the mind. The researcher needs to transfer a significant fraction of their world model or mind program into the machine—and if you go to all that trouble then you might as well use a better goal. The simplest solution probably involves uploading.
Thanks again, Jacob. I don’t have time to reply to all of this, but let me reply to one part:
Once one acknowledges that the bit-exact ‘best’ solution either does not exist or cannot be found, then there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
I don’t buy this. Consider the “expert systems” of the seventies, which used curated databases of logical sentences and reasoned from those using a whole lot of ad-hoc rules. They could just as easily have said “Well we need to build systems that deal with lots of special cases, and you can never be certain about the world. We cannot get exact solutions, and so we are doomed to the zone of heuristics and tradeoffs where the only interesting solutions are too complex to analyze formally.” But they would have been wrong. There were tools and concepts and data structures that they were missing. Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
So I don’t buy that “we can’t get exact solutions” implies “we’re consigned to complex heuristics.” People were using complicated ad-hoc rules to approximate logic, and then later they were using complex heuristics to approximate Bayesian inference, and this was progress.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Imagine encountering a programmer from the future who knows how to program an AGI and asking them “How do you do that whole multi-level world-modeling thing? Can you show me the algorithm?” I strongly expect that they’d say something along the lines of “oh, well, you set up a system like this and then have it take percepts like that, and then you can see how if we run this for a while on lots of data it starts building multi-level descriptions of the universe. Here, let me walk you through what it looks like for the system to discover general relativity.”
Since I don’t know of a way to set up a system such that it would knowably and reliably start modeling the universe in this sense, I suspect that we’re missing some tools.
I’m not sure whether your view is of the form “actually the programmer of the future would say ‘I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time’” or whether it’s of the form “actually we do know how to set up that system already”, or whether it’s something else entirely. But if it’s the second one, then please tell! :-)
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
I thought you hired Jessica for exactly that. I have these slides and everything that I was so sad I wouldn’t get to show you because you’d know all about probabilistic programming after hiring Jessica.
Thanks for the clarifications—I’ll make this short.
Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
Probabilistic graphical models were definitely a key theoretical development, but they hardly swept the field of expert systems. From what I remember, in terms of practical applications, they immediately replaced or supplemented expert systems in only a few domains—such as medical diagnostic systems. Complex ad hoc expert systems continued to dominate unchallenged in most fields for decades: in robotics, computer vision, speech recognition, game AI, fighter jets, etc etc basically everything important. As far as I am aware the current ANN revolution is truly unique in that it is finally replacing expert systems across most of the board—although there are still holdouts (as far as I know most robotic controllers are still expert systems, as are fighter jets, and most Go AI systems).
The ANN solutions are more complex than the manually crafted expert systems they replace—but the complexity is automatically generated. The code the developers actually need to implement and manage is vastly simpler—this is the great power and promise of machine learning.
Here is a simple general truth—the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there are an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Well, that would be interesting.
I’m not sure whether your view is of the form “actually the programmer of the future would say ‘I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time’” or whether it’s of the form “actually we do know how to set up that system [multi-level model] already”, or whether it’s something else entirely. But if it’s the second one, then by all means, please tell :-)
Anyone who has spent serious time working in graphics has also spent serious time thinking about how to create the matrix—if given enough computer power. If you got say a thousand of the various brightest engineers in different simulation related fields, from physics to graphics, and got them all working on a large mega project with huge funds it could probably be implemented today. You’d start with a hierarchical/multi-resolution modelling graph—using say octrees or kdtrees over voxel cells, and a general set of hierarchical bidirectional inference operators for tracing paths and interactions.
To make it efficient, you need a huge army of local approximation models for different phenomena at different scales—low level quantum codes just in case, particle level codes, molecular bio codes, fluid dynamics, rigid body, etc etc. It’s a sea of codes with decision tree like code to decide which models to use where and when.
Of course with machine learning we could automatically learn most of those codes—which suddenly makes it more tractable. And then you could use that big engine as your predictive world model, once it was trained.
The problem is that to plan anything worthwhile you need to simulate human minds reasonably well, which means that to be useful the sim engine would basically need to infer copies of everyone’s minds...
And if you can do that, then you already have brain based AGI!
So I expect that the programmer from the future will say: yes, at the low level we use various brain-like neural nets, and various non-brain-like neural nets or learned virtual circuits, some operating over explicit space-time graphs. In all cases we have pretty detailed knowledge of what the circuits are doing. Here, take a look at that last goal update that just propagated in your left anterior prefrontal cortex...
While the methods for finding a solution to a well-formed problem currently used in Machine Learning are relatively well understood, the solutions found are not.
And that is what really matters from a safety perspective. We can and do make some headway in understanding the solutions, as well, but the trend is towards more autonomy for the learning algorithm, and correspondingly more opaqueness.
As you mentioned, the solutions found are extremely complex. So I don’t think it makes sense to view them only in terms of approximations to some conceptually simple (but expensive) ideal solution.
If we want to understand their behaviour, which is what actually matters for safety, we will have to grapple with this complexity somehow.
Personally, I’m not optimistic about experimentation (as it is currently practiced in the ML community) being a good enough solution. There is, at least, the problem of the treacherous turn. If we’re lucky, the AI jumps the gun, and society wakes up to the possibility of an AI trying to take over. If we’re unlucky, we don’t get any warning, and the AI only behaves for long enough to gain our trust and discover a nearly fail-proof strategy. VR could help here, but I think it’s rather far from a complete solution.
1) The divide between your so-called “old CS” and “new CS” is more of a divide (or perhaps a continuum) between engineers and theorists. The former is concerned with on-the-ground systems, where quadratic-time algorithms are costly and statistics is the better weapon for dealing with real-world complexities. The latter is concerned with abstracted models where polynomial time is good enough and logical deduction is the only tool. These models will probably never be applied literally by engineers, but they provide human understanding of engineering problems, and because of their generality, they will last longer. The idea of a Turing machine will last centuries if not millennia, but a Pascal programmer might not find a job today and a Python programmer might not find a job in 20 years. Machine learning techniques constantly come in and out of vogue, but something like the PAC model will be here to stay for a long time. But of course at the end of the day it’s engineers who realize new inventions and technologies.
Theorists’ ideas can transform an entire engineering field, and engineering problems inspire new theories. We need both types of people (or rather, people across the spectrum from engineers to theorists).
2) With neural networks increasing in complexity, making the learning converge is no longer as simple as just running gradient descent. In particular, something like a K12 curriculum will probably emerge to guide the AGI past local optima. For example, the recent paper on neural Turing machines has already employed curriculum learning, as the authors couldn’t get good performance otherwise. So there is a nontrivial maintenance cost (in designing a curriculum) to a neural network so that it adapts to a changing environment, which will not lessen if we don’t better our understanding of it.
Of course expert systems also have maintenance costs, of a different type. But my point is that neural networks are not free lunches.
3) What caused the AI winter was that AI researchers didn’t realize how difficult it was to do what seems so natural to us: motion, language, vision, etc. They were overly optimistic because they had succeeded at things that are difficult for humans: chess, math, etc. I think it’s fair to say the ANNs have “swept the board” in the former category, the category of lower-level functions (machine translation, machine vision, etc.), but the high-level stuff is still predominantly logical systems (formal verification, operations research, knowledge representation, etc.). It’s unfortunate that the neural camp and logical camp don’t interact too much, but I think it is a major objective to combine the flexibility of neural systems with the power and precision of logical systems.
Here is a simple general truth—the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there are an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
Schmidhuber invented something called the speed prior that weighs an algorithm according to how fast it generates the observation, rather than how simple it is. He makes some ridiculous claims about our (physical) universe assuming the speed prior. Ostensibly one can also weigh in accuracy of approximation in there to produce another variant of prior. (But of course all of these will lose the universality enjoyed by the Occam prior)
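A toy comparison may help show how the two kinds of prior can rank the same hypotheses differently. This is a caricature and not Schmidhuber's actual construction: the `speed_weight` formula (simplicity weight divided by running time), the function names, and the two candidate programs with their lengths and step counts are all assumptions made up for illustration.

```python
def occam_weight(length_bits):
    """Simplicity-style weight: shorter descriptions get exponentially more mass."""
    return 2.0 ** -length_bits


def speed_weight(length_bits, steps):
    """Toy speed-style weight (an assumption): simplicity weight discounted by runtime."""
    return 2.0 ** -length_bits / steps


def normalize(weights):
    """Rescale a dict of non-negative weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical programs: (description length in bits, steps to produce the observation).
candidates = {
    "short_but_slow": (10, 1_000_000),
    "long_but_fast": (20, 10),
}

occam = normalize({n: occam_weight(l) for n, (l, _) in candidates.items()})
speed = normalize({n: speed_weight(l, t) for n, (l, t) in candidates.items()})

print(max(occam, key=occam.get))
print(max(speed, key=speed.get))
```

Under the simplicity weighting the short program dominates; once runtime is charged for, the longer but faster program wins, which is the same tension as between a simple model and its more complex efficient approximations.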
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence.
There’s a big difference between the hopelessly empirical school of machine learning, in which things are shown in experiments and then accepted as true, and real empirical science, in which we show things in small-scale experiments to build theories of how the systems in question behave in the large scale.
You can’t actually get away without any theorizing, on the basis of “Oh well, it seems to work. Ship it.” That’s actually bad engineering, although it’s more commonly accepted in engineering than in science. In a real science, you look for the laws that underlie your experimental results, or at least causally robust trends.
If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes. You experiment, you theorize, you find out why it’s efficient, and then you strip that of its evolutionarily coincidental trappings and make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes.
The Wright brothers copied both wings for lift and wing warping for 3D control from birds. Only the forward propulsion was different.
make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
We already have that—it’s called a computer. AGI is much more specific and anthropocentric because it is relative to our specific society/culture/economy. It requires predicting and modelling human minds—and the structure of efficient software that can predict a human mind is itself a human mind.
“the structure of efficient software that can predict a human mind is itself a human mind.”—I doubt that. Why do you think this is the case? I think there are already many examples where simple statistical models (e.g. linear regression) can do a better job of predicting some things about a human than an expert human can.
“Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
So, arguably that should include environments with humans in them. But to succeed, an AI would not necessarily have to predict or model human minds; it could instead, e.g. kill all humans, and/or create safeguards that would prevent its own destruction by any existing technology.
A computer is a bicycle for the mind. Logic is purified thought, computers are logic engines. General intelligence can be implemented by a computer, but it is much more anthrospecific.
With respect, no, it’s just thought with all the interesting bits cut away to leave something so stripped-down it’s completely deterministic.
computers are logic engines
Sorta-kinda. They’re also arithmetic engines, floating-point engines, recording engines. They can be made into probability engines, which is the beginnings of how you implement intelligence on a computer.
+1
Ohoho! Well, actually Nate, I personally subscribe to the bounded-rationality school of thinking, and I do actually think this has implications for AI safety. Specifically: as the agent acquires more resources (speed and memory), it can handle larger problems and enlarge its impact on the world, so to make a bounded-rational agent safe, we should, hypothetically, be able to state safety properties explicitly in terms of how much cognitive stuff (philosophically, it all adds up to different ingredients of that magic word “intelligence”) the agent has.
With some kind of framework like that, we’d be able to state and prove safety theorems in the form of, “This design will grow increasingly uncertain about its value function as it grows its cognitive resources, and act more cautiously until receiving more training, and we have some analytic bound telling us exactly how fast this fall-off will happen.” I can even imagine it running along the simple lines of, “As the agent’s model of the world grows more complicated, the entropy/Kolmogorov complexity of that model penalizes hypotheses about the learned value function, thus causing the agent to grow increasingly passive and wait for value training as it learns and grows.”
This requires a framework for normative uncertainty that formalizes acting cautiously when under value-uncertainty, but didn’t someone publish a thesis on that at Oxford a year or two ago?
Can I laugh maniacally at least a little bit now?
Well, as jacob_cannell pointed out, feeding more compute-power to a bounded-rational agent ought to make it enlarge its models in terms of theory-depth, theory-preorder-connectedness, variance-explanation, and time-horizon. In very short: the branching factors and the hypothesis class get larger, making it harder to learn (if we’re thinking about statistical learning theory).
There’s also the specific issue of assuming Turing-machine-level compute power, that is, assuming that “available compute steps” and “available memory” are each an unbounded but finite natural number. Since you’ve not bounded the number, it’s effectively infinite, which of course means that two agents, each of which is “programmed” as a Turing machine with Turing-machine resources rather than strictly finite resources, can’t reason about each other: either one would need ordinal numbers to think about what the other (or itself) can do, but actually using ordinal numbers in that analysis would be necessarily wrong (in that neither actually possesses a Turing Oracle, which is equivalent to having w_0 steps of computation).
So you get a bunch of paradox theorems making your job a lot harder.
In contrast, starting from the assumption of having strictly finite computing power is like when E.T. Jaynes starts from the assumption of having finite sample data, finite log-odds, countable hypotheses, etc.: we assume what must necessarily be true in reality to start with, and then analyze the infinite case as passing to the limit of some finite number. Pascal’s Mugging is solvable this way using normal computational Bayesian statistical techniques, for instance, if we assume that we can sample outcomes from our hypothesis distribution.
Ok, then you are moving into the world of heuristics and approximations. Once one acknowledges that the bit-exact ‘best’ solution either does not exist or cannot be found, there is an enormous (infinite, really) space of potential solutions which have different tradeoffs in their expected utility in different scenarios/environments, along with different cost structures. The most interesting solutions are often so complex that they are too difficult to analyze formally.
Consider the algorithms employed in computer graphics and simulation—which is naturally quite related to the world modelling problems in your maximize diamond example. The best algorithms and techniques employ some reasonably simple principles—such as hierarchical bounded approximations over octrees, or bidirectional path tracing—but a full system is built from a sea of special case approximations customized to particular types of spatio-temporal patterns. Nobody bothers trying to prove that new techniques are better than old, nobody bothers using formal tools to analyze the techniques, because the algorithmic approximation tradeoff surface is far too complex.
In an approximation driven field, new techniques are arrived at through intuitive natural reasoning and are evaluated experimentally. Modern machine learning seems confusing and ad-hoc to mathematicians and traditional computer scientists because it is also an approximation field.
Ok, eli said:
My perhaps predictable reply is that this safety could be demonstrated experimentally—for example by demonstrating altruism/benevolence as you scale up the AGI in terms of size/population, speed, and knowledge/intelligence. When working in an approximation framework where formal analysis does not work and everything must be proven experimentally—this is simply the best that we can do.
If we could somehow ‘guarantee’ safety that would be nice, but can we guarantee safety of future human populations?
And now we get into that other issue—if you focus entirely on solving problems with unlimited computation, you avoid thinking about what the final practical resource efficient solutions look like, and you avoid the key question of how resource efficient the brain is. If the brain is efficient, then successful AGI is highly likely to take the form of artificial brains.
So if AGI is broad enough to include artificial brains or ems—then a friendly AI theory which can provide safety guarantees for AGI in general should be able to provide guarantees for artificial brains—correct? Or is it your view that the theory will be more narrow and will only cover particular types of AGI? If so—what types?
I think those scope questions are key, but I don’t want to come off as a hopeless negative critic—we can’t really experiment with AGI just yet, and we may have limited time for experimentation. So to the extent that theory could lead practice—that would be useful if at all possible.
I hope the context indicated that I was referring to conceptual hardness/difficulty in finding the right algorithm. For example, consider the problem of simulating an infinite universe. If you think about the problem first in the case of lots of compute power, it may actually become a red herring. The true solution will involve something like an output-sensitive algorithm (asymptotic complexity does not depend at all on the world size), as in some games, and thus having lots of compute is irrelevant.
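To make the output-sensitive idea concrete, here is a minimal sketch of lazy world generation in the style some games use. It is only an illustration under stated assumptions: the names `LazyWorld`, `chunk_cells`, and `CHUNK` are hypothetical, and the "world" is just a grid of pseudo-random bits. The point is that the work done depends on the region actually observed, not on the size of the (conceptually unbounded) world.

```python
import hashlib

CHUNK = 16  # cells per chunk side; arbitrary illustrative choice


def chunk_cells(seed: int, cx: int, cy: int) -> list[list[int]]:
    """Deterministically generate one chunk of an unbounded 2D world.

    A SHA-256 digest has 256 bits, exactly one bit per cell of a 16x16 chunk.
    """
    digest = hashlib.sha256(f"{seed}:{cx}:{cy}".encode()).digest()
    bits = int.from_bytes(digest, "big")
    return [
        [(bits >> (y * CHUNK + x)) & 1 for x in range(CHUNK)]
        for y in range(CHUNK)
    ]


class LazyWorld:
    """Unbounded world whose chunks are created only when observed."""

    def __init__(self, seed: int) -> None:
        self.seed = seed
        self.cache: dict[tuple[int, int], list[list[int]]] = {}

    def cell(self, x: int, y: int) -> int:
        # Floor division and modulo keep negative coordinates consistent.
        key = (x // CHUNK, y // CHUNK)
        if key not in self.cache:
            self.cache[key] = chunk_cells(self.seed, *key)
        return self.cache[key][y % CHUNK][x % CHUNK]

    def view(self, x0: int, y0: int, w: int, h: int) -> list[list[int]]:
        """Return the w-by-h window whose top-left corner is (x0, y0)."""
        return [
            [self.cell(x, y) for x in range(x0, x0 + w)]
            for y in range(y0, y0 + h)
        ]


world = LazyWorld(seed=42)
window = world.view(-5, -5, 10, 10)  # straddles four chunks around the origin
far_cell = world.cell(10**12, -(10**12))  # costs one more chunk, no matter how far
```

Only the chunks touched by `view` and `cell` are ever materialized, so memory and time scale with what is looked at rather than with the extent of the world.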
I suspect that your maximize diamond across the universe problem is FAI-complete. The hard part is specifying the ‘diamond utility function’, because diamonds are a pattern in the mind that depends on the world model in the mind. The researcher needs to transfer a significant fraction of their world model or mind program into the machine—and if you go to all that trouble then you might as well use a better goal. The simplest solution probably involves uploading.
Thanks again, Jacob. I don’t have time to reply to all of this, but let me reply to one part:
I don’t buy this. Consider the “expert systems” of the seventies, which used curated databases of logical sentences and reasoned from those using a whole lot of ad-hoc rules. They could just as easily have said “Well we need to build systems that deal with lots of special cases, and you can never be certain about the world. We cannot get exact solutions, and so we are doomed to the zone of heuristics and tradeoffs where the only interesting solutions are too complex to analyze formally.” But they would have been wrong. There were tools and concepts and data structures that they were missing. Judea Pearl (and a whole host of others) showed up, formalized probabilistic graphical models, related them to Bayesian inference, and suddenly a whole class of ad-hoc solutions were superseded.
So I don’t buy that “we can’t get exact solutions” implies “we’re consigned to complex heuristics.” People were using complicated ad-hoc rules to approximate logic, and then later they were using complex heuristics to approximate Bayesian inference, and this was progress.
My claim is that there are other steps such as those that haven’t been made yet, that there are tools on the order of “causal graphical models” that we are missing.
Imagine encountering a programmer from the future who knows how to program an AGI and asking them “How do you do that whole multi-level world-modeling thing? Can you show me the algorithm?” I strongly expect that they’d say something along the lines of “oh, well, you set up a system like this and then have it take percepts like that, and then you can see how if we run this for a while on lots of data it starts building multi-level descriptions of the universe. Here, let me walk you through what it looks like for the system to discover general relativity.”
Since I don’t know of a way to set up a system such that it would knowably and reliably start modeling the universe in this sense, I suspect that we’re missing some tools.
I’m not sure whether your view is of the form “actually the programmer of the future would say “I don’t know how it’s building a model of the world either, it’s just a big neural net that I trained for a long time”″ or whether it’s of the form “actually we do know how to set up that system already”, or whether it’s something else entirely. But if it’s the second one, then please tell! :-)
I thought you hired Jessica for exactly that. I have these slides and everything that I was so sad I wouldn’t get to show you because you’d know all about probabilistic programming after hiring Jessica.
Thanks for the clarifications—I’ll make this short.
Probabilistic graphical models were definitely a key theoretical development, but they hardly swept the field of expert systems. From what I remember, in terms of practical applications, they immediately replaced or supplemented expert systems in only a few domains—such as medical diagnostic systems. Complex ad hoc expert systems continued to dominate unchallenged in most fields for decades: in robotics, computer vision, speech recognition, game AI, fighter jets, etc etc basically everything important. As far as I am aware the current ANN revolution is truly unique in that it is finally replacing expert systems across most of the board—although there are still holdouts (as far as I know most robotic controllers are still expert systems, as are fighter jets, and most Go AI systems).
The ANN solutions are more complex than the manually crafted expert systems they replace—but the complexity is automatically generated. The code the developers actually need to implement and manage is vastly simpler—this is the great power and promise of machine learning.
Here is a simple general truth: the Occam simplicity prior does imply that simpler hypotheses/models are more likely, but for any simple model there is an infinite family of approximations to that model of escalating complexity. Thus more efficient approximations naturally tend to have greater code complexity, even though they approximate a much simpler model.
Well, that would be interesting.
Anyone who has spent serious time working in graphics has also spent serious time thinking about how to create the matrix—if given enough computer power. If you got say a thousand of the various brightest engineers in different simulation related fields, from physics to graphics, and got them all working on a large mega project with huge funds it could probably be implemented today. You’d start with a hierarchical/multi-resolution modelling graph—using say octrees or kdtrees over voxel cells, and a general set of hierarchical bidirectional inference operators for tracing paths and interactions.
To make it efficient, you need a huge army of local approximation models for different phenomena at different scales—low level quantum codes just in case, particle level codes, molecular bio codes, fluid dynamics, rigid body, etc etc. It’s a sea of codes with decision tree like code to decide which models to use where and when.
Of course with machine learning we could automatically learn most of those codes—which suddenly makes it more tractable. And then you could use that big engine as your predictive world model, once it was trained.
The problem is that to plan anything worthwhile you need to simulate human minds reasonably well, which means that to be useful the sim engine would basically need to infer copies of everyone’s minds...
And if you can do that, then you already have brain based AGI!
So I expect that the programmer from the future will say: yes, at the low level we use various brain-like neural nets, and various non-brain-like neural nets or learned virtual circuits, some operating over explicit space-time graphs. In all cases we have pretty detailed knowledge of what the circuits are doing. Here, take a look at that last goal update that just propagated in your left anterior prefrontal cortex...
While the methods for finding a solution to a well-formed problem currently used in Machine Learning are relatively well understood, the solutions found are not.
And that is what really matters from a safety perspective. We can and do make some headway in understanding the solutions, as well, but the trend is towards more autonomy for the learning algorithm, and correspondingly more opaqueness.
As you mentioned, the solutions found are extremely complex. So I don’t think it makes sense to view them only in terms of approximations to some conceptually simple (but expensive) ideal solution.
If we want to understand their behaviour, which is what actually matters for safety, we will have to grapple with this complexity somehow.
Personally, I’m not optimistic about experimentation (as it is currently practiced in the ML community) being a good enough solution. There is, at least, the problem of the treacherous turn. If we’re lucky, the AI jumps the gun, and society wakes up to the possibility of an AI trying to take over. If we’re unlucky, we don’t get any warning, and the AI only behaves for long enough to gain our trust and discover a nearly fail-proof strategy. VR could help here, but I think it’s rather far from a complete solution.
BTW, SOTA for Computer Go uses ConvNets (before that, it was Monte-Carlo Tree Search, IIRC): http://machinelearning.wustl.edu/mlpapers/paper_files/icml2015_clark15.pdf ;)
I just want to point out some nuances.
1) The divide between your so-called “old CS” and “new CS” is more of a divide (or perhaps a continuum) between engineers and theorists. The former is concerned with on-the-ground systems, where quadratic-time algorithms are costly and statistics is the better weapon for dealing with real-world complexities. The latter is concerned with abstracted models where polynomial time is good enough and logical deduction is the only tool. These models will probably never be applied literally by engineers, but they provide human understanding of engineering problems, and because of their generality, they will last longer. The idea of a Turing machine will last centuries if not millennia, but a Pascal programmer might not find a job today and a Python programmer might not find a job in 20 years. Machine learning techniques constantly come in and out of vogue, but something like the PAC model will be here to stay for a long time. But of course at the end of the day it’s engineers who realize new inventions and technologies.
Theorists’ ideas can transform an entire engineering field, and engineering problems inspire new theories. We need both types of people (or rather, people across the spectrum from engineers to theorists).
2) With neural networks increasing in complexity, making the learning converge is no longer as simple as just running gradient descent. In particular, something like a K12 curriculum will probably emerge to guide the AGI past local optima. For example, the recent paper on neural Turing machines has already employed curriculum learning, as the authors couldn’t get good performance otherwise. So there is a nontrivial maintenance cost (in designing a curriculum) to a neural network so that it adapts to a changing environment, which will not lessen if we don’t better our understanding of it.
Of course expert systems also have maintenance costs, of a different type. But my point is that neural networks are not free lunches.
3) What caused the AI winter was that AI researchers didn’t realize how difficult it was to do what seems so natural to us: motion, language, vision, etc. They were overly optimistic because they succeeded in what was difficult for humans: chess, math, etc. I think it’s fair to say the ANNs have “swept the board” in the former category, the category of lower-level functions (machine translation, machine vision, etc.), but the high-level stuff is still predominantly logical systems (formal verification, operations research, knowledge representation, etc.). It’s unfortunate that the neural camp and logical camp don’t interact much, but I think it is a major objective to combine the flexibility of neural systems with the power and precision of logical systems.
Schmidhuber invented something called the speed prior that weighs an algorithm according to how fast it generates the observation, rather than how simple it is. He makes some ridiculous claims about our (physical) universe assuming the speed prior. Ostensibly one can also weigh in accuracy of approximation in there to produce another variant of prior. (But of course all of these will lose the universality enjoyed by the Occam prior)
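As a toy illustration of how a runtime-sensitive prior can reorder hypotheses relative to a pure simplicity prior, here is a small sketch. It is deliberately simplified and is not Schmidhuber’s actual definition: it assumes a Levin-style penalty in which a program of length `length_bits` that needs `steps` steps gets weight 2^-length_bits / steps. The two hypotheses and their numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    length_bits: int  # description length of the program (hypothetical value)
    steps: int        # steps needed to produce the observation (hypothetical value)


def simplicity_weight(h: Hypothesis) -> float:
    """Occam-style weight: shorter programs get exponentially more mass."""
    return 2.0 ** -h.length_bits


def speed_weight(h: Hypothesis) -> float:
    """Simplified runtime-penalized weight (an assumption, not the exact speed prior)."""
    return 2.0 ** -h.length_bits / h.steps


def normalize(weights: dict[str, float]) -> dict[str, float]:
    """Rescale weights so they sum to 1."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


hypotheses = [
    Hypothesis("short_but_slow", length_bits=10, steps=2**20),
    Hypothesis("long_but_fast", length_bits=20, steps=2**4),
]

occam = normalize({h.name: simplicity_weight(h) for h in hypotheses})
speed = normalize({h.name: speed_weight(h) for h in hypotheses})

for name in occam:
    print(f"{name}: occam={occam[name]:.6f} speed={speed[name]:.6f}")
```

Under the simplicity weighting the short program dominates; once runtime is charged, the longer but faster program takes most of the mass. This is the same tension as between a simple model and its more complex but more efficient approximations.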
There’s a big difference between the hopelessly empirical school of machine learning, in which things are shown in experiments and then accepted as true, and real empirical science, in which we show things in small-scale experiments to build theories of how the systems in question behave in the large scale.
You can’t actually get away without any theorizing, on the basis of “Oh well, it seems to work. Ship it.” That’s actually bad engineering, although it’s more commonly accepted in engineering than in science. In a real science, you look for the laws that underlie your experimental results, or at least causally robust trends.
If the brain is efficient, and it is, then you shouldn’t try to cargo-cult copy the brain, any more than we cargo-culted feathery wings to make airplanes. You experiment, you theorize, you find out why it’s efficient, and then you strip that of its evolutionarily coincidental trappings and make an engine based on a clear theory of which natural forces govern the phenomenon in question—here, thought.
The Wright brothers copied wings for lift and wing warping for 3D control, both from birds. Only the forward propulsion was different.
We already have that—it’s called a computer. AGI is much more specific and anthropocentric because it is relative to our specific society/culture/economy. It requires predicting and modelling human minds—and the structure of efficient software that can predict a human mind is itself a human mind.
“the structure of efficient software that can predict a human mind is itself a human mind.”—I doubt that. Why do you think this is the case? I think there are already many examples where simple statistical models (e.g. linear regression) can do a better job of predicting some things about a human than an expert human can.
Also, although I don’t think there is “one true definition” of AGI, I think there is a meaningful one which is not particularly anthropocentric, see Chapter 1 of Shane Legg’s thesis: http://www.vetta.org/documents/Machine_Super_Intelligence.pdf.
“Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
So, arguably that should include environments with humans in them. But to succeed, an AI would not necessarily have to predict or model human minds; it could instead, e.g. kill all humans, and/or create safeguards that would prevent its own destruction by any existing technology.
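For reference, the informal definition quoted above is usually formalized, as I recall it from Legg and Hutter’s work on universal intelligence, roughly as follows. Here $\pi$ is the agent, $E$ is a class of computable environments, $K(\mu)$ is the Kolmogorov complexity of environment $\mu$, and $V_\mu^\pi$ is the expected total reward $\pi$ obtains in $\mu$; treat the exact conditions on $E$ as something to check against the thesis.

```latex
\Upsilon(\pi) \;:=\; \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^{\pi}
```

Nothing in this sum singles out environments containing humans; they are weighted by their complexity like any other environment, which is the sense in which the definition is not particularly anthropocentric.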
What? No.
A computer is a bicycle for the mind. Logic is purified thought, computers are logic engines. General intelligence can be implemented by a computer, but it is much more anthrospecific.
With respect, no, it’s just thought with all the interesting bits cut away to leave something so stripped-down it’s completely deterministic.
Sorta-kinda. They’re also arithmetic engines, floating-point engines, recording engines. They can be made into probability engines, which is the beginnings of how you implement intelligence on a computer.