When Marcus Hutter attempted to translate his intuitions about optimal intelligence into an equation, he was moving from philosophy to mathematics. You could say that Yudkowsky’s initial objections to AIXI were then a step backwards, into more philosophical / informal questions: ‘Can an equation distill the essence of intelligence without distilling the essence of efficient search?’ ‘Can true intelligence be unreflective?’ (Of course, both Hutter and Yudkowsky were making their proposals and criticisms with an eye toward engineering applications; talking about concrete examples like the anvil problem is more likely to be productive than talking about ‘true essences’. The real import of ‘the essence of intelligence’ is how useful and illuminating a framework is for mathematical and engineering progress.)
In practice, then, progress toward engineering can involve moving two steps forward, then one (or two, or three) steps back. The highest-value things MIRI can do right now mostly involve moving toward mathematics—including better formalizing the limitations of the AIXI equation, and coming up with formally specified alternatives—but that’s probably not true of all Friendly AI questions, and it doesn’t mean we should never take a step back and reassess whether our formal accomplishments represent actual progress toward our informal goals.
more likely to be productive than talking about ‘true essences’.
So who, in (contemporary, analytical) philosophy talks about true essences?
In practice, then, progress toward engineering can involve moving two steps forward, then one (or two, or three) steps back.
But that’s inefficient. It’s wasted effort to quantify what doesn’t work conceptually. It may be impossible to always get the conceptual stage right first time, but one can adopt a policy of getting it as firm as possible... rather than a policy of connotationally associating conceptual analysis with “semantics”, “true essences” and other bad things, and going straight to maths.
The highest-value things MIRI can do right now mostly involve moving toward mathematics—including better formalizing the limitations of the AIXI equation, and coming up with formally specified alternatives
I would have thought that the highest-value work is work that is relevant to systems that exist, or will exist in the near future. …but what I see is a lot of work on AIXI (not computationally tractable), Bayes (ditto), goal-stable agents (no one knows how to build one).
There’s nothing wrong with talking about true essences. When I described Yudkowsky’s “step backward” into philosophy (and thinking about ‘the true essence of intelligence’), I was speaking positively, of a behavior I endorse and want to see more of. My point was that progress toward engineering can exhibit a zigzagging pattern; I think we currently need more zags than zigs, but it’s certainly possible that at some future date we’ll be more zig-deprived than zag-deprived, and philosophy will get prioritized.
So who, in (contemporary, analytical) philosophy talks about true essences?
How is that relevant?
But that’s inefficient. It’s wasted effort to quantify what doesn’t work conceptually. It may be impossible to always get the conceptual stage right first time, but one can adopt a policy of getting it as firm as possible...
Writing your intuitions up in a formal, precise way can often help you better understand what they are, and whether they’re coherent. It’s a good way to inspire new ideas and spot counter-intuitive relationships between old ones, and it’s also a good way to do a sanity check on an entire framework. So I don’t think steering clear of math and logic notation is a particularly good way to enhance the quality of philosophical thought; I think it’s frequently more efficient to quickly test your ideas’ coherence and univocality.
The argument is that AIXI and Bayes assume infinite computing power, and thus simplify the problem by allowing you to work on it without needing to consider computing power limitations. If you can’t solve the easier form of the problem where you’re allowed infinite computing power, you definitely can’t solve the harder real-world version either, so you should start with the easier problem first.
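For concreteness, here is roughly what that looks like; this is a paraphrase of Hutter’s action-selection rule with the notation simplified, not an exact transcription:

$$a_t \;=\; \arg\max_{a_t} \sum_{o_t r_t} \cdots \max_{a_m} \sum_{o_m r_m} \big[ r_t + \cdots + r_m \big] \sum_{q\,:\,U(q,\,a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}$$

Here U is a universal Turing machine, q ranges over every program consistent with the interaction history of actions a, observations o, and rewards r, ℓ(q) is a program’s length, and m is the horizon. The sum over all consistent programs and the expectimax out to the horizon are exactly where unlimited computing power is being assumed; neither step can be carried out by any physical machine, which is why the equation is a specification of the problem rather than a design.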
But the difference between infinity and any finite value is infinity. Intelligence itself, or a substantial subset of it, is easy, given infinite resources, as AIXI shows. But that’s been of no use in developing real world AI: tractable approximations to AIXI aren’t powerful enough to be dangerous.
It would be embarrassing to MIRI if someone cobbled together an AI smart enough to be dangerous, and came to the world’s experts on AI safety for some safety features, only to be told “sorry guys, we haven’t got anything that’s compatible with your system, because it’s finite”.
A Monte-Carlo approximation of AIXI can play Pac-Man and other simple games (Veness et al. 2011), but some experts think AIXI approximation isn’t a fruitful path toward human-level AI. Even if that’s true, AIXI is the first model of cross-domain intelligent behavior to be so completely and formally specified that we can use it to make formal arguments about the properties which would obtain in certain classes of hypothetical agents if we could build them today. Moreover, the formality of AIXI-like agents allows researchers to uncover potential safety problems with AI agents of increasingly general capability—problems which could be addressed by additional research, as happened in the field of computer security after Lampson’s article on the confinement problem.
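As a concrete, deliberately tiny illustration of what “Monte-Carlo approximation” means in this neighborhood: the sketch below is my own toy, not the MC-AIXI-CTW agent of Veness et al. (which combines a context-tree-weighting model with a UCT-style tree search). It only replaces an exact expected-reward calculation with samples drawn from a learned empirical model, which is the same basic move in a much simpler setting.

import random
from collections import defaultdict

class CoinGuessEnv:
    """Toy environment: the agent is rewarded for guessing a biased coin."""
    def __init__(self, bias=0.8):
        self.bias = bias
    def step(self, action):                       # action is 0 or 1
        outcome = 1 if random.random() < self.bias else 0
        return outcome, int(action == outcome)    # observation, reward

class RolloutAgent:
    """Keeps per-action statistics and estimates each action's value from
    Monte-Carlo samples of that learned model (standing in for the
    incomputable mixture over programs that full AIXI would use)."""
    def __init__(self, actions=(0, 1), samples=200):
        self.actions, self.samples = actions, samples
        self.history = defaultdict(list)           # action -> observed rewards
    def update(self, action, reward):
        self.history[action].append(reward)
    def act(self):
        for a in self.actions:                     # try each action once first
            if not self.history[a]:
                return a
        def estimate(a):                           # sampled expected reward
            return sum(random.choices(self.history[a], k=self.samples)) / self.samples
        return max(self.actions, key=estimate)

env, agent, total = CoinGuessEnv(), RolloutAgent(), 0
for t in range(500):
    a = agent.act()
    obs, r = env.step(a)
    agent.update(a, r)
    total += r
print("average reward:", total / 500)              # climbs toward the coin's bias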
AIXI-like agents model a critical property of future AI systems: that they will need to explore and learn models of the world. This distinguishes AIXI-like agents from current systems that use predefined world models, or learn parameters of predefined world models. Existing verification techniques for autonomous agents (Fisher, Dennis, and Webster 2013) apply only to particular systems, and to avoiding unwanted optima in specific utility functions. In contrast, the problems described below apply to broad classes of agents, such as those that seek to maximize rewards from the environment.
For example, in 2011 Mark Ring and Laurent Orseau analyzed some classes of AIXI-like agents to show that several kinds of advanced agents will maximize their rewards by taking direct control of their input stimuli (Ring and Orseau 2011). To understand what this means, recall the experiments of the 1950s in which rats could push a lever to activate a wire connected to the reward circuitry in their brains. The rats pressed the lever again and again, even to the exclusion of eating. Once the rats were given direct control of the input stimuli to their reward circuitry, they stopped bothering with more indirect ways of stimulating their reward circuitry, such as eating. Some humans also engage in this kind of “wireheading” behavior when they discover that they can directly modify the input stimuli to their brain’s reward circuitry by consuming addictive narcotics. What Ring and Orseau showed was that some classes of artificial agents will wirehead—that is, they will behave like drug addicts.
Fortunately, there may be some ways to avoid the problem. In their 2011 paper, Ring and Orseau showed that some types of agents will resist wireheading. And in 2012, Bill Hibbard (2012) showed that the wireheading problem can also be avoided if three conditions are met: (1) the agent has some foreknowledge of a stochastic environment, (2) the agent uses a utility function instead of a reward function, and (3) we define the agent’s utility function in terms of its internal mental model of the environment. Hibbard’s solution was inspired by thinking about how humans solve the wireheading problem: we can stimulate the reward circuitry in our brains with drugs, yet most of us avoid this temptation because our models of the world tell us that drug addiction will change our motives in ways that are bad according to our current preferences.
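A toy way to see the force of conditions (2) and (3); the names and numbers below are invented for illustration, not code from Ring and Orseau or Hibbard:

# Two candidate plans, each with a predicted reward signal and a predicted
# world state according to the agent's internal model.
PLANS = {
    "do_task":  {"reward_signal": 0.7,
                 "world": {"task_done": True,  "reward_channel_hacked": False}},
    "wirehead": {"reward_signal": 1.0,
                 "world": {"task_done": False, "reward_channel_hacked": True}},
}

def reward_maximizer(plans):
    """Scores plans by the raw reward signal they are expected to produce,
    so seizing the reward channel dominates doing the task."""
    return max(plans, key=lambda p: plans[p]["reward_signal"])

def model_based_utility(world):
    """A utility function defined over the agent's internal world-model,
    in the spirit of Hibbard's third condition."""
    return 1.0 if world["task_done"] and not world["reward_channel_hacked"] else 0.0

def utility_agent(plans):
    return max(plans, key=lambda p: model_based_utility(plans[p]["world"]))

print(reward_maximizer(PLANS))   # -> "wirehead"
print(utility_agent(PLANS))      # -> "do_task"

The point is only that the two evaluation rules rank the same pair of plans differently; actually specifying a trustworthy model-based utility function is, of course, the hard part.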
Relatedly, Daniel Dewey (2011) showed that in general, AIXI-like agents will locate and modify the parts of their environment that generate their rewards. For example, an agent dependent on rewards from human users will seek to replace those humans with a mechanism that gives rewards more reliably. As a potential solution to this problem, Dewey proposed a new class of agents called value learners, which can be designed to learn and satisfy any initially unknown preferences, so long as the agent’s designers provide it with an idea of what constitutes evidence about those preferences.
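Stated very roughly, and simplifying Dewey’s formalism considerably, a value learner picks actions by averaging over a designer-supplied pool of candidate utility functions, weighted by how well each one fits the evidence so far:

$$a^{*} \;=\; \arg\max_{a} \sum_{U \in \mathcal{U}} P(U \mid h)\; \mathbb{E}\big[\,U \mid h, a\,\big]$$

where h is the interaction history, $\mathcal{U}$ is the pool of candidate utility functions, and $P(U \mid h)$ encodes the designers’ specification of what counts as evidence about the intended preferences.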
Practical AI systems are embedded in physical environments, and some experimental systems employ their environments for storing information. Now AIXI-inspired work is creating theoretical models for dissolving the agent-environment boundary used as a simplifying assumption in reinforcement learning and other models, including the original AIXI formulation (Orseau and Ring 2012b). When agents’ computations must be performed by pieces of the environment, they may be spied on or hacked by other, competing agents. One consequence shown in another paper by Orseau and Ring is that, if the environment can modify the agent’s memory, then in some situations even the simplest stochastic agent can outperform the most intelligent possible deterministic agent (Orseau and Ring 2012a).
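A stripped-down illustration of why randomness can matter once the environment has access to the agent’s internals; this toy has the environment read and simulate the agent rather than rewrite its memory, so it is a caricature of the Orseau and Ring setup rather than their construction, but the mechanism is the same: anything deterministic can be predicted and punished, while a coin flip cannot.

import random

def play(policy, rounds=1000):
    """Environment with access to the agent's decision procedure: it simulates
    the policy to predict the next move, then pays out only if the prediction
    was wrong."""
    score = 0
    for state in range(rounds):
        predicted = policy(state)     # environment's simulation of the agent
        actual = policy(state)        # the agent's actual move
        score += int(actual != predicted)
    return score

def deterministic_policy(state):
    return (7 * state + 3) % 2        # any fixed rule is perfectly predictable

def stochastic_policy(state):
    return random.randint(0, 1)       # a coin flip cannot be re-simulated

print(play(deterministic_policy))     # 0: always predicted, never paid
print(play(stochastic_policy))        # ~500: unpredicted about half the time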
I feel as though you’re engaging in pedantry for pedantry’s sake. The point is that if we can’t even solve the simplified version of the problem, there’s no way we’re going to solve the hard version—effectively, it’s saying that you have to crawl before you can walk. Your response was to point out that walking is more useful than crawling, which is really orthogonal to the problem here—the problem being, of course, the fact that we haven’t even learned to crawl yet. AIXI and Bayes are useful in that solving AGI problems in the context provided can act as a “stepping stone” to larger and bigger problems. What are you suggesting as an alternative? That MIRI tackle the bigger problems immediately? That’s not going to work.
You are still assuming that infinite systems count as simple versions of real world finite systems, but that is the assumption I am challenging: our best real world AIs aren’t cut down AIXI systems; they are something different entirely, so there is no linear progression from crawling to walking in your terms.
You are still assuming that infinite systems count as simple versions of real world finite systems
That’s not just an assumption; that’s the null hypothesis, the default position. Sure, you can challenge it if you want, but if you do, you’re going to have to provide some evidence why you think there’s going to be a qualitative difference. And even if there is some such difference, it’s still unlikely that we’re going to get literally zero insights about the problem from studying AIXI. That’s an extremely strong absolute claim, and absolute claims are almost always false. Ultimately, if you’re going to criticize MIRI’s approach, you need to provide some sort of plausible alternative, and right now, unfortunately, it doesn’t seem like there are any. As far as I can tell, AIXI is the best way to bet.
That’s not just an assumption; that’s the null hypothesis, the default position. Sure, you can challenge it if you want, but if you do, you’re going to have to provide some evidence why you think there’s going to be a qualitative difference.
I have already pointed out that the best AI systems currently existing are not cut down infinite systems.
And even if there is some such difference, it’s still unlikely that we’re going to get literally zero insights about the problem from studying AIXI. That’s an extremely strong absolute claim, and absolute claims are almost always false.
Something doesn’t have to be completely worthless to be suboptimal.
It may be impossible to always get the conceptual stage right first time, but one can adopt a policy of getting it as firm as possible... rather than a policy of connotationally associating conceptual analysis with “semantics”, “true essences” and other bad things, and going straight to maths.
I think you’ve got this backward. Conceptual understanding comes from formal understanding—not the other way around. First, you lay out the math in rigorous fashion with no errors. Then you do things with the math—very carefully. Only then do you get to have a good conceptual understanding of the problem. That’s just the way these things work; try finding a good theory of truth dating from before we had mathematical logic. Trying for conceptual understanding before actually formalizing the problem is likely to be as ineffectual as going around in the eighteenth century talking about “phlogiston” without knowing the chemical processes behind combustion.
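To give one example of what that claim looks like in practice (the example is mine): the first theory of truth usually regarded as precise is Tarski’s, and it only became stateable once there was a formal object language and metalanguage to state it in. Its central adequacy condition is that the definition of truth entail every instance of the T-schema,

$$\mathrm{True}(\ulcorner \varphi \urcorner) \;\leftrightarrow\; \varphi,$$

one instance for each sentence $\varphi$ of the object language, with the definition carried out in a strictly richer metalanguage so that the liar paradox cannot be reproduced inside the theory.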
So who, in (contemporary, analytical) philosophy talks about true essences?
But that’s inefficient. It’s wasted effort to quantify what doesn’t work conceptually. It may be impossible to always get the conceptual stage right first time, but one can adopt a policy of getting it as firm as possible... rather than a policy of connotationally associating conceptual analysis with “semantics”, “true essences” and other bad things, and going straight to maths.
I would have thought that the highest-value work is work that is relevant to systems that exist, or will exist in the near future. …but what I see is a lot of work on AIXI (not computationally tractable), Bayes (ditto), goal-stable agents (no one knows how to build one).
David Oderberg. Thus the danger of asking rhetorical questions.
And he taints the whole field? Thus the danger of supposing I would ask one rhetorical question without having another up my sleeve.
There’s nothing wrong with talking about true essences. When I described Yudkowsky’s “step backward” into philosophy (and thinking about ‘the true essence of intelligence’), I was speaking positively, of a behavior I endorse and want to see more of. My point was that progress toward engineering can exhibit a zigzagging pattern; I think we currently need more zags than zigs, but it’s certainly possible that at some future date we’ll be more zig-deprived than zag-deprived, and philosophy will get prioritized.
How is that relevant?
Writing your intuitions up in a formal, precise way can often help you better understand what they are, and whether they’re coherent. It’s a good way to inspire new ideas and spot counter-intuitive relationships between old ones, and it’s also a good way to do a sanity check on an entire framework. So I don’t think steering clear of math and logic notation is a particularly good way to enhance the quality of philosophical thought; I think it’s frequently more efficient to quickly test your ideas’ coherence and univocality.
It’s relevant to my preference for factually based critique.
Indeed. I was talking about quantification, not formalisation.
‘Formalization’ and mathematical logic are closer to what MIRI has in mind when it says ‘mathematics’. See http://intelligence.org/research-guide.
The argument is that AIXI and Bayes assume infinite computing power, and thus simplify the problem by allowing you to work on it without needing to consider computing power limitations. If you can’t solve the easier form of the problem where you’re allowed infinite computing power, you definitely can’t solve the harder real-world version either, so you should start with the easier problem first.
But the difference between infinity and any finite value is infinity. Intelligence itself, or a substantial subset of it, is easy, given infinite resources, as AIXI shows. But that’s been of no use in developing real world AI: tractable approximations to AIXI aren’t powerful enough to be dangerous.
It would be embarrassing to MIRI if someone cobbled together an AI smart enough to be dangerous, and came to the world’s experts on AI safety for some safety features, only to be told “sorry guys, we haven’t got anything that’s compatible with your system, because it’s finite”.
What’s high value again?
It’s arguably been useful in building models of AI safety; see the passage from Exploratory Engineering in AI quoted above.
I feel as though you’re engaging in pedantry for pedantry’s sake. The point is that if we can’t even solve the simplified version of the problem, there’s no way we’re going to solve the hard version—effectively, it’s saying that you have to crawl before you can walk. Your response was to point out that walking is more useful than crawling, which is really orthogonal to the problem here—the problem being, of course, the fact that we haven’t even learned to crawl yet. AIXI and Bayes are useful in that solving AGI problems in the context provided can act as a “stepping stone” to larger and bigger problems. What are you suggesting as an alternative? That MIRI tackle the bigger problems immediately? That’s not going to work.
You are still assuming that infinite systems count as simple versions of real world finite systems, but that is the assumption I am challenging: our best real world AIs aren’t cut down AIXI systems; they are something different entirely, so there is no linear progression from crawling to walking in your terms.
That’s not just an assumption; that’s the null hypothesis, the default position. Sure, you can challenge it if you want, but if you do, you’re going to have to provide some evidence why you think there’s going to be a qualitative difference. And even if there is some such difference, it’s still unlikely that we’re going to get literally zero insights about the problem from studying AIXI. That’s an extremely strong absolute claim, and absolute claims are almost always false. Ultimately, if you’re going to criticize MIRI’s approach, you need to provide some sort of plausible alternative, and right now, unfortunately, it doesn’t seem like there are any. As far as I can tell, AIXI is the best way to bet.
I have already pointed out that the best AI systems currently existing are not cut down infinite systems.
Something doesn’t have to be completely worthless to be suboptimal.
I think you’ve got this backward. Conceptual understanding comes from formal understanding—not the other way around. First, you lay out the math in rigorous fashion with no errors. Then you do things with the math—very carefully. Only then do you get to have a good conceptual understanding of the problem. That’s just the way these things work; try finding a good theory of truth dating from before we had mathematical logic. Trying for conceptual understanding before actually formalizing the problem is likely to be as ineffectual as going around in the eighteenth century talking about “phlogiston” without knowing the chemical processes behind combustion.
You need a certain kind of conceptual understanding in place to know whether a formal investigation is worthwhile or relevant.
Example?