Flight has some abstract principles that don’t depend on all the messy biological details of cells, bones and feathers. It will—pretty obviously IMO—be much the same for machine intelligence.
I disagree that it is so obvious. Much of what we call “intelligence” in humans and other animals is actually tacit knowledge about a specific environment. This knowledge accumulated gradually over billions of years, and it works through non-modular systems that improved stepwise and had to retain the relevant functionality at each step.
This is why you barely think about bipedal walking and discovered it on your own, yet even now very few people can explain how it works. It’s also why learning, for humans, largely consists of reducing a problem to something for which we have native hardware.
So intelligence, if it means successful, purposeful manipulation of the environment, does rely heavily on the particulars of our bodies, in a way that powered flight does not.
If we had good stream compressors we would be able to predict the future consequences of actions—a key ability in shaping the future. You don’t need to scan a brain to build a compressor. That is a silly approach to the problem that pushes the solution many decades into the future. Compression is “just” another computer science problem—much like searching or sorting.
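(A minimal sketch of the compression-to-prediction link asserted here—an illustration added for concreteness, not anything proposed in the thread; zlib stands in for a “good stream compressor”:)

    # Toy predictor: guess the symbol whose continuation the compressor
    # encodes most cheaply, i.e. the one that adds the least new information.
    # zlib is only a stand-in for a good stream compressor.
    import zlib

    def predict_next(history: bytes, alphabet: bytes = b"01") -> str:
        best = min(alphabet,
                   key=lambda s: len(zlib.compress(history + bytes([s]), 9)))
        return chr(best)

    history = b"01" * 500          # a highly regular stream: 0101...
    print(predict_next(history))   # expected: "0", continuing the pattern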
Yes, it’s another CS problem, but not like searching or sorting. Those are computable, while (general) compression isn’t. Not surprisingly, the optimal intelligence Hutter presents is uncomputable, as is every other method presented in every research paper that purports to be a general intelligence.
Now, you can make approximations to the ideal, perfect compressor, but that inevitably requires making decisions about what parts of the search space can be ignored at low enough cost—which itself requires insight into the structure of the search space, the very thing you were supposed to be automating!
Attempts to reduce intelligence to compression butt up against the same limits that compression does: you can be good at compressing some kinds of data only if you sacrifice the ability to compress other kinds of data.
With that said, if you can make a computable, general compressor that identifies regularities in the environment many orders of magnitude faster than evolution, then you will have made some progress.
Re: “So intelligence, if it means successful, purposeful manipulation of the environment, does rely heavily on the particulars of our bodies, in a way that powered flight does not.”
Natural selection shaped wings for roughly as long as it has shaped brains. They too are an accumulated product of millions of years of ancestral success stories. Information about both is transmitted via the genome. If there is a point of dis-analogy here between wings and brains, it is not obvious.
Okay, let me explain it this way: when people refer to intelligence, a large part of what they have in mind is the knowledge that we (tacitly) have about a specific environment. Therefore, our bodies are highly informative about a large part (though certainly not the entirety!) of what is meant by intelligence.
In contrast, the only commonality with birds that is desired in the goal “powered human flight” is … the flight thing. Birds have a solution, but they do not define the solution.
In both cases, I agree, the solution afforded by the biological system (bird or human) is not strictly necessary for the goal (flight or intelligence). And I agree that once certain insights are achieved (the workings of aerodynamic lift or the tacit knowledge humans have [such as the assumptions used in interpreting retinal images]), they can be implemented differently from how the biological system does it.
However, for a robot to match the utility of a human (a butler, say), it must know things specific to humans (like what the meanings of words are, given a particular social context), not just intelligence-related things in general, like how to infer causal maps from raw data.
FWIW, I’m thinking of intelligence this way:
“Intelligence measures an agent’s ability to achieve goals in a wide range of environments.”
http://www.vetta.org/definitions-of-intelligence/
Nothing to do with humans, really.
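(For reference, the formal version of that definition is Legg and Hutter’s universal intelligence measure, which weights an agent’s expected reward in each computable environment by the environment’s simplicity:)

    \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_\mu^\pi

where E is the set of computable environments, K(\mu) is the Kolmogorov complexity of \mu, and V_\mu^\pi is the expected total reward policy \pi earns in environment \mu.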
Then why should I care about intelligence by that definition? I want something that performs well in environments humans will want it to perform well in. That’s a tiny, tiny fraction of the set of all computable environments.
A universal intelligent agent should also perform very well in many real world environments. That is part of the beauty of the idea of universal intelligence. A powerful universal intelligence can reasonably be expected to invent nanotechnology, achieve fusion, cure cancer, and generally solve many of the world’s problems.
Oracles for uncomputable problems tend to be like that...
Also, my point is that, yes, something impossibly good could do that. And that would be good. But performing well across all computable universes (with a sorta-short description, etc.) has costs, and one cost is optimality in this universe.
Since we have to choose, I want it optimal for this universe, for purposes we deem good.
A general agent is often sub-optimal on particular problems. However, it should be able to pick them up pretty quick. Plus, it is a general agent, with all kinds of uses.
A lot of people are interested in building generally intelligent agents. We ourselves are highly general agents—i.e. you can pay us to solve an enormous range of different problems.
Generality of intelligence does not imply lack-of-adaptedness to some particular environment. What it means is more that it can potentially handle a broad range of problems. Specialized agents—on the other hand—fail completely on problems outside their domain.
Re: “Attempts to reduce intelligence to compression butt up against the same limits that compression does: you can be good at compressing some kinds of data only if you sacrifice the ability to compress other kinds of data.”
That is not a meaningful limitation. There are general purpose universal compressors. It is part of the structure of reality that sequences generated by short programs are more commonly observed. That’s part of the point of using a compressor—it is an automated way of applying Occam’s razor.
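(The formal statement behind “sequences generated by short programs are more commonly observed” is Solomonoff’s universal prior—added here for reference, with U a universal prefix machine:)

    M(x) = \sum_{p \,:\, U(p) = x*} 2^{-|p|}

The sum ranges over programs p whose output begins with x, so any sequence with a short generating program gets exponentially greater weight. Note that M itself is uncomputable—which is precisely the limitation at issue in this thread.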
There are frequently useful general purpose compressors that work by anticipating the most common regularities in the set of files typically generated by humans. But they do not, and cannot, iterate through all the short programs that could have generated the data—it’s too time-consuming.
The point was that general purpose compression is possible. Yes, you sacrifice the ability to compress other kinds of data—but those other kinds of data are highly incompressible and close to random—not the kind of data in which most intelligent agents are interested in finding patterns in the first place.
Re: “those other kinds of data are highly incompressible and close to random”
No, they look random and incompressible because effective compression algorithms optimized for this universe can’t compress them. But algorithms optimized for other computable universes may regard them as normal and have a good way to compress them.
Which kinds of data (from computable processes) are likely to be observed in this universe? Ay, there’s the rub.
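(A concrete instance of this point, added for illustration: the byte stream below is “random” as far as zlib is concerned, yet it is produced by a very short program:)

    # Pseudorandom bytes from a linear congruential generator (glibc's
    # rand() constants). zlib finds no structure to exploit and leaves
    # the data essentially uncompressed, yet the entire 100,000-byte
    # stream is determined by this tiny program plus one seed.
    import zlib

    def lcg_bytes(n, seed=42):
        out, x = bytearray(), seed
        for _ in range(n):
            x = (1103515245 * x + 12345) % 2**31
            out.append(x >> 23)    # keep the high-order 8 bits
        return bytes(out)

    data = lcg_bytes(100_000)
    print(len(zlib.compress(data, 9)), "of", len(data), "bytes")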
Re: “they look random and incompressible because effective compression algorithms optimized for this universe can’t compress them”
Compressing sequences from this universe is good enough for me.
Re: “Which kinds of data (from computable processes) are likely to be observed in this universe? Ay, there’s the rub.”
Not really—there are well-known results about that—see:
http://en.wikipedia.org/wiki/Occam’s_razor
http://www.wisegeek.com/what-is-solomonoff-induction.htm
Except that the problem you were attacking at the beginning of this thread was general intelligence, which you claimed to be solvable just by good-enough compression. But that requires knowing which parts of the search space in this universe are unlikely—which you haven’t shown how to algorithmize.
Yes, but as I keep trying to say, those results are far from enough to get something workable, and it’s not the methodology behind general compression programs.
Arithmetic compression, Huffman compression, Lempel-Ziv compression, etc. are all excellent at compressing sequences produced by small programs. Things like:
1010101010101010 110110110110110110 1011011101111011111
...etc.
Those compressors (crudely) implement a computable approximation of Solomonoff induction without iterating through programs that generate the output. How they work is not very relevant here—the point is that they act as general-purpose compressors—and compress a great range of real world data types.
The complaint that we don’t know what types of data are in the universe is just not applicable—we do, in fact, know a considerable amount about that—and that is why we can build general purpose compressors.
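(A quick check of that claim, with zlib standing in as the general-purpose compressor: the periodic sequences above collapse to almost nothing, while random bytes do not shrink at all:)

    # Low-complexity streams compress dramatically; random input
    # stays at roughly 100% of its original size.
    import os, zlib

    samples = {
        "101010... ": b"10" * 5000,
        "110110110.": b"110" * 3333,
        "random    ": os.urandom(10000),
    }
    for name, data in samples.items():
        ratio = len(zlib.compress(data, 9)) / len(data)
        print(name, "compressed to", format(ratio, ".1%"), "of original")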
What’s with complaining that compressors are uncomputable?!? Just let your search through the space of possible programs skip on to the next one whenever you spend more than an hour executing. Then you have a computable compressor. That ignores a few especially tedious and boring areas of the search space—but so what?!? Those areas can be binned with no great loss.
Did you do the math on this one? Even with only 10% of programs caught in a loop, it would take almost 400 years to get through all programs up to 24 bits long.
We need something faster.
(Do you see now why Hutter hasn’t simply run AIXI with your shortcut?)
Of course, in practice many loops can be caught, but the combinatorial explosion really does blow any such technique out of the water.
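(Checking the arithmetic behind the “almost 400 years” figure, under the comment’s own assumptions—every bitstring of up to 24 bits is tried as a program, 10% of them hit the one-hour timeout, and the rest cost nothing:)

    # 2^1 + 2^2 + ... + 2^24 candidate programs; 10% run a full hour each.
    programs = sum(2**n for n in range(1, 25))   # 33,554,430
    timeout_hours = 0.10 * programs              # one hour per stuck program
    print(timeout_hours / (24 * 365.25))         # ≈ 382 years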
Uh, I was giving a computable algorithm, not a rapid one.
The objection that compression is an uncomputable strategy is a useless one—you just use a computable approximation instead—with no great loss.
But you were implying that the uncomputability is somehow “not a problem” because of a quick fix you gave, when the quick fix actually means waiting at least 400 years—under unrealistically optimistic assumptions.
Yes, I do use a computable approximation, and my computable approximation has already done the work of identifying the important part of the search space (and the structure thereof).
And that’s the point—compression algorithms haven’t done so, except to the extent that a programmer has fed them the “insights” (known regularities of the search space) in advance. That doesn’t tell you the algorithmic way to find those regularities in the first place.
Re: “But you were implying that the uncomputability is somehow ‘not a problem’”
That’s right—uncomputability is not a problem—you just use a computable compression algorithm instead.
Re: “And that’s the point—compression algorithms haven’t done so, except to the extent that a programmer has fed them the “insights” (known regularities of the search space) in advance.”
The universe itself exhibits regularities. In particular, sequences generated by small automata are found relatively frequently. This principle is known as Occam’s razor. That fact is exploited by general purpose compressors to compress a wide range of different data types—including many never seen before by the programmers.
Re: “That’s right—uncomputability is not a problem—you just use a computable compression algorithm instead.”
You said that it was not a problem with respect to creating superintelligent beings, and I showed that it is.
Re: “The universe itself exhibits regularities.”
Yes, it does. But, again, scientists don’t find them by iterating through the set of computable generating functions, starting with the smallest. As I’ve repeatedly emphasized, that takes too long. Which is why you’re wrong to generalize compression as a practical, all-encompassing answer to the problem of intelligence.
This is growing pretty tedious, for me, and probably others :-(
You did not show uncomputability is a problem in that context.
I never claimed iterating through programs was an effective practical means of compression. So it seems as though you are attacking a straw man.
Nor do I claim that compression is “a practical, all-encompassing answer to the problem of intelligence”.
Stream compression is largely what you need if you want to predict the future, or build parsimonious models based on observations. Those are important things that many intelligent agents want to do—but they are not themselves a complete solution to the problem.
Just to show the circles I’m going in here:
Re: “You did not show uncomputability is a problem in that context.”
Right, I showed it is a problem in the context in which you originally brought up compression—as a means to solve the problem of intelligence.
Re: “I never claimed iterating through programs was an effective practical means of compression.”
Yes, you did. Right here:
“What’s with complaining that compressors are uncomputable?!? Just let your search through the space of possible programs skip on to the next one whenever you spend more than an hour executing. Then you have a computable compressor.”
You also say:
“Nor do I claim that compression is ‘a practical, all-encompassing answer to the problem of intelligence’.”
Again, yes you did. Right here. Though you said compression was only one of the abilities needed, you did claim “If we had good stream compressors we would be able to predict the future consequences of actions...”—and predicting the future is largely what people would classify as having solved the problem of intelligence.
I disagree with all three of your points. However, because the discussion has been going on for so long—and because it is so tedious and low grade for me—I am not going to publicly argue the toss with you any more. Best wishes...
Okay, onlookers: please decide which of us (or both, or neither) was engaging the arguments of the other, and comment or vote accordingly.
ETA: Other than timtyler, I mean.