I would point to AIXI as the most “intelligent” agent possible in the limit. This has a very formalized definition of “intelligent agent”. Intelligence doesn’t have to be human-like at all.
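For concreteness, that formal definition can be written roughly as follows (standard AIXI notation following Hutter, with U a universal Turing machine, ℓ(q) the length of program q, and m the horizon; the equation is reproduced here for reference, not quoted from either comment):

```latex
a_k \;:=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
\big[\, r_k + \cdots + r_m \,\big]
\sum_{q \,:\, U(q,\, a_1 \ldots a_m) \,=\, o_1 r_1 \ldots o_m r_m} 2^{-\ell(q)}
```

In words: at each step the agent picks the action that maximizes expected future reward, where the expectation runs over every program consistent with its history, weighted by the simplicity prior 2^(-ℓ(q)).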
This seems to be a silly thought experiment to me:
To quote from the article
There is an agent, and an environment, which is a computable function unknown to the agent
which is equivalent to
There is an agent and a computable function unknown to the agent
If you want to reduce the universe to a computable function where randomized exploration is enough for us to determine its shape… and exploration is free… then yeah, I can’t argue against that reductionist model.
In the reality we live in, however, none of this holds so neatly:
a) Random exploration is not cheap; indeed, it is quite expensive. Acquiring any given “interesting” insight about our shared perception (aka the world) probably costs to the tune of a few hundred million dollars, or at least a lot of wasted time and electricity if we lower the bar for “interesting”.
b) “Simpler computations are more likely a priori to describe the environment than more complex ones” … people working on encoding have wrestled with a version of this problem, and it turns out not to be a simple heuristic to implement, even when your domain is as limited as “the sum of all 50s jazz songs”. I think a heuristic that even roughly approximates simplifying an equation that describes something as complex as the world is impossible to reach… and if the simplification is done on a small part of the equation, there’s no guarantee that the equation you end up with won’t be more complex than if you had not simplified anything.
c) Whether the universe contains enough resources to define itself, or even a small portion of itself within some given margin of error, is unknown to us. It might be that all the resources in the universe are not enough to simulate the behavior of the fundamental particles in an apple (or of all humans on Earth). There are actual observations of seeming “randomness” in particle physics that would back up this view. I make this argument because I assume most proponents of AIXI expect it to gain “real” insights via some sort of simulation-building.
But maybe I’m misunderstanding AIXI… in that case, please let me know.
Math is naught but thought experiments, and yet unreasonably effective in science.
Also, AIXI has been directly approximated using Monte Carlo methods, and the resulting agent systems do show “intelligent” behavior, so the formalism basically works. I am not suggesting this is a good path to AGI; that’s not the point.
My point is that AIXI is a non-anthropomorphic definition of “intelligent agent”, in direct response to your “Defining intelligence” section where you specifically say
The problem is that we aren’t intelligent enough to define intelligent. If the definition of intelligence does exist, there is no clear path to finding out what it is.
Even worse, I would say it’s highly unlikely that the definition of intelligence exists. What I might consider intelligent is not what you would consider intelligent…
And I’m pointing out that we have a definition already, and that definition is AIXI.
If you want to reduce the universe to a computable function
That’s called “physics”! We’re using computer programs to model the universe. The map is not the territory.
Simpler computations are more likely a priori to describe the environment than more complex ones
Also known as Occam’s Razor. Solomonoff induction, which AIXI is based on, is a formalization of the principle. Since the hypothesis space is literally all possible computer programs, the set is infinite. We can’t very well assign them all equal probability of being the correct model, or our probabilities would add up to infinity instead of the expected 100%.
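To spell out the normalization point with the standard formulas (given here for reference, not quoted from either comment): Solomonoff’s prior weights each prefix-free program p for a universal machine U by 2^(-ℓ(p)), and the Kraft inequality guarantees those weights sum to at most 1:

```latex
M(x) \;=\; \sum_{p \,:\, U(p)\ \text{outputs a string beginning with}\ x} 2^{-\ell(p)},
\qquad
\sum_{p \in P} 2^{-\ell(p)} \;\le\; 1 \quad \text{for any prefix-free set } P .
```

A uniform prior over the countably infinite set of programs would have to give each one zero weight or else diverge; the length penalty is what makes “simpler programs are more likely a priori” a coherent probability assignment rather than an arbitrary preference.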
Whether the universe contains enough resources to define itself, or even a small portion of itself within some given margin of error, is unknown to us.
We humans predict small portions of the universe all the time. It’s called “planning”. More formally, it’s called “physics”. To the extent that parts of the universe are truly random, it’s irrelevant to the question of artificial intelligence, which need only concern itself with predicting what it can, and accounting for uncertainty in the rest. Humans are no better. But even quantum physics is Turing computable. We have no good reason to think that the laws of physics are not computable, but there are unknown initial conditions.
Math is naught but thought experiments, and yet unreasonably effective in science.
A reduced subset of mathematics, yes, but that reduced subset is all that survives. Numerology, for example, has been and still is useless to science, but that hasn’t stopped hundreds of thousands of people from becoming numerologists.
Furthermore, math is often used as a way to formalize scientific findings, but the fact that a mathematical formalism for a thing exists doesn’t mean that the thing exists.
Also, AIXI has been directly approximated using Monte Carlo methods, and the resulting agent systems do show “intelligent” behavior, so the formalism basically works. I am not suggesting this is a good path to AGI; that’s not the point.
This is the point where I start to think that, although we both seem to speak English, we clearly understand completely different things from some words and turns of phrase.
Why bring up AIXI if you yourself admit you are not going to defend the approach as a good path to AGI?
Why is a system with “some” intelligent behavior proof that the paradigm is useful at all?
I can use pheromones to train an ant to execute certain “intelligent” tasks, or food to train a parrot or a dog. Yet the fact that African parrots can indeed learn simple NLP tasks lends no credence to the idea that an African-parrot-based bleeding-edge NLP system is something worth pursuing.
And I’m pointing out that we have a definition already, and that definition is AIXI.
If we have a definition that is non-functional, i.e. one that can’t be used to actually get an intelligent agent, I would claim we don’t have a definition.
We have some people who imagine they have a definition but have no proof that said definition works.
Would you consider string theory to be a definition of how the universe works in spite of essentially no experimental evidence backing it up? If so, that might be the crux of our argument here (though one that I wouldn’t claim I’m able to resolve).
That’s called “physics”! We’re using computer programs to model the universe. The map is not the territory.
Yes, and physics is *very bad* at modeling complex systems; that’s why two centuries were spent pondering how to model systems consisting of three interacting bodies.
Physics is amazing at launching rockets into space, causing chain reactions and creating car engines.
But if you were to model even a weak representation of a simple gene (i.e. one that doesn’t stand up to reality, one that couldn’t be used to model an entire bacterium’s DNA plus its associated structural elements), it would take you a few days and some supercomputers to get a few nanoseconds of that simulation: https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.25840
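As a rough back-of-envelope (the throughput figure below is an assumed, illustrative round number, not taken from the linked paper), even a generous rate of 100 simulated nanoseconds per day of wall-clock time puts everyday timescales hopelessly out of reach:

```python
# Back-of-envelope only: wall-clock cost of all-atom simulation at an assumed throughput.
# The 100 ns/day figure is an illustrative assumption, not a number from the cited paper.

ASSUMED_NS_PER_DAY = 100.0  # simulated nanoseconds per day of wall-clock time (assumption)

def wall_clock_days(simulated_seconds: float) -> float:
    """Days of wall-clock time needed to simulate the given span of physical time."""
    return (simulated_seconds * 1e9) / ASSUMED_NS_PER_DAY

for label, seconds in [("1 microsecond", 1e-6), ("1 second", 1.0), ("1 hour", 3600.0)]:
    days = wall_clock_days(seconds)
    print(f"{label:>13}: {days:,.0f} days (~{days / 365:,.0f} years)")

# At 100 ns/day, one simulated second already costs on the order of 27,000 years.
```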
That is not to take anything away from how amazing it is that we can run such simulations at all, but if you are assuming that an efficient yet computationally cheap model of the universe exists, or even that it *can* exist, you are ignoring all the evidence so far, which points towards the following:
a) It doesn’t exist
b) There’s no intuition or theory suggesting that one could be built, and indeed there’s some evidence to the contrary (again, see the inherent randomness and co-dependence of various particles, which can propagate into large differences in macroscopic systems if even a single particle is modeled incorrectly).
Also known as Occam’s Razor. Solomonoff induction, which AIXI is based on, is a formalization of the principle. Since the hypothesis space is literally all possible computer programs, the set is infinite. We can’t very well assign them all equal probability of being the correct model, or our probabilities would add up to infinity instead of the expected 100%.
No idea what you’re on about here
We humans predict small portions of the universe all the time. It’s called “planning”. More formally, it’s called “physics”. To the extent that parts of the universe are truly random, it’s irrelevant to the question of artificial intelligence, which need only concern itself with predicting what it can, and accounting for uncertainty in the rest. Humans are no better. But even quantum physics is Turing computable. We have no good reason to think that the laws of physics are not computable, but there are unknown initial conditions.
Humans are better, since humans have sensory experience and they can interact with the environment.
A hypothetical system that *learns* the universe can’t interact with the environment; that’s the fundamental difference, and I spend a whole chapter trying to explain it.
If a human can’t model “X”, that’s fine, because a human can design an experiment to see how “X” happens. That’s how we usually do science; indeed, our models are usually just guidelines for how to run experiments.
All but madmen would use theory alone to build or understand a complex system. On the other hand, we have managed to “understand” complex systems many times with no theoretical backing whatsoever (e.g. being able to cure bacterial infections via antibiotics before being able to even see individual bacteria under a microscope, not to mention having some sort of coherent model of what bacteria are and how one might model them, which is still an issue today).
If our hypothetical AGI is bound by the same modeling limitations we are, then it has to run experiments in the real world, and once again we run up against the cost problem. That is, the experiments needed to better understand the universe might not be too hard to think of; they might just take too long, be physically impossible with our current technology, or be too expensive… and then even an SF-level AGI becomes a slightly better scientist, rather than a force for unlocking the mysteries of the universe.
This is the point where I start to think that, although we both seem to speak English, we clearly understand completely different things from some words and turns of phrase.
Why bring up AIXI if you yourself admit you are not going to defend the approach as a good path to AGI?
Implementing true AIXI is not physically possible. It’s uncomputable. I did not say that “AIXI” is not a good path to AGI. I said that a Monte Carlo approximation of AIXI is not a good path.
And it’s not that it can’t work; clearly it does (and indeed any intelligent agent, humans included, is going to be some kind of approximation of AIXI), but Monte Carlo AIXI has certain problems that make the approach not good:
It’s not efficient; other approximations besides Monte Carlo AIXI are probably easier to get good performance out of, the human brain being one such example.
Direct attempts at approximation will run into the wireheading problem. Any useful agent with physical influence over its own brain can run into this issue unless it is specifically dealt with. (AIXI’s brain is not even in the same universe it acts on.)
The Sorcerer’s Apprentice problem: any useful reward function we can come up with seems to result in an agent that is very dangerous to humans. The AIXI paper takes this function as a given without explaining how to do it safely. This is Bostrom’s orthogonality thesis: a general optimizer can be set to optimize pretty much anything. It’s not going to have anything like a conscience unless we program that in, and we don’t know how to do that yet. If we figure out AGI before we figure out Friendly AGI, we’re dead. Or worse.
From the Arbital article:
AIXI is the perfect rolling sphere of advanced agent theory—it’s not realistic, but you can’t understand more complicated scenarios if you can’t envision the rolling sphere.
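To give a very rough picture of what “approximating AIXI with Monte Carlo methods” even means, here is a toy sketch of the general idea only: keep a mixture of candidate environment models weighted by simplicity, and score each action by random rollouts under models sampled from that mixture. It is not MC-AIXI-CTW, and every name and number in it is made up for illustration:

```python
import math
import random

# Toy illustration of the idea behind Monte Carlo approximations of AIXI:
# a simplicity-weighted mixture of candidate environment models, plus random
# rollouts to estimate the expected reward of each action. Real systems
# (e.g. MC-AIXI-CTW) also reweight models by how well they predicted the past
# and plan over multi-step futures; this sketch deliberately omits all of that.

class CandidateModel:
    """A hypothetical environment model with a crude 'description length'."""

    def __init__(self, reward_prob_if_act: float, description_length: int):
        self.reward_prob_if_act = reward_prob_if_act  # P(reward=1 | action=1)
        # Simplicity prior: weight 2^(-description length), as in Solomonoff induction.
        self.weight = 2.0 ** (-description_length)

    def simulate_reward(self, action: int, rng: random.Random) -> float:
        # Stand-in for "run the candidate program on the action history".
        p = self.reward_prob_if_act if action == 1 else 1.0 - self.reward_prob_if_act
        return 1.0 if rng.random() < p else 0.0


def choose_action(models, actions=(0, 1), rollouts=500, seed=0):
    """Pick the action whose rollout-estimated expected reward is highest."""
    rng = random.Random(seed)
    weights = [m.weight for m in models]
    best_action, best_value = actions[0], -math.inf
    for action in actions:
        sampled = rng.choices(models, weights=weights, k=rollouts)
        value = sum(m.simulate_reward(action, rng) for m in sampled) / rollouts
        if value > best_value:
            best_action, best_value = action, value
    return best_action


models = [
    CandidateModel(reward_prob_if_act=0.9, description_length=3),  # simple, optimistic model
    CandidateModel(reward_prob_if_act=0.2, description_length=7),  # more complex, pessimistic model
]
print("chosen action:", choose_action(models))
```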
Implementing true AIXI is not physically possible. It’s uncomputable. I did not say that “AIXI” is not a good path to AGI. I said that a Monte Carlo approximation of AIXI is not a good path.
So, you are saying “it’s not a good path”,
*but* then you claim:
And it’s not that it can’t work; clearly it does (and indeed any intelligent agent, humans included, is going to be some kind of approximation of AIXI), but Monte Carlo AIXI has certain problems that make the approach not good:
My argument here is that it doesn’t, it’s an empty formalism with no practical application.
You claim that this approach can work, but again, I’m saying you don’t understand the practical problems of the universe we live in: There are literal physical limitations to computational resources. This approach is likely too stupid to be able to simulate anything relevant even if all the atoms in the universe were arranged to create an optimal computer to run it. Or, at least, you have no leg to stand on claiming otherwise, considering the current performance of an optimized yet inferior implementation.
So essentially:
You are claiming that an impractical model is proof that a practical model could exist, because the impractical model would work in a fictional reality and thus a practical one should work in ours.
It makes no sense to me.
My argument here is that it doesn’t, it’s an empty formalism with no practical application. … There are literal physical limitations to computational resources.
Where do I even start with this? That argument proves too much. You could apply the same argument to engineering in general. “Well, it would take infinite computing power to sum up an integral, so I guess we can’t ever use numerical approximations.” Please read through An Intuitive Explanation of Solomonoff Induction. In particular, I will highlight:
But we can find shortcuts. Suppose you know that the exact recipe for baking a cake asks you to count out one molecule of H2O at a time until you have exactly 0.5 cups of water. If you did that, you might not finish the cake before the heat death of the universe. But you could approximate that part of the recipe by measuring out something very close to 0.5 cups of water, and you’d probably still end up with a pretty good cake.
Similarly, once we know the exact recipe for finding truth, we can try to approximate it in a way that allows us to finish all the steps sometime before the sun burns out.
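The integral analogy above can be made concrete with a toy example (the function and slice counts are arbitrary, chosen only for illustration): the “exact recipe” sums infinitely many infinitesimal slices, yet a finite midpoint rule gets as close as you like at finite cost.

```python
import math

# Midpoint-rule approximation of an integral whose exact value is known:
# the integral of sin(x) over [0, pi] is exactly 2.
def midpoint_integral(f, a, b, slices):
    width = (b - a) / slices
    return width * sum(f(a + (i + 0.5) * width) for i in range(slices))

for slices in (10, 100, 1000):
    print(f"{slices:>5} slices: {midpoint_integral(math.sin, 0.0, math.pi, slices):.6f}  (exact: 2)")
```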
Now you are literally using Zeno to steal cattle.
The problem with your very wide perspective seems to be that you are basically taking Pascal’s wager.
But for now I honestly don’t have time to continue this argument, especially if you’re gonna take the read-this-book-peasant-style approach to filling understanding/interpretation gaps.
Though this discussion has given me an idea for a “Why genetically engineered parrots will bring about the singularity” piece as a counter-argument to this kind of logic.
Other than that, congrats, you “win”. I’m afraid, however, that I understand your position, and why you hold it, no better than when we began. Nor do I understand what would change your position or what its principal pillars are… :/
You’re fighting a strawman, George. You clearly do not understand our real arguments. Attempts to point this out have only been met with your hostility. I do not have the patience to tutor one so unwilling to study.
If you have any desire to cross the inferential gap, I will refer you to the LessWrong FAQ:
If your post involves topics that were already covered in the sequences you should build on them, not repeat what has already been said. If your post makes mistakes that were warned against in the sequences, you’ll likely be downvoted and directed to the sequence in question.
That is exactly what is happening here. The symptoms of this dialogue are diagnostic of an inferential gap. Your case is not the first. Read the Sequences, George. Especially the parts we’ve linked you to.
On the other hand, we’re well aware that it can take a long time to read through several years’ worth of blog posts, so we’ve labeled the most important as “core sequences”. Looking through the core sequences should be enough preparation for most of the discussions that take place here. We do recommend that you eventually read them all, but you can take your time getting through them as you participate. Before discussing a specific topic, consider looking to see if there is any obvious sequence on that topic.
Again, I think you are veering into religious thinking here; just because something has Eliezer Yudkowsky’s name on it doesn’t mean that it’s true.
Personally, I know the essay and I happen to fundamentally disagree with it. I’m a pragmatic Bayesian at the best of times and a radical empiricist at my worst, so the kind of view Eliezer espouses has very little sway on me.
But despite the condescending voice you give this reply, if I am to make a very coarse assumption, I can probably summarize our difference here as either me putting too much weight behind error accumulation in my model of the world or you not taking into account how error accumulation works (not saying one perspective or the other is correct; again, this is where I think we differ, and I assume our difference is quite fundamental in nature).
Given that your arguments seem to be mainly based on simple formal models working in what I see as an “ideal” universe, from which you then draw your chain of inferences leading to powerful AGI, I assume you might have a background in mathematics and/or philosophy.
I do think that my article is actually rather bad at addressing AGI from this angle.
I’m honestly unsure if the issue could even be addressed from this perspective, but I do think it might be worth a broader piece addressing why this perspective is flawed (i.e. an argument for why a perspective/model/world-view based on long inferential distances is inherently flawed).
So, I honestly think this conversation might not have been pointless after all, at least not from my side, because it gives me an idea for an essay and a reason to write it.
Granted, I assume you have still gained nothing in terms of understanding my perspective, because quite frankly I did a bad job of addressing it in a way that you would understand; I was not addressing the correct problem. So for that I am sorry.
Then again, I might be making too many assumptions about your perspective and background here, stacking imperfect inference upon imperfect inference and creating a caricature that does not match reality in any meaningful way.