This led to a general realization: the animal has a finite set of actions it can take each timestep (finite control-channel outputs). It needs to choose, from the set of all the actions it can take, one that will result in meeting its goals.
It seems that with access to things like language, a computer, and programming languages, the problems of a finite action space quickly get resolved and no longer pose an issue. Theoretically I could write a program tomorrow that makes me billions of dollars on the stock market. So the space of actions is large enough that performing well in it easily leads to vast increases in performance.
I agree that there are some small action-spaces in which being better at performing in them might not help you very much, but I don’t think humans or AIs have that problem.
Please note that the actions you will actually take are constrained to those with the highest value among the actions you know about.
While such a program probably exists (a character sequence that could be typed at a human timescale to earn a billion dollars), you don’t have the information to even consider it as a valid action. Therefore you (probably) cannot do it. You would need to become a quant, and it would take both luck and years of your life.
And as a perfect example, this isn’t even the optimal action by nature’s measure. The optimal action was probably to socially defeat your rivals back in high school and immediately start a large family, then cheat on your wife later for additional children.
If your brain were less buggy (‘smarter’ in an evolutionary sense), this and similar “high value” moves would be the only actions you could consider, and humans would still be in the dark ages.
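The constraint described above, that you can only pick from the actions you know about, can be sketched in a toy way. Everything here is hypothetical: the action names and values are made up for illustration, including a "write a trading program" action the agent cannot see.

```python
# Toy example (hypothetical action names and values): an agent optimizes only
# over the actions it can represent, so a globally better action it has never
# heard of is simply invisible to it.
all_actions = {
    "farm_harder": 10,
    "learn_a_trade": 50,
    "write_trading_program": 1_000_000_000,  # exists, but unknown to the agent
}

# The agent's knowledge restricts it to a subset of the full action space.
known_actions = {"farm_harder", "learn_a_trade"}

def best_known_action(values, known):
    """Pick the highest-value action among those the agent knows about."""
    return max(known, key=lambda a: values[a])

print(best_known_action(all_actions, known_actions))  # learn_a_trade
```

The agent is a perfect optimizer over its known set, yet still forgoes almost all of the available value.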
You would need to become a quant and it would take both luck and years of your life.
Well, sure, because I am a fleshy meat human, but it sure seems that you could build a hypothetical mind that is much better at being a quant than humans and that wouldn’t need years of its life to learn it (the same way we build AIs that are much, much better at Go than humans and don’t need years of existence to train to a level that vastly outperforms human players).
That’s the part I am saying isn’t true, or wasn’t until recently. A mind limited to a human body has finite I/O. It may simply not be possible to read enough in a human’s working lifespan to devise a reliable way to get a billion dollars. (Getting lucky in a series of risky bets is a different story: in that case you didn’t really solve the problem, you just won the lottery.)
And even if I grant that it is true now, imagine you had this kind of mind but were a peasant in Russia in 1900. What meaningful thing could you do? You might devise a marginally better way to plow the fields, but again, with limited I/O and lifespan your revised way may not be as robust and effective overall as the way the village elders show you. This is because your intelligence cannot substitute for the observations you need, and you might need decades of data to devise an optimal strategy.
So this relates to the original topic: to make a hyperintelligent AI, it needs access to data (clean data with cause and effect), and the best way to do that is to give it access to robotics and the ability to build things.
This limiting factor of physicality might end up making AIs controllable even if they are in theory exponential, the same way a negative void coefficient makes a nuclear reactor stable.
I really don’t buy the “you need to run lots of experiments to understand how the world works” hypothesis. It really seems like we could have figured out relativity, and definitely Newtonian physics, without any fancy experiments. The experiments were necessary to create broad consensus among the scientific community, but basically any video stream a few minutes long would have been sufficient to derive Newtonian physics, and probably even sufficient to derive relativity and quantum physics. Definitely if you include anything like observations of objects in the night sky. And indeed, Einstein didn’t really run any experiments, just thought experiments, plus a few physical facts about the constant nature of the speed of light that can easily be rederived from visual artifacts that occur all the time.
For some theoretical coverage of the Bayesian ideal here (which I am definitely not saying is achievable), see Eliezer’s posts on Occam’s razor and Solomonoff induction.
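The "Newtonian physics from a video stream" idea can be made concrete in a small way. This is my own toy setup, not anything from the discussion: assume the stream has already been reduced to the tracked vertical position of a dropped object, one sample per frame, and recover g by fitting a quadratic.

```python
import numpy as np

# Assumed setup: the "video" is the tracked vertical position of a dropped
# object, one sample per frame at 30 fps for 2 seconds, with tracking noise.
fps = 30
t = np.arange(0, 2, 1 / fps)
g_true = 9.81
rng = np.random.default_rng(0)
y = 0.5 * g_true * t**2 + rng.normal(0, 0.005, t.size)  # free fall + noise

# Fit y = a*t^2 + b*t + c; for free fall from rest, g = 2a.
a, b, c = np.polyfit(t, y, 2)
g_est = 2 * a
print(round(g_est, 2))  # close to 9.81
```

A real system would of course also have to solve the tracking problem, which is where most of the difficulty lives; this only shows that once positions are extracted, the law itself falls out of a short fit.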
If I had this kind of mind as a Russian peasant in 1900? I would have easily developed artificial fertilizer, which is easily producible from common household items in 1900, and become rich; then probably used my superior ability to model other people to become extremely socially influential; and then developed some pivotal technology like nanotechnology or nukes to take over the world.
I don’t see why I would be blocked on I/O in any meaningful way. Modern scientists don’t magically have more I/O than historical people, and a good fraction of our modern inventions don’t require access to particularly specialized resources. What they have is access to theoretical knowledge and other people’s observations, but that’s exactly what a superintelligent AI would be able to generate independently, and much better.
Well, for relativity you absolutely required observations that couldn’t be seen in a simple video stream. And unfortunately I think you are wrong: I think there is a very large number of incorrect physical models that would also fit the evidence in a short video, and that is what a generative network trained on it would find. (Also, there is probably a simpler model than relativity that is still just as correct; it is improbable that we have found the simplest possible model over the space of all of mathematics.)
My evidence for this is that pretty much any old machine learning model will overfit to an incorrect, non-general model unless the data set is very, very large and you are very careful with the training rules.
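That overfitting behavior is easy to reproduce in miniature. A sketch of my own (synthetic data, arbitrary model sizes): ten noisy samples of a genuinely linear law, fit with both the true model class and an over-flexible one.

```python
import numpy as np

# Ten noisy samples of a genuinely linear law y = 2x.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 10)
y = 2 * x + rng.normal(0, 0.05, x.size)

simple = np.polyfit(x, y, 1)    # the true model class
flexible = np.polyfit(x, y, 9)  # enough parameters to fit the noise exactly

# Compare predictions just outside the training range.
x_test = np.linspace(1.1, 1.5, 50)
y_test = 2 * x_test
err_simple = np.mean((np.polyval(simple, x_test) - y_test) ** 2)
err_flexible = np.mean((np.polyval(flexible, x_test) - y_test) ** 2)
print(err_simple < err_flexible)  # True: the flexible model generalizes badly
```

The flexible model matches the training points essentially perfectly and still extrapolates far worse, which is the sense in which "fitting the evidence" underdetermines the model.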
I think you could not have invented fertilizer, for the same reason. Remember, you are infinitely smart but you have no more knowledge than a Russian peasant. So you will know nothing of chemistry, and you have no knowledge of how to perform chemistry with household ingredients. Also, you have desires (to eat, to mate, to find shelter), and your motivations are the same as the Russian peasant’s; with your infinite brainpower you will merely be aware of the optimal path among the strategies you are able to consider given the knowledge that you have.
Learning chemistry does not accomplish your goals directly, you may be aware of a shorter-term mechanism for meeting them, and you do not know in advance that you will discover anything if you study chemistry.
What observations do I need that are not available in a video stream? I would indeed bet that within the next 15 years we will derive relativity-like behavior from nothing but video streams using AI models. Any picture of the night sky will include some kind of gravitational lensing behavior, which was one of the primary pieces of evidence we used to establish relativity. Before we discovered general relativity we just didn’t have a good hypothesis for why that lensing was present (and the effects were small, so we kind of ignored them).
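For scale, the size of the lensing effect in question can be checked with the standard weak-field deflection formula (the formula is textbook; the particular constants are my inputs):

```python
# Standard weak-field deflection of a light ray grazing the Sun: 4GM/(c^2 b).
GM_sun = 1.32712e20   # m^3/s^2, gravitational parameter of the Sun
c = 2.99792458e8      # m/s
R_sun = 6.957e8       # m, impact parameter for a limb-grazing ray

deflection_rad = 4 * GM_sun / (c**2 * R_sun)
deflection_arcsec = deflection_rad * (180 / 3.141592653589793) * 3600
print(round(deflection_arcsec, 2))  # ~1.75 arcseconds
```

About 1.75 arcseconds at the solar limb, and smaller for rays passing farther out, which is why the effect was historically easy to ignore.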
The space of mathematical models as simple as relativity strikes me as quite small, probably less than 10,000 bits. Encoding a Python simulation that, given infinite computing power, simulates relativistic bodies is really quite a short program, probably less than 500 lines. There aren’t that many programs of that length that fit the observations of a video stream. Indeed, I think it is very likely that no other models that are even remotely as simple fit the data in a video stream. Of course it depends on how exactly you encode things, but I could probably code you up a Python program that simulates general relativity in an afternoon, assuming infinite compute, under most definitions of objects.
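As a hedged illustration of how short such a program can be, here is a toy test-particle sketch of my own (units GM = c = 1; this is not a full general-relativity simulator). The Schwarzschild orbit equation is Newton's orbit equation plus a single extra term, and a few dozen lines reproduce the perihelion precession that term predicts.

```python
import math

# Orbit of a test particle in the Schwarzschild metric, with u = 1/r:
#   u''(phi) = GM/h^2 - u + 3*GM*u^2/c^2
# Dropping the last term gives Newton's closed ellipse; keeping it makes the
# perihelion advance by about 6*pi*GM/(c^2*p) per orbit (p = semi-latus rectum).
def precession_per_orbit(p=1000.0, e=0.2, dphi=1e-4):
    h2 = p                       # h^2 = GM*p for the Newtonian orbit, GM = 1
    acc = lambda u: 1.0 / h2 - u + 3.0 * u * u
    u, v = (1 + e) / p, 0.0      # start at perihelion: u is maximal, u' = 0
    phi = 0.0
    while True:
        # One RK4 step for the system u' = v, v' = acc(u).
        k1u, k1v = v, acc(u)
        k2u, k2v = v + 0.5 * dphi * k1v, acc(u + 0.5 * dphi * k1u)
        k3u, k3v = v + 0.5 * dphi * k2v, acc(u + 0.5 * dphi * k2u)
        k4u, k4v = v + dphi * k3v, acc(u + dphi * k3u)
        u_next = u + dphi * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v_next = v + dphi * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        phi += dphi
        if phi > math.pi and v > 0 and v_next <= 0:
            return phi - 2 * math.pi   # next perihelion: u' crosses + to -
        u, v = u_next, v_next

measured = precession_per_orbit()
predicted = 6 * math.pi / 1000
print(round(measured, 4), round(predicted, 4))
```

The measured apsidal advance matches the standard first-order prediction to well under a percent, despite the entire "theory" fitting in one short function.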
Again, what you are missing is that there are other explanations that will also fit the data. As an analogy, if someone draws from a deck of cards and presents the cards as random numbers, you will not be able to deduce what they are doing if you have no prior knowledge of cards and only a short sequence of draws. There will be many possible explanations, and some are simpler than ‘drawing from a set of 52 elements’.
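The card analogy can be made quantitative with a toy likelihood calculation of my own (ignoring suits and treating draws as values 1..52): with only a handful of draws, sampling without replacement is barely distinguishable from plain uniform sampling with replacement.

```python
import math

# Log-probability of one specific sequence of k distinct values under each
# hypothesis about the generator.
def log_lik_without_replacement(k, n=52):
    return -sum(math.log(n - i) for i in range(k))

def log_lik_with_replacement(k, n=52):
    return -k * math.log(n)

k = 5  # a short sequence of draws
ratio = math.exp(log_lik_without_replacement(k) - log_lik_with_replacement(k))
print(round(ratio, 2))  # ~1.22: the data barely favor the true deck hypothesis
```

A likelihood ratio of about 1.22 after five draws is far too weak to single out the deck model; many generators remain live hypotheses until the sequence gets much longer.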
Yeah, that’s why I used the simplicity argument. Of course there are other explanations that fit the data, but are there other explanations that are remotely as simple? I would argue no, because relativity is just already really simple, and there aren’t that many other theories at the same level of simplicity.
I see that we would need to actually do this experiment for you to be convinced, but I don’t have infinite compute. Maybe you can at least vaguely see my point: given the space of all functions in all of mathematics, are you certain nothing fits a short sequence of observed events better than relativity? What if there is a little bit of noise in the video?
I would assume other functions also match. Heck, ReLU with the right coefficients matches just about anything, so...
ReLU with the right coefficients in a standard neural net architecture is much, much more complicated than general relativity. General relativity is a few thousand bits long when written in Python. Normal neural nets almost never have less than a megabyte of parameters, and state-of-the-art models have gigabytes and terabytes worth of parameters.
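The size comparison is simple arithmetic. The particular numbers below are my own illustrative assumptions, not measurements:

```python
# Illustrative sizes: a short physics simulator as source code versus the raw
# parameters of even a small neural network.
gr_source_chars = 2000         # a generous size for a short simulator script
gr_bits = gr_source_chars * 8  # 16,000 bits

params = 1_000_000             # a *small* modern network
net_bits = params * 32         # float32 parameters: 32,000,000 bits

print(net_bits // gr_bits)  # 2000: the network's description is ~2000x longer
```

Even granting the network some compressibility, the description-length gap is orders of magnitude, which is the point of the simplicity argument.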
Of course there are other things in the space of all mathematical functions that will fit it as well. The video itself is in that space of functions, and that one will have perfect predictive accuracy.
But relativity is not a randomly drawn element from the space of all mathematical functions. The equations are exceedingly simple. “Most” mathematical functions have an infinite number of differing terms; relativity has just a few, so few indeed that translating it into a language like Python is pretty easy and won’t result in a very long program.
Indeed, one thing about modern machine learning is that it is producing models with an incredibly long description length, compared to what mathematicians and physicists are producing, and this is causing a number of problems for those models. I expect future more AGI-complete systems to produce much shorter description-length models.