...add a primary supergoal which imposes a restriction on the degree to which “instrumental goals” are allowed to supersede the power of other goals. At a stroke, every problem he describes in the paper disappears, with the single addition of a goal that governs the use of instrumental goals—the system cannot say “If I want to achieve goal X I could do that more efficiently if I boosted my power, so therefore I should boost my power to cosmic levels first, and then get back to goal X.”
This is not so simple. “Power” and “instrumental goals” are abstractions, not things that can actually be programmed into an AI. The AI has no concept of “power” and will do whatever leads to its goal.
Imagine, for instance, a chess-playing AI. You tell it to limit its “power”. How do you do this? Is using the queen too powerful? Is taking opposing pieces too powerful? How do you define “power” precisely, in a way that can be coded into an actual algorithm?
Of course the issues of AI risk go well beyond just figuring out how to build an AI that doesn’t want to take over the world. Even if your proposed solution could actually work, you can’t stop other people from making AIs that don’t use it, or use a bugged version of it, etc.
The rest of your essay is just a misunderstanding of what reinforcement learning is. Yes, it has its origins in old psychology research, but the field has moved on an awful lot since then.
There are many different ideas about how to implement RL algorithms, but the simplest is to use an algorithm that can predict future reward, and then take the action that leads to the highest reward.
This procedure is totally independent of the method used to predict future reward. There is absolutely nothing that says it has to be an algorithm that can only make short-term predictions. Sure, it’s a lot easier to build algorithms that only predict the short term, but that doesn’t mean it’s impossible to do otherwise. Humans certainly seem capable of predicting the long-term future.
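As a rough illustration, here is a minimal sketch of that loop in Python. Nothing here is from the essay or from any particular library; the action set and the reward predictor are hypothetical stand-ins, and the predictor could just as well be a deep network trained to make long-horizon forecasts.

```python
import random

ACTIONS = ["left", "right", "wait"]  # hypothetical action set

def predict_future_reward(state, action):
    """Stand-in predictor. In a real agent this could be a lookup table,
    a neural network, or anything else that estimates long-run reward."""
    return random.random()

def choose_action(state):
    # The whole selection rule: score each candidate action with the
    # predictor and take the one with the highest predicted reward.
    return max(ACTIONS, key=lambda a: predict_future_reward(state, a))

state = "start"
for _ in range(10):
    action = choose_action(state)
    # ...apply the action, observe the outcome, and update the predictor.
    # The loop itself never changes; only the predictor gets smarter.
```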
The RL theorist would say that you somehow did a search through all the quintillions of possible actions you could take, sitting there in front of an equation that requires L’Hôpital’s Rule, and in spite of the fact that the list of possible actions included such possibilities as jumping-on-the-table-and-singing-I-am-the-walrus, and driving-home-to-get-a-marmite-sandwich, and asking-the-librarian-to-go-for-some-cheeky-nandos, you decide instead that the thing that would give you the best dopamine hit right now would be applying L’Hôpital’s Rule to the equation.
This just demonstrates how badly you misunderstand RL. Real AIs like AlphaGo don’t need to search through the entire search space; in fact, the reason they beat other methods is that they avoid doing so. AlphaGo uses machine learning to prune away large chunks of the search space almost instantly, so it only needs to consider a few candidate paths.
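As a loose illustration of that pruning idea (this is not AlphaGo’s actual code; `policy_net` and `legal_moves` are hypothetical stand-ins), a learned policy can rank the legal moves and the search then only expands the handful it rates most promising:

```python
def top_candidates(position, policy_net, legal_moves, k=5):
    """Keep only the k moves the learned policy rates most promising;
    the search tree never expands the rest."""
    moves = legal_moves(position)
    ranked = sorted(moves, key=lambda m: policy_net(position, m), reverse=True)
    return ranked[:k]
```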
People started to point out that the extra machinery was where all the action was happening. And that extra machinery was most emphatically not designed as a kind of RL mechanism, itself. In theory, there was still a tiny bit of reinforcement learning somewhere deep down inside all the extra machinery, but eventually people just said “What’s the point?” Why even bother to use the RL language anymore? The RL, if it is there at all, is pointless. A lot of parameter values get changed in complex ways, inside all the extra machinery, so why even bother to mention the one parameter among thousands, that is supposed to be RL, when it is obvious that the structure of that extra machinery is what matters.
RL is a useful concept because it lets you get useful work out of other, more limited algorithms. A neural net, on its own, can’t do anything but supervised learning: you give it some inputs, and it predicts what the output should be. You can’t use that to play a video game. You need to build RL on top of it to do anything interesting.
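To make that concrete, here is a minimal, hypothetical sketch of a Q-learning-style loop wrapped around such a net. The `env` and `net` interfaces are invented for illustration; the point is that the net only ever does supervised-style prediction, while the RL loop around it manufactures the training targets from rewards.

```python
import random

def play_and_learn(env, net, num_actions, epsilon=0.1, gamma=0.99):
    """One episode of a Q-learning-style loop around a supervised predictor."""
    state = env.reset()
    done = False
    while not done:
        # The net itself is pure supervised prediction: state -> value per action.
        values = net.predict(state)
        if random.random() < epsilon:
            action = random.randrange(num_actions)          # explore occasionally
        else:
            action = max(range(num_actions), key=lambda a: values[a])

        next_state, reward, done = env.step(action)

        # RL supplies the "label": reward now plus discounted predicted future value.
        target = reward if done else reward + gamma * max(net.predict(next_state))
        net.train_on(state, action, target)                 # ordinary supervised update
        state = next_state
```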
You go on and on about how the tasks that AI researchers tackle are “too simple”. This is just the typical AI effect: problems in AI don’t seem as hard as they actually are, so significant progress never looks like progress from the outside.
But whatever. You are right that NNs are currently restricted to “simple” tasks that don’t require long-term prediction or planning, because long-term prediction is hard. But again, I don’t see any reason to believe it’s impossible. Such a system would certainly be much more complex than today’s feed-forward NNs, but it would still be RL: it would still just be predicting which action leads to the most reward and taking that action.
There is some recent work along these lines. Researchers are combining “new” NN methods with “old” planning algorithms and symbolic methods to get the best of both worlds.
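For instance, one common flavor of this combination (sketched below with hypothetical `value_net`, `legal_moves`, and `apply_move` helpers, not any published system’s code) is a plain depth-limited lookahead whose leaf states are scored by a learned value function instead of a hand-written evaluation:

```python
def plan(state, value_net, legal_moves, apply_move, depth=2):
    """Pick the move whose (single-agent) lookahead looks best to the value net."""
    def lookahead(s, d):
        moves = legal_moves(s)
        if d == 0 or not moves:
            return value_net(s)   # the NN replaces a hand-coded evaluation function
        return max(lookahead(apply_move(s, m), d - 1) for m in moves)

    return max(legal_moves(state), key=lambda m: lookahead(apply_move(state, m), depth - 1))
```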
You keep asserting that RL has exponential resource requirements. It doesn’t. It’s a simple loop: predict which action leads to the highest reward, and take it. There are a number of variations, but they are all similar.
The current machine learning algorithms that RL methods use might have exponential requirements. But so what? No one is claiming that future AIs will be just like today’s machine learning algorithms.
Let’s say you are right about everything. RL doesn’t scale, and future AIs will be based on something entirely different that we can’t even imagine.
So what? The same problems that affect RL apply to every AI architecture. That is the control problem: making AIs do what we want. Most goals lead an AI to seek more power in order to better maximize them, and most utility functions are not aligned with human values.
Unless you have an alternative AI method that isn’t subject to this, you aren’t adding anything. And I’m pretty sure you don’t.
The rest of your essay is just a misunderstanding of what reinforcement learning is. Yes, it has its origins in old psychology research, but the field has moved on an awful lot since then.
There are many different ideas about how to implement RL algorithms, but the simplest is to use an algorithm that can predict future reward, and then take the action that leads to the highest reward.
This procedure is totally independent of the method used to predict future reward.
I really do not like being told that I do not know what reinforcement learning is, by someone who goes on to demonstrate that they haven’t a clue and can’t be bothered to actually read the essay carefully.
You say:
I really do not like being told that I do not know what reinforcement learning is, by someone who goes on to demonstrate that they haven’t a clue and can’t be bothered to actually read the essay carefully.
Bye.