Okay, but (e.g.) deep RL methods can solve problems that apparently require quite complex causal thinking such as playing DotA. I think what is happening here is that while there is no explicit causal modelling happening at the lowest level of the algorithm, the learned model ends up building something that serves the functions of one because that is the simplest way to solve a general class of problems. See the above meta-RL paper for good examples of this. There seems to be no obvious obstruction to scaling this sort of thing up to human-level causal modelling. Can you point to a particular task needing causal inference that you think these methods cannot solve?
Sure, and RL is not a regression problem. The reason RL methods can do causality is that they can perform an essentially infinite number of experiments in toy worlds. DL can help RL scale up to more complex toy worlds, and to some worlds that are not so toy anymore. But there, it’s not DL on its own; it’s DL+RL.
DL is very useful, indeed! In fact, one could use DL as a “subroutine” for causal analysis of the sort Pearl worries about, and people do this now.
Point being it’s no longer “DL”, it’s “DL-as-a-way-to-do-regression + other-methods-that-use-regressions-as-a-subroutine.”
“Can you point to a particular task needing causal inference that you think these methods cannot solve?”
To answer this: anything that’s not a regression problem. At best, you can use DL as a subroutine in some larger algorithm that needs its own insights, unrelated to DL, to work. So why would DL get all the credit for solving the problem?
---
Dota’s far from solved, so far.
I agree that you do need some sort of causal structure around the function-fitting deep net. The question is how complex this structure needs to be before we can get to HLAI. It seems plausible to me (at least a 10% chance, say) that it could be quite simple, maybe just consisting of modestly more sophisticated versions of the RL algorithms we have so far, combined with really big deep networks.
I disagree. Why would it be simple? Even people who try to get self-driving cars to work (e.g. at Uber) are now using an “engineering stack” approach, rather than formal RL+DL.
Well, the DotA bot pretty much just used PPO. AlphaZero used MCTS + RL. OpenAI recently got a robot hand to do object manipulation with PPO and a simulator (the simulator was hand-built, but in principle it could be produced by unsupervised learning like in this). Clearly it’s possible to get sophisticated behaviors out of pretty simple RL algorithms. It could be the case that these approaches will “run out of steam” before getting to HLAI, but it’s hard to tell at the moment, because our algorithms aren’t running with the same amount of compute + data as humans (for humans, I am thinking of our entire lifetime experiences as data, which is used to build a cross-domain optimizer).
re: Uber, I agree that at least in the short term most applications in the real world will feature a fair amount of engineering by hand. But the need for this could decrease as more power becomes available, as has been the case in supervised learning.
It’s easy to tell—they will run out of steam. Want to bet money on a concrete claim? I love money.
Have something in mind?
Well, I am fairly sure DL+RL will not lead to HLAI, on any reasonable timescale that would matter to us. You are not sure. Seems to me, we could turn this into a bet. Any sort of bet where you say DL+RL → HLAI after X years, I will probably take the negation of, gladly.
Hmmm...but if I win the bet then the world may be destroyed, or our environment could change so much the money will become worthless. Would you take 20:1 odds that there won’t be DL+RL-based HLAI in 25 years?
If you think money will be worth a lot now but not much in the future, Ilya could pay you money now in exchange for you paying him a lot of money in the future.
I often hear this response: “I can’t make bets on my beliefs about the Eschaton, because they are about the Eschaton.”
My response to this response is: you have left the path of empiricism if you can’t translate your insight into [topic] (in this case “AI progress”) into taking money via {bets with empirically verifiable outcomes} from folks without your insight.
---
If you are worried the world will change too much in 25 years, can you formulate a nearer-term bet you would be happy with? For example, something non-toy DL+RL would do in 5 years.
“I can’t make bets on my beliefs about the Eschaton, because they are about the Eschaton.” -- Well, it makes sense. Besides, I did offer you a bet taking into account a) that the money may be worth less in my branch, and b) that I don’t think DL + RL AGI is more likely than not, just plausible. If you’re more than 96% certain there will be no such AI, 20:1 odds are a good deal.
But anyways, I would be fine with betting on a nearer-term challenge. How about this: in 5 years, a bipedal robot that can run on rough terrain, as in this video, using a policy learned from scratch by DL + RL (possibly including a simulated environment during training), at 1:1 odds?
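For anyone checking the odds arithmetic in the two offers above, here is a minimal sketch. It assumes a plain fixed-stake bet with no discounting for the money being worth less later, and it reads "20:1" as the skeptic risking 20 units to win 1. The function names are made up for illustration.

```python
def break_even_probability(stake: float, payout: float) -> float:
    """Probability of winning at which a bet has zero expected value.

    `stake` is what you lose if you are wrong, `payout` is what you
    gain if you are right (hypothetical names, not from the thread).
    """
    return stake / (stake + payout)


def expected_value(p_win: float, stake: float, payout: float) -> float:
    """Expected gain of the bet for someone who wins with probability p_win."""
    return p_win * payout - (1.0 - p_win) * stake


# Skeptic's side of the 20:1 offer: risk 20 to win 1.
long_term = break_even_probability(stake=20, payout=1)

# Either side of the 1:1 bipedal-robot offer: risk 1 to win 1.
near_term = break_even_probability(stake=1, payout=1)

print(f"20:1 break-even confidence: {long_term:.4f}")
print(f"1:1 break-even confidence:  {near_term:.4f}")
print(f"EV at 96% confidence, 20:1: {expected_value(0.96, 20, 1):+.2f}")
```

Under these assumptions the 20:1 bet breaks even at 20/21 confidence, a little under the 96% quoted, so at 96% the skeptic's expected value is positive, while the 1:1 bet only needs better-than-even confidence.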
No, that wouldn’t surprise me in 5 years. Nor would that count as “scary progress” to me. That’s bipedalism, not strides towards general intelligence.
---
“Well, it makes sense.”
That makes your beliefs a religion, my friend.
OpenAI Five is very close to being superhuman at Dota. Would you be surprised if it got there in the next few months, without any major changes?
No.