This is NOT what the evidence supports, and super misleadingly phrased. (Either that, or it’s straightup magical thinking, which is worse)
The inductive biases / simplicity biases of deep learning are poorly understood but they almost certainly don’t have anything to do with what humans want, per se.
Seems like a misunderstanding. It seems to me that you are alleging that Nora/Quintin believe there is a causal arrow from “Humans want X generalization” to “NNs have X generalization”? If so, I think that’s an uncharitable reading of the quoted text.
I said “Either that, or it’s straightup magical thinking” which was referring to the causal arrow hypothesis. I agree it’s unlikely that they would endorse the causal arrow / magical thinking hypothesis, especially once it’s spelled out like that.
What do you think they meant by “Deep learning is strongly biased toward networks that generalize the way humans want; otherwise, it wouldn’t be economically useful”?
I think they meant that there is an evidential update from “it’s economically useful” upwards on “this way of doing things tends to produce human-desired generalization in general and not just in the specific tasks examined so far.”
Perhaps it’s easier to see the same style of reasoning via: “The routes I take home from work are strongly biased towards being short, otherwise I wouldn’t have taken them home from work.”
Thanks. The routes-home example checks out IMO. Here’s another one that also seems to check out, which perhaps illustrates why I feel like the original claim is misleading/unhelpful/etc.: “The laws of ballistics strongly bias aerial projectiles towards landing on targets humans wanted to hit; otherwise, ranged weaponry wouldn’t be militarily useful.”
There’s a non-misleading version of this which I’d recommend saying instead, which is something like: “Look, we understand the laws of physics well enough, and have played around with projectiles enough in practice, that we can predict reasonably well where they’ll land in a variety of situations, and design and aim weapons accordingly; if this weren’t true, then ranged weaponry wouldn’t be militarily useful.”
And I would endorse the corresponding claim for deep learning: “We understand how deep learning networks generalize well enough, and have played around with them enough in practice, that we can predict reasonably well how they’ll behave in a variety of situations, and design training environments accordingly; if this weren’t true, then deep learning wouldn’t be economically useful.”
(To which I’d reply: “Yep, and my current understanding of how they’ll behave in certain future scenarios is that they’ll power-seek, for reasons which others have explained… I have some ideas for other, different training environments that probably wouldn’t result in undesired behavior, but all of this is still pretty up in the air, tbh. I don’t think anyone really understands what they are doing here nearly as well as, e.g., cannoneers in 1850 understood what they were doing.”)
To put it in terms of the analogy you chose: I agree (in a sense) that the routes you take home from work are strongly biased towards being short, otherwise you wouldn’t have taken them home from work. But suppose you tell me that today you are going to try out a new route, you describe it to me, and it seems to me that it’s probably going to be super long. If I object that it seems like it’ll be super long for reasons XYZ, it’s not a valid reply for you to say “don’t worry, the routes I take home from work are strongly biased towards being short, otherwise I wouldn’t take them.” At least, it seems like a pretty confusing and maybe misleading thing to say. I would accept “Trust me on this, I know what I’m doing, I’ve got lots of experience finding short routes,” I guess, though only for half credit, since it still wouldn’t be an object-level reply to the reasons XYZ. In the absence of such a substantive reply, I’d start to doubt your expertise and/or doubt that you were applying it correctly here (especially if I had an error theory for why you might be motivated to think that this route would be short even if it wasn’t).