I mostly think “algorithms that involve both SSL and RL” is a much broader space of possible algorithms than you seem to think it is, and thus that there are parts of this broad space that require “fundamental breakthroughs” to access. For example, both AlexNet and differentiable rendering can be used to analyze images via supervised learning with gradient descent. But those two algorithms are very very different from each other! So there’s more to an algorithm than its update rule.
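To make the "same update rule, very different algorithm" point concrete, here is a toy sketch of my own (not from the parent comment, and assuming PyTorch): an AlexNet-style CNN and a trivial differentiable-rendering-style inverse-graphics setup, both trained with literally the same SGD step, yet the two forward computations have almost nothing in common.

```python
# Toy illustration (my own sketch): the same gradient-descent update rule
# can drive two very different image-analysis algorithms.
import torch

# (a) AlexNet-style: a learned feature hierarchy maps pixels -> class logits.
cnn = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, 3, padding=1), torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(),
    torch.nn.Linear(16, 10),
)

# (b) Differentiable-rendering-style: a hand-written "renderer" whose scene
# parameters (here just a flat background colour) are what gets optimized.
scene_params = torch.zeros(3, requires_grad=True)
def render(params):
    # trivially "renders" a 32x32 image of a single colour
    return params.view(3, 1, 1).expand(3, 32, 32)

target_image = torch.rand(3, 32, 32)
target_label = torch.tensor([7])

# Identical update rule in both cases: one SGD step on a scalar loss.
for params, loss_fn in [
    (cnn.parameters(), lambda: torch.nn.functional.cross_entropy(
        cnn(target_image.unsqueeze(0)), target_label)),
    ([scene_params], lambda: ((render(scene_params) - target_image) ** 2).mean()),
]:
    opt = torch.optim.SGD(params, lr=0.01)
    opt.zero_grad()
    loss_fn().backward()
    opt.step()
```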
See also 2nd section of this comment, although I was emphasizing alignment-relevant differences there whereas you’re talking about capabilities. Other things include the fact that if I ask you to solve a hard math problem, your brain will be different (different weights, not just different activations / context) when you’re halfway through compared to when you started working on it (a.k.a. online learning, see also here), and the fact that brain neural networks are not really “deep” in the DL sense. Among other things.
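A minimal sketch of the "different weights, not just different activations" contrast, in my own framing (assuming PyTorch, not anything from the comment): in the context-only regime the model's weights are frozen mid-task, whereas in the online-learning regime every intermediate step also updates the weights, so the model halfway through the problem is literally a different model.

```python
# Context-only vs. online learning, as a toy contrast.
import torch

model = torch.nn.Linear(8, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
weights_before = model.weight.detach().clone()

# Context-only: halfway through the task, only the activations/input differ.
context = torch.randn(8)
_ = model(context)  # forward pass; weights untouched

# Online learning: each intermediate step also nudges the weights.
for step_input, step_target in [(torch.randn(8), torch.randn(1)) for _ in range(5)]:
    loss = (model(step_input) - step_target).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

# The "brain" is now different from when it started working on the problem.
assert not torch.equal(weights_before, model.weight.detach())
```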
Makes sense. I think we’re using the term with different scopes. By “DL paradigm” I meant to encompass the kind of stuff you mentioned (RL-directing-SS-target (active learning), online learning, different architectures, etc.), because those really seemed like “engineering challenges” to me, despite covering a broad space of algorithms: capabilities researchers already seem to be working on and scaling them without facing any apparent blockers to further progress, i.e. without needing “fundamental breakthroughs”, by which I was pointing more at paradigm shifts away from DL entirely, like, idk, symbolic learning.