Here he touched on this (the “Large language models” timestamp in the video description), and maybe somewhere else in the video; I can’t seem to find it. It is much better to get it directly from him, but the video is 4 hours long, so...
My attempt at a summary, with a bit of inference, so take it with a dose of salt:
There is some “core” of intelligence which he expected to be relatively hard to find by experimentation (but more components than he expected have already been found by experimentation/gradient descent, so this is partially wrong, and he is afraid it may be completely wrong).
He was thinking that without the full “core”, intelligence is non-functional; GPT-4 falsified this. It is more functional than he expected, enough to produce a mess that can be perceived as human-level, but not really. Perhaps our perception of GPT-4 as being at human level is a bias? So GPT-4 has impressive pieces, but they don’t work in unison with each other?
This is my (mis)interpretation of his words; the last parts are the ones I am least certain about. (I wonder: could it be that GPT-4 already has all the “core” components but is just stupid, barely intelligent enough to look impressive because of training?)
From 38:58 of the podcast:
So I do think that over time I have come to expect a bit more that things will hang around in a near human place and weird shit will happen as a result. And my failure review where I look back and ask — was that a predictable sort of mistake? I feel like it was to some extent maybe a case of — you’re always going to get capabilities in some order and it was much easier to visualize the endpoint where you have all the capabilities than where you have some of the capabilities. And therefore my visualizations were not dwelling enough on a space we’d predictably in retrospect have entered into later where things have some capabilities but not others and it’s weird. I do think that, in 2012, I would not have called that large language models were the way and the large language models are in some way more uncannily semi-human than what I would justly have predicted in 2012 knowing only what I knew then. But broadly speaking, yeah, I do feel like GPT-4 is already kind of hanging out for longer in a weird, near-human space than I was really visualizing. In part, that’s because it’s so incredibly hard to visualize or predict correctly in advance when it will happen, which is, in retrospect, a bias.
Thanks, this is exactly the kind of thing I was looking for.