That is part of what I am struggling with when listening to explanations: I cannot tell how much of my failure to see how the explanations I have been given account for what I am seeing these models do comes down to me being stupid, uneducated, and inexperienced on the topic, and how much of it is the people explaining it to me bullshitting about their understanding. Like, I am, genuinely, uneducated on the topic. I should expect to be confused. But the type of confusion… I feel like there is something deeper and more problematic about it.
Like, it is like people confidently proposing models of consciousness, and you are like… seriously, if I had shown you a brain, and no hint of subjective experience, based on what you saw in the brain, you are telling me you would have predicted subjective experience? Because you see how it necessarily follows from what is going on here? No? Then don’t tell me you properly understand why the heck we actually got it. Like, I respect people who think we are onto something with recurrent feedback, I am one of them and have been for a long time, I do see a lot of supporting evidence, and it does tickle my intuition. But I resent it when people just go “loops! *hand gesture* It makes sense, you see?!? This explains everything!” without acknowledging all the leaps we are making in our understanding, and where we would go off the rails if we didn’t know what the result we want to explain looked like already, and how completely uncertain we become when aspects change.
Like, if we take the current understanding of LLMs as actually explaining what they do right now, does this mean we can make accurate predictions about what these models will be able to do in two years, given some specified range of changes? ’Cause if not… we aren’t explaining, we are just cobbling something together retroactively.
Fwiw, I think the people who made GPT were surprised by its capabilities. I’ve been making smaller language models professionally for five years, and I know far more about them than the average person, and I don’t really understand how ChatGPT does some of the stuff it does. Ultimately I think it has to be a fact about language being systematic rather than anything special about ChatGPT itself. I.e., the problem of fluently using language is just easier than we (like to) think, not that ChatGPT is magic.
There are scaling laws papers, but they just predict how low the loss will go. No one has a very good idea of what capabilities emerge at a given loss level, but we do know from past experience that pretty much fundamentally new stuff does emerge as loss goes down.
See here for scaling laws stuff: https://www.lesswrong.com/tag/scaling-laws
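To make that concrete, here is a minimal sketch of what those papers actually let you compute: the power-law fit for loss versus parameter count from Kaplan et al. (2020), with the paper's reported constants plugged in as illustrative values. The point is what the function returns, namely a predicted loss, and nothing about which capabilities show up at that loss.

```python
# Minimal sketch of the kind of prediction a scaling-laws paper makes, assuming the
# power-law form from Kaplan et al. (2020): L(N) = (N_c / N) ** alpha.
# The constants below are that paper's fitted values, used here for illustration only.

def predicted_loss(n_params: float,
                   n_c: float = 8.8e13,   # fitted "critical" parameter count (illustrative)
                   alpha: float = 0.076   # fitted exponent (illustrative)
                   ) -> float:
    """Predicted cross-entropy loss (nats/token) for a model with n_params parameters."""
    return (n_c / n_params) ** alpha

if __name__ == "__main__":
    # Output is a curve of loss numbers; nothing here says what a model at a given
    # loss can or cannot do.
    for n in (1e8, 1e9, 1e10, 1e11):
        print(f"{n:.0e} params -> predicted loss ~ {predicted_loss(n):.2f}")
```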
Thank you for sharing that.
If researchers who knew their shit were surprised when these capabilities emerged, and no new discoveries about the functionality were made afterwards, and no fundamental theoretical shift in understanding happened afterwards, I would indeed suspect that even those who are competent do not actually understand even now. Surprises of this caliber are an indication that new scientific theories are needed, not something that should just be explained away as having somehow been overlooked in existing theories, despite many bright minds having had a strong incentive to look.