Meanwhile, Jeff Hawkins says “Every part of the neocortex is running the same algorithm”, and it’s looking like maybe brains aren’t doing that complicated a set of things.
This is nitpicking, but your post goes back and forth between the “underlying algorithm” level and the “learned model” level. Jeff Hawkins is talking about the underlying algorithm level when he says that it is (more or less) the same in every part of the neocortex. But almost all the things you mention in “My algorithm as I understand it” are habits of thought that you’ve learned over the years. (By the same token, we should distinguish between “Transformer + SGD” and “whatever calculations are being done by the particular weight settings in the trained Transformer model”.)
I don’t expect there to be much simplicity or universality at the “learned model” level … I expect that people use lots of different habits of thought.
Has anyone done anything like “Train a neural net on Reddit, where it’s somehow separately rewarded for predicting the next word, and also for predicting how much karma a cluster of words will get, and somehow propagating that back into the language generation?”
I imagine the easiest thing would be to prepend the karma to each post, fine-tune the model, and then you can generate high-karma posts by just prompting with “Karma 1000: …”. I’m not aware of anyone having done this specific thing, but I didn’t check. I vaguely recall something like that for AlphaStar, where they started by imitation learning with the player’s skill flagged, and then could adjust the flag to make their system play better or worse.
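To make that concrete, here’s a minimal sketch of the data-preparation step I have in mind (the posts and karma values are made up, and the resulting strings would be fed to whatever standard LM fine-tuning pipeline you like):

```python
def to_training_example(karma: int, post: str) -> str:
    """Prepend the post's karma as a control prefix, so a fine-tuned
    language model learns to associate the prefix with the text style."""
    return f"Karma {karma}: {post}"

# Fine-tuning corpus: each post tagged with its observed karma.
corpus = [
    to_training_example(3, "I think this is wrong because..."),
    to_training_example(250, "Epistemic status: exploratory. Here's a model..."),
]

# At generation time, condition on a high value to ask for high-karma text.
prompt = "Karma 1000: "
```

The point of the trick is that conditioning at generation time substitutes for ever giving the model an explicit reward signal.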
What’s happening in System 2 thought?
If you haven’t already, see Kaj’s Against System 1 and System 2. I agree with everything he wrote; the way I would describe it is: Our brains house a zoo of compositional generative models, and system 2 is a cool thing where generative models can self-assemble into an ad-hoc crappy serial computer. For example, you can learn a Generative Model X that first summons a different Generative Model Y, and then summons either Generative Model Z₁ or Z₂ conditional on some feature of Generative Model Y. (Something like that … I guess I should write this up better someday.) Anyway, this is a pretty neat trick. Can a trained Transformer NN do anything like that? I think there’s some vague sense in which a 6-layer Transformer can do similar things as a series of 6 serial human thoughts maybe?? I don’t know. There’s definitely a ton of differences too.
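A toy illustration of what I mean by “ad-hoc crappy serial computer” (with “generative models” reduced to plain functions, which of course throws away almost everything interesting about them):

```python
# Toy sketch: Model X summons Model Y, then summons Z1 or Z2
# conditional on a feature of Y's output.
def model_y(stimulus: str) -> dict:
    # Y produces structured output with a feature that X can branch on.
    return {"feature": "bright" if "sun" in stimulus else "dim"}

def model_z1() -> str:
    return "squint"

def model_z2() -> str:
    return "look closer"

def model_x(stimulus: str) -> str:
    # X first summons Y, then dispatches on Y's output -- a tiny
    # hand-assembled serial computation out of composable pieces.
    y_out = model_y(stimulus)
    return model_z1() if y_out["feature"] == "bright" else model_z2()
```

So `model_x("sun over the water")` routes through Y to Z₁, while a dim stimulus routes to Z₂.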
...chunking...
My vague sense about foresight (rolling out multiple steps before deciding what to do) is that it’s helpful for sample-efficiency but not required in the limit of infinite training data. Some examples: in RL, both TD learning and tree search eventually converge to the same optimal answer; AlphaGo without a tree search is good but not as good as AlphaGo with a tree search.
Perhaps not coincidentally, language models are pretty sample inefficient compared to people...
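A tiny sanity-check of the “converges in the limit” point: TD(0) on a two-state deterministic chain bootstraps its way to the true values without ever rolling out a whole trajectory before updating.

```python
# Chain: s0 -> s1 -> end, with reward 1 on the final step.
# True values with gamma = 1 are V(s0) = V(s1) = 1.
gamma, alpha = 1.0, 0.1
V = {"s0": 0.0, "s1": 0.0, "end": 0.0}
transitions = [("s0", 0.0, "s1"), ("s1", 1.0, "end")]

for _ in range(1000):  # many episodes of experience
    for s, r, s_next in transitions:
        # One-step TD update: nudge V(s) toward r + gamma * V(s').
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
```

After enough episodes both values sit at ~1.0, same as what exhaustive lookahead would report; the difference is only in how much experience it takes to get there.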
In my everyday life, I feel like my thoughts very often involve a sequence of two or three chunks, like “I will reach into my bag and then pull out my wallet”, and somewhat less often is it a longer sequence than that, but I dunno.
Maybe “AlphaStar can’t properly block or not block narrow passages using buildings” is an example where it’s held back by lack of foresight.
Thanks! Will reply to some different bits separately. First, on reddit-karma training:
I imagine the easiest thing would be to pre-pend the karma to each post, fine-tune the model, then you can generate high-karma posts by just prompting with “Karma 1000: …”.
This doesn’t accomplish what I’m going for (probably). The key thing I want is to directly reward GPT disproportionately in different circumstances. As I currently understand it, every situation for GPT is identical – a bunch of previous words, one more word to predict, graded on that one word.
GPT never accidentally touches a burning hot stove, or gets a delicious meal, or builds up a complicated web of social rewards that it aspires to succeed at. I bet toddlers learn not to touch hot stoves very quickly even without parental supervision, faster than GPT could.
I don’t want “1 karma”, “10 karma” and “100 karma” to be a few different words with different associations. I want 10 karma to be 10x the reward of 1 karma, and 100 karma 10x that. (Well, maybe not literally 10x; I’d fine-tune the reward structure with some fancy math.)
When GPT-3 sort of struggles to figure out “I’m supposed to be doing addition or multiplication here”, I want to be able to directly punish or reward it more than it usually is.
Well, sure, you could take bigger gradient-descent steps for some errors than others. I’m not aware of people doing that, but again, I haven’t checked. I don’t know how well that would work (if at all).
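A framework-agnostic sketch of the mechanism I mean (the weighting function here is my own invention, not something I’ve seen anyone use): scaling an example’s loss term scales its gradient, which is equivalent to taking a bigger gradient-descent step on that example.

```python
import math

def example_weight(karma: int) -> float:
    # Hypothetical reward shaping: compress raw karma so a 100x karma
    # difference doesn't translate into a 100x bigger gradient step.
    return 1.0 + math.log1p(max(karma, 0))

def weighted_loss(per_example_losses, karmas):
    # Multiplying each loss term by its weight scales that example's
    # gradient contribution by the same factor.
    return sum(loss * example_weight(k)
               for loss, k in zip(per_example_losses, karmas))
```

Whether differential step sizes actually get you anything like the “touching a hot stove” kind of learning is exactly the open question.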
The thing you’re talking about here sounds to me like “a means to an end” rather than “an end in itself”, right? If writing “Karma 100000: …” creates the high-karma-ish answer we wanted, does it matter that we didn’t use rewards to get there? I mean, if you want algorithmic differences between Transformers and brains, there are loads of them, I could go on and on! To me, the interesting question raised by this post is: to what extent can they do similar things, even if they’re doing it in very different ways? :-)