Some points which I think support the plausibility of this scenario:
(1) EY’s ideas about a “simple core of intelligence”, how chimp brains don’t seem to have major architectural differences from human brains, etc.
(2) RWKV vs. Transformers. Why haven’t Transformers been straight-up replaced by RWKV at this point? It looks to me like potentially huge efficiency gains are being basically ignored because lab researchers can get away with ignoring them. Granted, it affects the efficiency of inference but not training, AFAIK, and maybe it wouldn’t work at the 100B+ scale, but it certainly looks like enough evidence to justify running the experiment. (I sketch the inference-cost asymmetry I have in mind after this list.)
(3) Why didn’t researchers jump straight to the endpoint of the trend toward smaller and smaller floating-point (or fixed-point) precision? Okay, sure, “the hardware didn’t support it” can explain some of it, but if you’re serious about maximizing efficiency, you could still do smaller-scale experiments to show that it appears to work, and then get support into the next generation of hardware (or, at some point, even custom hardware if the gains are huge enough). (See the quantization sketch after this list.)
(4) I have a few more ideas for huge efficiency gains that I don’t want to state publicly. Probably most of them wouldn’t work. But the thing about huge efficiency gains is that if they do work, doing the experiments to find that out is (relatively) cheap, precisely because of the huge efficiency gains. I’m not saying anyone should update on my claim to have such ideas, but if you understand modern ML, you can try to answer the question “what would you try if you wanted to drastically improve efficiency?” for yourself and update on the answers you come up with. And there are probably better ideas than mine out there, and almost certainly more of them. I end up mostly thinking lab researchers aren’t pursuing this kind of thing because it’s just not what they’re being paid to do, and/or it isn’t what interests them. Of course they are trying to improve efficiency, but they’re looking for smaller improvements that are more likely to pan out, not massive improvements any given one of which probably won’t work.
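To make the kind of gap I’m gesturing at in (2) concrete, here is a minimal back-of-the-envelope sketch in Python. It compares the inference-time state a Transformer has to carry (a KV cache that grows with context length) against the fixed-size recurrent state of an RWKV-style model. The layer/head/width numbers and the 5*d_model per-layer state constant are illustrative assumptions on my part, not the specs of any particular model, and this says nothing about output quality, only about memory.

```python
# Back-of-the-envelope comparison of inference-time state size:
# a Transformer keeps a KV cache that grows with context length,
# while an RWKV-style recurrent model carries a fixed-size state.
# All parameters below are illustrative assumptions, not real model specs.

def transformer_kv_cache_bytes(context_len, n_layers, n_heads, head_dim, bytes_per_val=2):
    # Keys and values: 2 tensors per layer, each of shape [context_len, n_heads, head_dim].
    return 2 * n_layers * context_len * n_heads * head_dim * bytes_per_val

def rwkv_state_bytes(n_layers, d_model, bytes_per_val=2):
    # RWKV-style models keep a small per-layer recurrent state that is independent
    # of context length; 5 * d_model is a rough stand-in, not an exact formula.
    per_layer_state = 5 * d_model
    return n_layers * per_layer_state * bytes_per_val

if __name__ == "__main__":
    n_layers, n_heads, head_dim = 32, 32, 128   # hypothetical mid-sized model
    d_model = n_heads * head_dim
    for context_len in (2_048, 32_768, 128_000):
        kv = transformer_kv_cache_bytes(context_len, n_layers, n_heads, head_dim)
        rs = rwkv_state_bytes(n_layers, d_model)
        print(f"context={context_len:>7}: KV cache ~ {kv / 2**30:6.2f} GiB, "
              f"recurrent state ~ {rs / 2**20:6.2f} MiB")
```

Again, this only illustrates the shape of the resource curve, not a claim that RWKV matches Transformer quality at scale; that is exactly the experiment I’m saying looks worth running.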
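And for (3), a toy sketch of the basic idea behind low-precision inference: store weights as 8-bit integers with a single per-tensor scale and dequantize on the fly. This is just the simplest possible post-training quantization scheme, written to show the memory arithmetic and that the round-trip error is small; real schemes (per-channel scales, lower bit widths, quantization-aware training) are more sophisticated, and the matrix sizes here are arbitrary.

```python
# Toy illustration of post-training weight quantization: float32 -> int8.
# The point is the memory arithmetic (4x smaller weights) and that the
# round-trip error in a matrix-vector product is small.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.02, size=(4096, 4096)).astype(np.float32)  # fake weight matrix

# Symmetric per-tensor quantization to int8.
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_dequant = w_int8.astype(np.float32) * scale

x = rng.normal(size=(4096,)).astype(np.float32)  # fake activation vector
y_fp32 = w @ x
y_int8 = w_dequant @ x

rel_err = np.linalg.norm(y_fp32 - y_int8) / np.linalg.norm(y_fp32)
print(f"weight memory: {w.nbytes / 2**20:.0f} MiB (fp32) vs {w_int8.nbytes / 2**20:.0f} MiB (int8)")
print(f"relative error in matvec output: {rel_err:.4f}")
```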
Anyway, I think a world in which you could run even GPT-4-quality inference (let alone training) on a current smartphone looks like a world where AI is soon going to determine the future more than humans do, if it hasn’t already happened by that point… I’m far from certain this is where compute limits (moderate ones, not crushingly tight ones that would restrict or ban a lot of already-deployed hardware) would lead, but it doesn’t seem to me like this possibility is one that people advocating for compute limits have really considered, even if only to say why they find it very unlikely. (Well, I guess if you only care about buying a moderate amount of time, compute limits would probably do that even in this scenario, since researchers can’t pivot on a dime to improving efficiency, and we’re specifically talking about higher-hanging efficiency gains here.)