What about ASICs? I heard someone is making them for inference and, of course, claims an efficiency gain. ASIC improvements need to be treated as part of the status quo.
Yeah, I think it could cut both ways. On the one hand, if MatMul-free models really are more efficient in the long run, you could probably build custom hardware optimized for the operations they need (e.g. lots of ternary arithmetic). On the other hand, getting there from the ASICs currently in development would require a pivot.
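(A minimal sketch of why ternary weights suit that kind of hardware, not taken from the discussion: with weights restricted to {-1, 0, +1}, a "matmul" needs no multipliers at all, only additions and subtractions, which a custom datapath can implement very cheaply. The function name and toy shapes below are just for illustration.)

```python
import numpy as np

def ternary_matvec(W_ternary: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Compute W @ x where W's entries are in {-1, 0, +1}, using only add/subtract."""
    out = np.zeros(W_ternary.shape[0], dtype=x.dtype)
    for i, row in enumerate(W_ternary):
        # +1 weights add the input, -1 weights subtract it, 0 weights skip it:
        # no multiplications anywhere in the inner loop.
        out[i] = x[row == 1].sum() - x[row == -1].sum()
    return out

# Toy check against an ordinary matmul
rng = np.random.default_rng(0)
W = rng.integers(-1, 2, size=(4, 8)).astype(np.float32)  # ternary weights
x = rng.normal(size=8).astype(np.float32)
assert np.allclose(ternary_matvec(W, x), W @ x)
```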
Maybe the race dynamics actually slow things down here? Nobody wants to pivot and fall temporarily behind: funding might dry up, or someone else might get there before the investment pays off and lets you leapfrog.
But yeah, even in the medium run, as constraints start to bite, ASICs are probably a factor in driving architecture changes.