Do you mind sharing your guesstimate of the number of parameters?
Also, do you by any chance have guesstimates of the number of parameters / compute for other systems?
I did, sorry. I guesstimated FLOP/step and then figured the parameter count is probably a bit less than 1 OOM below that. But since this is recurrent, maybe it’s even less? IDK. My guesstimate is shitty and I’d love to see someone do a better one!
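For anyone who wants to play with that rule of thumb, here is a minimal sketch. It assumes "a bit less than 1 OOM" means a gap of roughly 0.8 orders of magnitude; the function name, the 0.8 default, and the example input are all made up for illustration and are not numbers from the estimate above.

```python
def params_from_flop_per_step(flop_per_step: float, oom_gap: float = 0.8) -> float:
    """Rough parameter-count guess from a FLOP/step estimate.

    Divides FLOP/step by 10**oom_gap. The default oom_gap of 0.8 is an
    assumed reading of "a bit less than 1 OOM", not a measured value.
    """
    if flop_per_step <= 0:
        raise ValueError("flop_per_step must be positive")
    return flop_per_step / 10 ** oom_gap


if __name__ == "__main__":
    # Placeholder input only; swap in your own FLOP/step guesstimate.
    example_flop_per_step = 1e9
    for gap in (0.5, 0.8, 1.0):
        guess = params_from_flop_per_step(example_flop_per_step, gap)
        print(f"oom_gap={gap}: ~{guess:.2e} parameters")
```

If the recurrence means weights get reused within a step, the real gap could be bigger than 1 OOM, so treat `oom_gap` as the knob to argue about.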