I agree this makes a large fractional change to some AI timelines, and has significant impacts on questions like ownership. But on very short timescales, while I can see that OpenAI halting their work would change ownership, presumably to some worse steward, I don’t see the gap being large enough to materially affect alignment research. That is, it’s better that OpenAI gets it in 2024 than that someone else gets it in 2026.
This constant seems to be very small, which is why compute had to drop all the way to ~$1k before any researchers worldwide were fanatical enough to bother trying CNNs and create AlexNet.
It’s hard to be fanatical when you don’t have results. Nowadays AI is so successful it’s hard to imagine this being a significant impediment.
Excluding GShard (which as a sparse model is not at all comparable parameter-wise)
I wouldn’t dismiss GShard altogether. The parameter counts aren’t equal, but MoE(2048E, 60L) is still a beast, and it opens up room for more scaling than a standard model.