Is the predicted cost for GPT-N parameter improvements based on the “classical Transformer” architecture? Recent variants like the Performer should require substantially less compute, and therefore cost less.
Yes, in general you want to account for hardware and software improvements. From the original post:
Finally, it’s important to note that algorithmic advances are real and important. GPT-3 still uses a somewhat novel and unoptimised architecture, and I’d be unsurprised if we got architectures or training methods that were one or two orders of magnitude more compute-efficient in the next 5 years.
From the summary:
$100B-$1T at current prices, $1B-$10B given estimated hardware and software improvements over the next 5-10 years
The $1B-$10B number is meant to include things like the Performer: it is the current-price estimate reduced by two orders of magnitude, reflecting the combined hardware and software improvements discussed above.
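For intuition on why something like the Performer cuts compute: standard attention materialises an L × L score matrix, so cost grows quadratically in sequence length L, whereas the Performer approximates the softmax kernel with random features so attention can be computed in time linear in L. Below is a minimal NumPy sketch of that idea; the feature map `phi` here is a simplified stand-in for the paper's actual FAVOR+ mechanism, and the names, feature count `m`, and seeds are illustrative choices, not from the original discussion.

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: materialises an L x L score matrix -> O(L^2 * d).
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, m=256, seed=0):
    # Performer-style kernel trick (simplified): map Q and K through a random
    # feature map phi so that phi(Q) @ (phi(K).T @ V) approximates softmax
    # attention in O(L * m * d) -- linear in sequence length L.
    d = Q.shape[-1]
    W = np.random.default_rng(seed).standard_normal((d, m))

    def phi(X):
        # Positive random features; a simplified stand-in for FAVOR+.
        X = X / d ** 0.25  # fold in the 1/sqrt(d) softmax temperature
        return np.exp(X @ W - (X ** 2).sum(-1, keepdims=True) / 2) / np.sqrt(m)

    Qp, Kp = phi(Q), phi(K)
    num = Qp @ (Kp.T @ V)          # never forms the L x L matrix
    den = Qp @ Kp.sum(axis=0)      # per-row softmax normaliser
    return num / den[:, None]

L, d = 1024, 64
rng = np.random.default_rng(1)
Q, K, V = rng.standard_normal((3, L, d))
exact = softmax_attention(Q, K, V)
approx = linear_attention(Q, K, V)
print(np.abs(exact - approx).mean())  # approximation error; shrinks as m grows
```

The key point is the parenthesisation: computing `phi(K).T @ V` first yields an m × d matrix, so the L × L attention matrix is never formed, which is where the order-of-magnitude compute savings come from.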