[19] I recommend trying out different numbers depending on who is in the conversation. For conversations in which everyone assigns high credence to the 10^35 version, it may be more fruitful to debate the 10^29 version, since I think 10^29 FLOP is when, according to the scaling laws and performance trends, GPT-7 surpasses human level at text prediction (and is also superhuman at the other tests we’ve tried).
For conversations where everyone has low credence in the 10^35 version, I suggest using the 10^41 version, since 10^41 FLOP is enough to recapitulate evolution without any shortcuts.