Hoagy comments on RSPs are pauses done right

Hoagy 16 Oct 2023 19:40 UTC
Do you know why 4x was picked? I understand that doing evals properly is a pretty substantial effort, but once we get up to gigantic sizes and proto-AGIs, a 4x gap seems like it could hide a lot. If there were a model sitting in training with 3x the training compute of GPT-4, I’d be very keen to know what it could do!