On page 2 of the system card it says:

> Since it [GPT-4] *finished training in August of 2022*, we have been evaluating, adversarially testing, and iteratively improving the model and the system-level mitigations around it.
(Emphasis added.) This coincides with the “eight months” of safety research they mention. I wasn’t aware of this when I made my original post, so I’ll edit it to be fairer.
But this itself is surprising: GPT-4 “finished training” in August 2022, before ChatGPT was even released! I’m not sure what “finished training” means here. Is the released model weight-for-weight identical to the 2022 version? Did they do RLHF since then?
Yeah, but it’s not clear to me that they needed 8 months of safety research. If they had released it after 12 months, they could still have written that they’d been “evaluating, adversarially testing, and iteratively improving” it for 12 months. So it’s still not clear to me how much they delayed because they had to, versus how much (if at all) was due to the forecasters and/or acceleration considerations.
> But this itself is surprising: GPT-4 “finished training” in August 2022, before ChatGPT was even released! I’m not sure what “finished training” means here. Is the released model weight-for-weight identical to the 2022 version? Did they do RLHF since then?
I think “finished training” means the next-token-prediction pre-training, and what they did since August is the fine-tuning, the RLHF, and other stuff.
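To make the distinction concrete, here is a minimal sketch of the next-token-prediction objective that “pre-training” refers to (illustrative PyTorch-style code, not anything from OpenAI; the function name and tensor shapes are my own invention):

```python
# Pre-training objective: predict token t+1 from tokens <= t.
# Hypothetical helper for illustration only -- not OpenAI's code.
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """logits: (batch, seq_len, vocab_size); tokens: (batch, seq_len)."""
    # The logit at position t is a prediction of the token at t+1,
    # so drop the last logit and the first token before comparing.
    shifted_logits = logits[:, :-1, :]   # (batch, seq_len - 1, vocab_size)
    targets = tokens[:, 1:]              # (batch, seq_len - 1)
    return F.cross_entropy(
        shifted_logits.reshape(-1, shifted_logits.size(-1)),
        targets.reshape(-1),
    )

# Fine-tuning and RLHF reuse the same weights but optimize different
# objectives (instruction-following data, a learned reward model),
# which is why they still count as "training" in the ordinary sense.
```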
> So it’s still not clear to me how much they delayed because they had to, versus how much (if at all) was due to the forecasters and/or acceleration considerations.
Yeah, completely agree.
> I think “finished training” means the next-token-prediction pre-training, and what they did since August is the fine-tuning, the RLHF, and other stuff.
This seems most likely? But if so, I wish OpenAI had used a different phrase: fine-tuning/RLHF/other stuff is also part of training (unless I’m badly mistaken), and we have this lovely phrase “pre-training” they could have used instead.
Ah yeah, that does seem needlessly ambiguous.