Huh, interesting. Maybe the OpenAI statements about their models being “more aligned” came earlier than that, in the context of InstructGPT? I definitely remember some Twitter threads and LW comment threads about it in the context of OpenAI announcements, and nothing in the context of Anthropic announcements.
This is likely not the first instance, but OpenAI was already using the word “aligned” in this way in 2021 in the Codex paper.
https://arxiv.org/abs/2107.03374 (section 7.2)
Ah, you’re correct — it’s from the original InstructGPT release in Jan 2022:
https://openai.com/index/instruction-following/