I think if you weren’t carefully reading OpenAI’s documentation it was pretty easy to believe that text-davinci-002 was InstructGPT (and hence trained with RLHF).
Not only was it easy, many people did believe it (including myself). In fact, can you point to a single case of people NOT making this reading mistake? That is, after the January 2022 instruction-following announcement but before the October 2022 model index for researchers. Jan Leike’s tweet you linked to postdates October 2022 and does not count. The allegation is that OpenAI lied (or at the very least was extremely misleading) for ten months of 2022. I am more ambivalent about the period after October 2022.
OpenAI wasted a whole year between GPT-3 and GPT-4. (Source: Greg Brockman said this at an OpenAI developer event.) So yes, I think OpenAI was 12+ months ahead at one point.