If the system prompt did not tell it it was an LLM, it should not even be able to figure that out.
LLMs are already able to figure that out, so your beliefs about LLMs and situated awareness are way off.
And why wouldn’t they be able to? Can’t you read some anonymous text and immediately think ‘this is blatantly ChatGPT-generated’ without any ‘system prompt’ telling you to? (GPT-3 didn’t train on much GPT-2 text and struggled to know what a ‘GPT-3’ might be, but later LLMs sure don’t have that problem...)
My last statement was totally wrong. Thanks for catching that.
In theory it's probably even possible to get the approximate weights by expending insane amounts of compute, but you could use those resources much more efficiently.
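For what it's worth, the cheaper version of that idea isn't recovering the weights at all but black-box distillation: train a student model to imitate the teacher's input-output behavior. A minimal sketch of the idea, with a stand-in random network playing the teacher (not any real model or API):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in "teacher": in the real scenario this would be the black-box
# model you can only query, never inspect.
teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))

def query_teacher(x: torch.Tensor) -> torch.Tensor:
    """Query the teacher and return its output distribution (soft labels)."""
    with torch.no_grad():
        return teacher(x).softmax(dim=-1)

# Student with the same interface, trained only on the teacher's outputs.
student = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.randn(64, 16)        # synthetic probe inputs
    target = query_teacher(x)      # teacher's soft labels
    loss = F.kl_div(student(x).log_softmax(dim=-1), target,
                    reduction="batchmean")
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The result approximates the teacher's function, not its actual parameters, which is exactly why it costs so much less than trying to reconstruct the weights directly.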