Huh, interesting artifact here. I wonder what mechanism update would correct the thinking that generated this mistake toward a better fundamentals model. I imagine you’ve already updated the view, but I wonder if there’s some gear in your model that still generates perspectives like this.
Nice catch; I’d totally forgotten having written this comment.
I’m not sure it was actually wrong, though. I assume you mean some of the stuff that seems to imply that ChatGPT(-4) has some level of self-awareness? That could possibly still be explained by a combination of “a Transformer architecture trained on texts which talked about Transformer architectures” and RLHF/fine-tuning having made ChatGPT specifically refer to itself.
I think the only error to notice is that the claim that GPT-2 couldn’t have self-awareness (I agree, it still can’t) was incorrectly overgeneralized into the claim that self-representation would be weak for all LM transformers. You did say “technically,” though, so maybe no update is needed.
It’s probably not important. I just happened to be archive browsing for a couple minutes for an unrelated reason and ran across it.