Huh, interesting artifact here. I wonder what mechanism update would correct the thinking that generated this mistake toward a better fundamentals model. I imagine you’ve already updated the view, but I wonder if there’s some gear in your model that still generates perspectives like this.
Nice catch; I’d totally forgotten having written this comment.
I’m not sure it was actually wrong, though. I assume you mean some of the stuff that seems to imply that ChatGPT(-4) has some level of self-awareness? That could possibly still be explained by a combination of “a Transformer architecture trained on texts which talked about Transformer architectures” and RLHF/fine-tuning having made ChatGPT specifically refer to itself.
I think the only error to notice is that the claim that GPT-2 couldn’t have self-awareness (I agree, it still can’t) was incorrectly overgeneralized into the claim that self-representation would be weak for all LM transformers. You did say “technically,” though, so maybe no update is needed.
It’s probably not important. I just happened to be archive browsing for a couple minutes for an unrelated reason and ran across it.