What do you mean by Scaling Hypothesis? Do you believe extremely large transformer models trained based on autoregressive loss will have superhuman capabilities?
Mohammad Bavarian
Karma: 5
What do you mean by Scaling Hypothesis? Do you believe extremely large transformer models trained based on autoregressive loss will have superhuman capabilities?
Did you test Claude for it being less susceptible to this issue? Otherwise not sure where your comment actually comes from. Testing this, I saw similar or worse behavior by that model—albeit GPT4 also definitely has this issue
https://twitter.com/mobav0/status/1637349100772372480?s=20