Thomas Kwa comments on First and Last Questions for GPT-5*

Thomas Kwa 25 Nov 2023 2:03 UTC
11 points
It seems way more fruitful to do science on it. Check whether current interpretability methods still work, look for evidence of internal planning and deception, start running sandwiching experiments, try to remove capabilities from it, etc.