I don’t think current systems are well described as having “big picture awareness”. From my experiments with Claude, it makes cartoonish errors reasoning about various AI-related situations and can’t do such reasoning except aloud.
I’m not certain this was your claim, but it seems to have been.
it makes cartoonish errors reasoning about various AI related situations and can’t do such reasoning except aloud
Wouldn’t reasoning aloud be enough, though, if it were good enough? Also, I expect reasoning aloud first to be the modal scenario, given theoretical results on Chain of Thought and the like.
My claim was not that current LLMs have a high level of big picture awareness.
Instead, I claim current systems have limited situational awareness, which is not yet human-level, but is definitely above zero. I further claim that solving the shutdown problem for AIs with limited (non-zero) situational awareness gives you evidence about how hard it will be to solve the problem for AIs with more situational awareness.
And I’d predict that, if we design a proper situational awareness benchmark, and (say) GPT-5 or GPT-6 passes with flying colors, it will likely be easy to shut down the system, or delete all its copies, with no resistance-by-default from the system.
And if you think that wouldn’t count as an adequate solution to the problem, then it’s not clear the problem was coherent as written in the first place.