Curious if you ever found what you were looking for.
I didn’t. I’m sure words toward articulating this have been spoken many times; the trick is figuring out what forum or form it needs to exist in, specifically, for it to be comprehensible and lasting. Maybe I’m wrong that it needs to be highly public. As with nukes, not many people are actually familiar with what counts as sufficient fissile material; governments (try to) maintain that barrier themselves. But at this stage, while it still seems a fuzzy concept, any input seems valid.
Consider the following combination of properties:
(software, if that’s the right word?) capable of self-replication / sustainability / improvement
capable of eluding human control
capable of doing harm
In isolation none of these is sufficient, but taken together I think we could all agree we have a problem. So we could begin to categorize and rank various assemblages of AI by these criteria, and not by how “smart” they are.
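For concreteness, here is a minimal sketch (in Python) of what ranking systems by these criteria, rather than by intelligence, might look like. Everything here is hypothetical: the system names, the scores, and the multiplicative combination are all just one way to capture the idea that no single property is sufficient on its own.

```python
from dataclasses import dataclass

# A hypothetical risk profile scored on the three axes above (0 to 1).
@dataclass
class RiskProfile:
    name: str
    self_replication: float  # can it replicate / sustain / improve itself?
    eludes_control: float    # can it evade human control?
    harm_capacity: float     # how much harm can it do?

    def combined_risk(self) -> float:
        # Multiplying the axes encodes "none is sufficient in isolation":
        # a zero on any one axis zeroes the overall score.
        return self.self_replication * self.eludes_control * self.harm_capacity

# Invented examples, purely to illustrate the shape of such a ranking.
systems = [
    RiskProfile("computer worm",  0.9, 0.6, 0.4),
    RiskProfile("trading bot",    0.1, 0.3, 0.5),
    RiskProfile("chat assistant", 0.0, 0.2, 0.3),
]

for s in sorted(systems, key=lambda s: s.combined_risk(), reverse=True):
    print(f"{s.name:15s} combined risk = {s.combined_risk():.2f}")
```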
I know I am super late to the party, but this seems like something along the lines of what you’re looking for: https://www.alignmentforum.org/posts/qYzqDtoQaZ3eDDyxa/distinguishing-ai-takeover-scenarios
Yeah, that’s cool to see. It’s a very similar attempt at categorization. I feel we often get caught up in the potential / theoretical capabilities of systems, but there are already plenty of systems that exhibit self-replicating, harmful, intelligent behavior. It’s entirely a question of degree. That’s why I think a visual ranking of all systems on these metrics is in order (a rough sketch of what that might look like follows below).
Defining what constitutes a ‘system’ would be the other big challenge. Is a hostile government a system? It’s fairly intelligent and self-replicating, etc.
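As a rough illustration of the “visual ranking” idea mentioned above (again with invented names and numbers), even a plain-text bar chart per metric would make the comparison concrete:

```python
# Hypothetical scores per system and metric, purely illustrative.
systems = {
    "computer worm":  {"self-replication": 0.9, "eludes control": 0.6, "harm": 0.4},
    "hostile botnet": {"self-replication": 0.7, "eludes control": 0.8, "harm": 0.7},
    "chat assistant": {"self-replication": 0.0, "eludes control": 0.2, "harm": 0.3},
}

for name, scores in systems.items():
    print(name)
    for metric, value in scores.items():
        bar = "#" * round(value * 20)  # scale each 0-1 score to a 20-char bar
        print(f"  {metric:18s} |{bar:<20}| {value:.1f}")
```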