I didn’t. I’m sure words toward articulating this have been spoken many times; the trick is in what forum, and in what form, it needs to exist to be comprehensible and lasting. Maybe I’m wrong that it needs to be highly public. As with nukes, not many people are actually familiar with what counts as sufficient fissile material; governments (try to) maintain that barrier themselves. But at this stage it still seems a fuzzy concept, so any input seems valid.
Consider the following combination of properties:
software (if that’s the right word?) capable of self-replication / self-sustainability / self-improvement
capable of eluding human control
capable of doing harm
In isolation, none of these is sufficient, but taken together I think we could all agree we have a problem. So we could begin to categorize and rank various assemblages of AI by these criteria, rather than by how “smart” they are.
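As a toy sketch of what that ranking could look like (in Python; all system names and scores here are purely hypothetical, and multiplying the three scores is just one possible way to capture the idea that the danger lies in the conjunction):

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    name: str
    self_replication: float  # 0.0 (none) to 1.0 (fully autonomous)
    elusiveness: float       # ability to evade human control
    harm_capacity: float     # capacity to do harm

    def risk(self) -> float:
        # Product rather than sum: the claim is that the problem comes
        # from the *conjunction* of all three properties, so a zero on
        # any one axis should zero out the overall score.
        return self.self_replication * self.elusiveness * self.harm_capacity

# Hypothetical entries, purely for illustration.
systems = [
    SystemProfile("computer worm", 0.9, 0.6, 0.4),
    SystemProfile("recommendation engine", 0.1, 0.3, 0.5),
    SystemProfile("autonomous trading agent", 0.2, 0.4, 0.7),
]

# Rank from most to least concerning under this toy metric.
for s in sorted(systems, key=SystemProfile.risk, reverse=True):
    print(f"{s.name:25s} risk={s.risk():.2f}")
```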
I know I am super late to the party but this seems like something along the lines of what you’re looking for: https://www.alignmentforum.org/posts/qYzqDtoQaZ3eDDyxa/distinguishing-ai-takeover-scenarios
Yeah, that’s cool to see. Very similar attempt at categorization. I feel we often get caught up in the potential / theoretical capabilities of systems, but there are already plenty of systems that exhibit self-replicating, harmful, intelligent behavior. It’s entirely a question of degree. That’s why I think a visual ranking of all systems on these metrics is in order.
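For the visual part, even a per-criterion bar chart would be a start; a minimal matplotlib sketch, reusing the hypothetical scores from above (nothing here is real data):

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical scores (rows: systems, columns: criteria).
names = ["computer worm", "recommendation engine", "autonomous trading agent"]
criteria = ["self-replication", "elusiveness", "harm capacity"]
scores = np.array([
    [0.9, 0.6, 0.4],
    [0.1, 0.3, 0.5],
    [0.2, 0.4, 0.7],
])

# One grouped horizontal bar per system, one bar per criterion.
y = np.arange(len(names))
height = 0.25
for i, crit in enumerate(criteria):
    plt.barh(y + i * height, scores[:, i], height=height, label=crit)
plt.yticks(y + height, names)
plt.xlabel("score (0 = property absent, 1 = maximal)")
plt.legend()
plt.tight_layout()
plt.show()
```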
Defining what counts as a ‘system’ would be the other big challenge. Is a hostile government a system? It’s fairly intelligent and self-replicating, etc.