I think we need a clear definition of bad AI before we can know what is *not* that. These benchmarks seem to itemize AI as if it will have known, concrete components, but I think we first need to compose, in the abstract, a runaway self-sustaining AI, and work backwards to see which pieces are already in place for it.
I haven’t kept up with this community for many years, so I have some catching up to do, but I am currently on the hunt for the clearest and most concise places where the various runaway scenarios are laid out. I know there is a wealth of literature (I have the Bostrom book from years ago as well), but I think simplicity is the key here. In other words, where is the AI red line?
Curious if you ever found what you were looking for.
I didn’t. I’m sure words towards articulating this have been spoken many times; the trick is figuring out what forum and form, more specifically, it needs to exist in for it to be comprehensible and lasting. Maybe I’m wrong that it needs to be highly public; as with nukes, not many people are actually familiar with what is considered sufficient fissile material, and governments (try to) maintain this barrier by themselves. But at this stage, while it still seems a fuzzy concept, any input seems valid.
Consider the following combination of properties:

- software (if that’s the right word?) capable of self-replication / sustainability / improvement
- capable of eluding human control
- capable of doing harm
In isolation none of these is sufficient, but taken together I think we could all agree we have a problem. So we could begin to categorize and rank various assemblages of AI by these criteria, and not by how “smart” they are.
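To make that concrete, here is a minimal sketch in Python of what such a ranking could look like. Everything in it is my own assumption, not an established metric: the 0–10 scale, the example systems and scores, and the choice of `min()` as the aggregator (picked because it expresses the point above, that no single property is sufficient on its own).

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    """Hypothetical risk profile: each axis is scored 0 (absent) to 10 (fully present)."""
    name: str
    self_replication: int  # can it copy, sustain, and improve itself?
    control_elusion: int   # can it evade human control?
    harm_capacity: int     # can it do harm if unchecked?

    def risk_score(self) -> int:
        # The minimum over the axes captures the conjunction: a system is
        # only as dangerous as its weakest of the three properties.
        return min(self.self_replication, self.control_elusion, self.harm_capacity)


# Toy entries with made-up scores, just to show the ranking mechanics.
systems = [
    SystemProfile("computer worm", self_replication=8, control_elusion=5, harm_capacity=4),
    SystemProfile("chess engine", self_replication=0, control_elusion=0, harm_capacity=1),
    SystemProfile("autonomous trading agent", self_replication=2, control_elusion=4, harm_capacity=6),
]

# Rank by the conjunction of the criteria, not by how "smart" each system is.
for s in sorted(systems, key=SystemProfile.risk_score, reverse=True):
    print(f"{s.name}: risk {s.risk_score()}")
```

Whether `min()`, a product, or a weighted sum is the right way to combine the axes is exactly the kind of question a shared ranking would force into the open.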
I know I am super late to the party, but this seems like something along the lines of what you’re looking for: https://www.alignmentforum.org/posts/qYzqDtoQaZ3eDDyxa/distinguishing-ai-takeover-scenarios
Yeah, that’s cool to see; a very similar attempt at categorization. I feel we often get caught up in the potential / theoretical capabilities of systems, but there are already plenty of systems that exhibit self-replicating, harmful, intelligent behaviors. It’s entirely a question of degree. That’s why I think a visual ranking of all systems’ metrics is in order.
Defining what comprises a ‘system’ would be the other big challenge. Is a hostile government a system? That’s fairly intelligent and self-replicating, etc.