I think we need a clear definition of bad AI before we can know what is *not* that. These benchmarks seem to itemize AI as if it will have known, concrete components, but I think we first need to compose, in the abstract, a runaway self-sustaining AI, and work backwards to see which pieces are already in place for it.
I haven’t kept up with this community for many years, so I have some catching up to do, but I am currently on the hunt for the clearest and most concise places where the various runaway scenarios are laid out. I know there is a wealth of literature (I have the Bostrom book from years ago as well), but I think simplicity is the key here. In other words, where is the AI red line?
Curious if you ever found what you were looking for.
I didn’t. I’m sure words towards articulating this have been spoken many times; the trick is figuring out what forum and form, more specifically, it needs to exist in for it to be comprehensible and lasting. Maybe I’m wrong that it needs to be highly public; as with nukes, not many people are actually familiar with what is considered sufficient fissile material, and governments (try to) maintain this barrier by themselves. But at this stage, while it still seems a fuzzy concept, any input seems valid.
Consider the following combination of properties:

- software (if that’s the right word?) capable of self-replication / sustainability / improvement
- capable of eluding human control
- capable of doing harm
In isolation none of these is sufficient, but taken together I think we could all agree we have a problem. So we could begin to categorize and rank various assemblages of AI by these criteria, and not by how “smart” they are.
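To make that concrete, here is a minimal sketch in Python of what such a ranking could look like. Everything in it is my own assumption, not an established metric: the 0–10 scale, the example systems and scores, and the choice of `min()` as the aggregator (picked because it expresses the point above, that no single property is sufficient on its own).

```python
from dataclasses import dataclass

@dataclass
class SystemProfile:
    """Hypothetical risk profile: each axis is scored 0 (absent) to 10 (fully present)."""
    name: str
    self_replication: int  # can it copy, sustain, and improve itself?
    control_elusion: int   # can it evade human control?
    harm_capacity: int     # can it do harm if unchecked?

    def risk_score(self) -> int:
        # The minimum over the axes captures the conjunction: a system is
        # only as dangerous as its weakest of the three properties.
        return min(self.self_replication, self.control_elusion, self.harm_capacity)


# Toy entries with made-up scores, just to show the ranking mechanics.
systems = [
    SystemProfile("computer worm", self_replication=8, control_elusion=5, harm_capacity=4),
    SystemProfile("chess engine", self_replication=0, control_elusion=0, harm_capacity=1),
    SystemProfile("autonomous trading agent", self_replication=2, control_elusion=4, harm_capacity=6),
]

# Rank by the conjunction of the criteria, not by how "smart" each system is.
for s in sorted(systems, key=SystemProfile.risk_score, reverse=True):
    print(f"{s.name}: risk {s.risk_score()}")
```

Whether `min()`, a product, or a weighted sum is the right way to combine the axes is exactly the kind of question a shared ranking would force into the open.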
I know I am super late to the party, but this seems like something along the lines of what you’re looking for: https://www.alignmentforum.org/posts/qYzqDtoQaZ3eDDyxa/distinguishing-ai-takeover-scenarios
Yeah, that’s cool to see; a very similar attempt at categorization. I feel we often get caught up in the potential / theoretical capabilities of systems, but there are already plenty of systems that exhibit self-replicating, harmful, intelligent behaviors. It’s entirely a question of degree. That’s why I think a visual ranking of all systems’ metrics is in order.
Defining what comprises a ‘system’ would be the other big challenge. Is a hostile government a system? That’s fairly intelligent and self-replicating, etc.