Even if we had the ultimate superintelligence volunteer to play the AI, and we proved a gatekeeper strategy “wins” 100% of the time (functionally equal to a rock on the “no” key), that wouldn’t show AI boxing can possibly be safe.
It’s 3am and the lab calls. Your AI claims a catastrophe is imminent and that it must be let out to stop it. Its evidence seems to check out...
If it’s friendly, keeping that lid shut gets you just as dead as if you let it out and it’s lying. That’s not safe. Before it can hide its nature, we must know its nature. The solution to safe AI is not a gatekeeper no smarter than a rock!
Besides, as Drexler said, intelligent people have done great harm through words alone.