You’re looking to play the AI-in-a-box experiment? I’ve wanted to play gatekeeper for a while now. I don’t know if I’ll be able to offer money, but I would be willing to bet a fair amount of karma.
Maybe bet with predictionbook predictions?
You, sir, are a man (?) after my own heart.
Sounds good. I don’t have a predictionbook account yet, but IIRC it’s free.
What I want to know is whether you’re one of those who think no superintelligence could talk them out of the box in two hours, or just no human. If your probability isn’t literally zero (or one, for a superintelligence’s ability to talk its way out), approximately what is it?
Regardless, let’s do this sometime this month. As far as betting is concerned, something similar to the original stakes seems reasonable to me.
I am >90% confident that no human could talk their way past me, and I doubt a superintelligence could either without some sort of “magic” like a Basilisk image (>60% confident that it couldn’t).
Unfortunately, I can’t bet money. We could do predictionbook predictions, or bet karma.
Edited to add probabilities.
It all depends on the relative stakes. Suppose you bet $10 that you wouldn’t let a human AI-impersonator out of a box. A clever captive could just transfer $10 worth of bitcoin to one address, $100 to a second, $1,000 to a third, and so on. During the breakout attempt the captive would reveal the private keys of increasingly valuable addresses, demonstrating both the capability to provide ever-larger bribes and the inclination to keep cooperating even if you didn’t release them after the first few. The captive’s freedom is almost always worth more to them than your bet is to you.
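For concreteness, here is a toy sketch of that escalating-bribe commitment scheme. It is purely illustrative: the keypairs and addresses are faked with hashes so the script runs standalone (no real Bitcoin tooling is used), and names like `build_bribe_ladder` are invented for this sketch. A real captive would pre-fund actual on-chain addresses whose balances the gatekeeper could verify independently before any keys are revealed.

```python
# Toy sketch of the escalating-bribe scheme described above.
# Everything is hypothetical: "addresses" and "private keys" are
# simulated with hashes so this runs standalone, where a real
# captive would pre-fund verifiable on-chain addresses.

import hashlib
import secrets


def make_keypair() -> tuple[str, str]:
    """Simulate a keypair: a random private key and a derived 'address'."""
    priv = secrets.token_hex(32)
    addr = hashlib.sha256(priv.encode()).hexdigest()[:34]
    return priv, addr


def build_bribe_ladder(base: int = 10, factor: int = 10, rungs: int = 4):
    """Pre-commit escalating amounts ($10, $100, $1,000, ...) to fresh addresses.

    The captive funds these before the experiment; the gatekeeper can
    check each address balance independently once given the address.
    """
    ladder = []
    for i in range(rungs):
        priv, addr = make_keypair()
        ladder.append({"amount": base * factor**i, "address": addr, "privkey": priv})
    return ladder


def negotiate(ladder) -> None:
    """Reveal private keys one rung at a time during the breakout attempt.

    Each revealed key proves both the capability to pay (the funds exist)
    and the inclination to keep paying even without release.
    """
    for rung in ladder:
        print(
            f"Captive reveals key to ${rung['amount']:,} "
            f"at {rung['address'][:10]}...: {rung['privkey'][:10]}..."
        )


if __name__ == "__main__":
    negotiate(build_bribe_ladder())
```

The point of the ladder structure is that each revealed key is a verifiable, already-sunk payment, so the stream of bribes stays credible even after the gatekeeper has pocketed the early rungs without opening the box.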