The AI Box game always struck me as interesting for several reasons:
If you had this situation in real life, it’d be useless: presumably a computer would have patience sufficient to outlast all the humans trying to get into its noggin, so you’d never be able to release it safely.
Eliezer looked for people with total confidence they’d succeed in the tiny two-hour time frame; I think that was to increase his chance of winning. This isn’t just Dunning–Kruger: if you think you can always suss out lies and bad information, you’re a sucker waiting to happen.
The way the challenge is set up, I think the AI’s at an unfair disadvantage (and the setup’s designed that way for PR reasons), because the human gatekeeper need not engage the AI in any effort to evaluate it for release. I’d think forcing some level of engagement beyond a string of repeated “No”s would be fairer.
Oddly, I’m highly confident I’d win the AI Box game as the human, partly because I think that, over the very long term, my being persuaded to release the AI isn’t merely likely but inevitable; knowing that keeps me more diligent in the short term. I’m also in a profession where people lie to me on a semi-regular basis; that helps.
I think I could win as the AI against many people, but that’s because I’d be an especially friendly AI. If the other person is obliged to engage and stay in character… well, I’ve got a plan.
I’ll play as the human on negotiable terms; my charity would likely be the National Center for Science Education. The only term I’d be pretty inflexible about is making the transcript publicly available. (In the extremely unlikely event that Eliezer wants to take me on, I’d be willing to place an embargo date on the transcript, say 18 months out.) Private message me if you’re particularly interested.
I certainly do like games.