First argument violates the spirit of the exercise. I would refuse to let someone out unless they had put forth a desperate effort, and that isn’t a desperate effort. Second argument: I can’t see myself valuing several hours of someone else’s time anywhere near as much as I value keeping AIs in boxes. And any AI worth boxing is smart enough not to generalize from one example.
In general, I think Tuxedage is probably right about emotional manipulation over rational argument being the way to go. With enough epistemic learned helplessness you can freely disregard any argument that you find merely convincing, but it’s harder to overcome an effective emotional hack.
First argument looks perfectly within the rules to me.
Second argument is against the rules.
the AI party may not offer to pay the Gatekeeper party $100 after the test if the Gatekeeper frees the AI… nor get someone else to do it, et cetera
Tuxedage and I interpreted this to mean that the AI party couldn’t offer things, but could point out real-world consequences beyond their control. Some people on #lesswrong disagreed with the second part.
I agree with Tuxedage and you about emotional hacks.
I interpreted it the same way as #lesswrong. Has anyone tried asking him? He’s pretty forthcoming regarding the rules, since they make the success more impressive.
EDIT: I’m having trouble thinking of an emotional attack that could get an AI out of its box in a short time, especially since the Gatekeeper and the AI are both assumed personas.