We kept the secrecy rule because it was the default but I stand by it now as well. There are a lot of things I said in that convo that I wouldn’t want posted on lesswrong, enough that I think the convo would have been different without the expectation of privacy. Observing behavior often changes it.
If a game is began with the notion that it’ll be posted online, one of two things, or both will happen. Either (a) the AI is constrained by the techniques they can implore, unwilling to embarrass themselves or the gatekeeper to a public audience (especially when it comes down to personal details.), or (b) the Gatekeeper now has a HUGE incentive not to let the AI out; to avoid being known as the sucker who let the AI out...
Even if you could solve this by changing details and anonymising, it seems to me that the techniques are so personal and specific that changing them in any way would make the entire dialogue make even less sense.
The only other solution is to have a third-party monitor the game and post it without consent (which is obviously unethical, but probably the only real way you could get a truly authentic transcript.)
We kept the secrecy rule because it was the default but I stand by it now as well. There are a lot of things I said in that convo that I wouldn’t want posted on lesswrong, enough that I think the convo would have been different without the expectation of privacy. Observing behavior often changes it.
That last bit is particularly important methinks.
If a game is began with the notion that it’ll be posted online, one of two things, or both will happen. Either (a) the AI is constrained by the techniques they can implore, unwilling to embarrass themselves or the gatekeeper to a public audience (especially when it comes down to personal details.), or (b) the Gatekeeper now has a HUGE incentive not to let the AI out; to avoid being known as the sucker who let the AI out...
Even if you could solve this by changing details and anonymising, it seems to me that the techniques are so personal and specific that changing them in any way would make the entire dialogue make even less sense.
The only other solution is to have a third-party monitor the game and post it without consent (which is obviously unethical, but probably the only real way you could get a truly authentic transcript.)