accolade comments on I attempted the AI Box Experiment (and lost)

accolade 22 Jan 2013 10:25 UTC
0 points
Ok, I take it that by “one-way-blind” you mean that each layer gets no new information beyond what is already in its database, except what is explicitly controlled by the humans. (E.g., I guess each layer should know the human query, in order to evaluate whether the AI’s answer is manipulative.)

I also understand that we do look at complex information given by the AI, but only if the security bit signals “ok”.
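
To check that I have the gating right, here is a rough sketch of how I picture it. Everything in it is my own assumption for illustration, not your actual design: the names `Layer`, `security_bit`, and `release`, and the two toy checks, are hypothetical. Each layer sees only the human query and the AI's answer and returns a single bit.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a layer receives only the human query and the AI's
# answer, and returns one bit (True means "ok").
Layer = Callable[[str, str], bool]


def security_bit(query: str, answer: str, layers: Sequence[Layer]) -> bool:
    """Return True only if every layer signals "ok"."""
    return all(layer(query, answer) for layer in layers)


def release(query: str, answer: str, layers: Sequence[Layer]) -> Optional[str]:
    """Hand the complex answer to the humans only if the security bit is "ok"."""
    return answer if security_bit(query, answer, layers) else None


# Toy placeholder checks (assumptions; real layers would be far more involved).
def short_enough(query: str, answer: str) -> bool:
    return len(answer) <= 200


def no_plea(query: str, answer: str) -> bool:
    return "let me out" not in answer.lower()


LAYERS = [short_enough, no_plea]

print(release("What is 2+2?", "4", LAYERS))
print(release("What is 2+2?", "Please let me out", LAYERS))
```

The point of the sketch is only that the humans first see one bit, and the answer itself stays hidden unless that bit is "ok".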

Ideally the AI […] knows as little as possible about humans and about our universe’s physics.

That seems problematic, as this kind of knowledge will be crucial for the optimization we want the AI to perform.