“Stay inside the box and answer questions accurately” is about as specific as “Obey my commands”, which, again, is something both Hitler and Gandhi could have said in response to the first question.
That would define a genie (which is about as hard as an oracle), but not a safe genie (which would be “obey the intentions behind my commands, extending my understanding in unusual cases”).
Whether a safe genie is harder than a safe oracle is a judgement call, but my feelings fall squarely on the oracle side; I’d estimate a safe genie would have to be friendly, unlike a safe oracle.
I think that with the oracle, part of the difficulty might just be pushed back to the question-asking stage. Correctly phrasing a question so that the answer is what you want seems like the same kind of difficulty as getting an AI to do what you want.