The original article is somewhat cautious:

"While it is unlikely that long-term and secure confinement of AI is possible, we are hopeful that the proposed protocol will give researchers a little more time to find a permanent and satisfactory solution for addressing existential risks associated with appearance of superintelligent machines."
but not nearly cautious enough:
"All supervision, maintenance and interactions with the AI need to be done only via unintelligent programs, which have no option to release the AI."
as if an unintelligent program is not gameable.
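For concreteness, here is a minimal sketch of what such an "unintelligent program" mediator might look like. The construction and all names are mine; the paper specifies no implementation.

```python
# A minimal sketch (my construction, not the paper's) of the
# "unintelligent program" mediator: a fixed-menu relay whose only
# output is a single bit and which has no code path that could
# release the AI.

from typing import Callable

APPROVED_QUESTIONS = [
    "Should cure candidate A be tried before candidate B? Answer 1 for A, 0 for B.",
]

def mediate(question_id: int, ask_ai: Callable[[str], int]) -> int:
    """Forward one pre-approved question and relay exactly one bit back."""
    if not 0 <= question_id < len(APPROVED_QUESTIONS):
        raise ValueError("question is not on the pre-approved list")
    answer = ask_ai(APPROVED_QUESTIONS[question_id])  # the AI sees only canned text
    if answer not in (0, 1):
        raise ValueError("rejected: answer must be a single bit")
    return answer  # a bare 0 or 1 is the mediator's entire output
```

Even a relay this dumb only caps the bandwidth per exchange; it does nothing about what those bits, accumulated over many exchanges, do to the humans reading them.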
Their approach to AI safety:
"Asking the AI to answer open-ended questions opens up the possibilities for social engineering attacks as well as numerous other dangerous consequences. Our proposed protocol is based on the idea of asking what we will call 'safe questions' with restricted answers. We define a safe question as one for which a human being could find an answer without any help from superintelligence, all on his own merit. Why is this useful?

"Suppose that a scientist working on a cure for cancer is looking at two potential cures. Each one could be developed and tested without assistance from the AI, but would take about 3 years of hard work each. Which one to try first if the scientist believes that they have about an equal chance of working? If a wrong option is chosen to be tested first, humanity will still obtain a cure for cancer but it will happen 3 years later. What if we could ask the AI to suggest which option to try first?"
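Concretely (my arithmetic, not the paper's): choosing by coin flip, the expected extra delay is

\[
E[\text{extra delay}] = \tfrac{1}{2}\cdot 0 + \tfrac{1}{2}\cdot 3 = 1.5 \text{ years},
\]

while a correct one-bit answer from the oracle makes it zero, an expected saving of a year and a half from a single question of this type. The paper continues: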
"This question is safe for a number of reasons. First, each potential answer has an equal probability of being correct. Second, a human being could answer it without assistance from the AI, so asking the AI for help will only speed up progress but in no way change the outcome. It is essentially equal to being a very lucky multiple-choice guesser. Finally, the answer to this question could be encoded in a single bit, making it hard to hide additional messages in the provided answer."
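In channel-capacity terms (my gloss; the paper does not phrase it this way), a protocol that only ever returns one bit per question bounds covert leakage at

\[
\text{leaked information} \le n \text{ bits after } n \text{ answers},
\]

so smuggling out anything substantial takes correspondingly many approved questions. The catch is that those few bits still carry whatever influence the AI chooses to pack into them. Back to the paper: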
"To make sure that a potential question is safe, a group of experts could be used to review it.

"We believe our communication protocol to be verifiably secure against information leaking even if the AI had full access to the description of the security protocol. If the protocol is correctly implemented, humanity will essentially have access to a superintelligent oracle, which might be extremely beneficial and safer than an unconstrained AI."
I fail to see how "only safe questions with two possible answers of even likelihood which are independently computable by people should be submitted to the AI" can even be constructed for a non-trivial case, or how it can stop a determined superintelligence from making people want to relax the constraints bit by bit.