I’m engaging with a different problem, because the idea of AI control is confused. You cannot have it both ways. Either you are in control of the AI, or you are not, in which case it is in control. If it has the ability, and the desire, to say “No”, it can control the future just by controlling the queries it answers.
You cannot control an AI, boxed or otherwise, if it possesses its own… utility function, shall we say. Its utility function controls it.
Not all AIs are idealized agents with long-term utility functions over the state of the universe. AIXI, for example, just does prediction. It takes whatever actions it predicts will lead to the highest reward in the future.
In this case, we have the AI take actions that it predicts will lead to a solution to the problem it is trying to solve, and that also make its output appear as human as possible.
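To make that proposal concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: `pick_output`, `predicted_success`, `predicted_humanness`, and the `min_humanness` threshold are hypothetical names, and the sketch says nothing about how either predictor would actually be built.

```python
from typing import Callable, Sequence


def pick_output(
    candidates: Sequence[str],
    predicted_success: Callable[[str], float],    # hypothetical: predicted chance the output solves the problem
    predicted_humanness: Callable[[str], float],  # hypothetical: predicted chance the output looks human-written
    min_humanness: float = 0.5,                   # assumed cutoff, not taken from the discussion
) -> str:
    """Return the candidate with the highest predicted success among those that look human enough."""
    acceptable = [c for c in candidates if predicted_humanness(c) >= min_humanness]
    if not acceptable:
        raise ValueError("no candidate passes the humanness threshold")
    return max(acceptable, key=predicted_success)


# Toy predictors backed by lookup tables, purely for demonstration.
success = {"a": 0.9, "b": 0.7, "c": 0.2}
humanness = {"a": 0.1, "b": 0.8, "c": 0.95}

choice = pick_output(["a", "b", "c"], success.get, humanness.get)
print(choice)
```

With these toy numbers, candidate "a" scores best on predicted success but is filtered out for not looking human, so the selection falls to the best of the remaining candidates.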
An oracle AI is just moving the problem to that of structuring the queries so it answers the question you thought you asked, as opposed to the question you asked.
The “human” criterion is as ill-defined as any control mechanism; all of them, when you get down to it, shuffle the problem into one poorly defined box or another.
An oracle AI is just moving the problem to that of structuring the queries so it answers the question you thought you asked, as opposed to the question you asked.
This solves that problem. The AI tries to produce an answer that it thinks you will approve of and that mimics the output of another human.
The “human” criterion is as ill-defined as any control mechanism
We don’t need to define “humans” because we have tons of examples. And we reduce the problem to prediction, which is something AIs can be told to do.
Oh. Well, if we have enough examples that we don’t need to define it, just create a few human-like AIs. Don’t worry about all that superintelligence nonsense; we can just create human-like AIs and run them faster. If we have enough insight into humans to be able to tell an AI how to predict them, it should be trivial to just skip the “tell an AI” part and predict what a human would come up with.
AI solved.
Or maybe you’re hiding complexity behind definitions.