The AI Box:
A common idea is to keep the AI in a “box” where it can only interact with the world by talking to a human. This doesn’t work for a few reasons:
The AI would be able to convince the human to let it out.
The human wouldn’t understand the consequences of their own actions as well as the AI would.
Removing capabilities from the AI is not a good plan, because the point is to create a useful AI. Importantly, the AI should be able to stop any dangerous AI from being created.