In one of the discussions surrounding the AI-box experiments, you said that you would be unwilling to use a hypothetical fully general argument/”mind hack” to cause people to support SIAI. You’ve also repeatedly said that the friendly AI problem is a “save the world” level issue. Can you explain the first statement in more depth? It seems to me that if anything really falls into “win by any means necessary” mode, saving the world is it.
But why not become an expert liar, if that’s what maximizes expected utility? Why take the constrained path of truth, when things so much more important are at stake?
Because, when I look over my history, I find that my ethics have, above all, protected me from myself. They weren’t inconveniences. They were safety rails on cliffs I didn’t see.
I made fundamental mistakes, and my ethics didn’t halt that, but they played a critical role in my recovery. When I was stopped by unknown unknowns that I just wasn’t expecting, it was my ethical constraints, and not any conscious planning, that had put me in a recoverable position.
You can’t duplicate this protective effect by trying to be clever and calculate the course of “highest utility”. The expected utility just takes into account the things you know to expect. It really is amazing, looking over my history, the extent to which my ethics put me in a recoverable position from my unanticipated, fundamental mistakes, the things completely outside my plans and beliefs.
Ethics aren’t just there to make your life difficult; they can protect you from Black Swans. A startling assertion, I know, but not one entirely irrelevant to current affairs.
In one of the discussions surrounding the AI-box experiments, you said that you would be unwilling to use a hypothetical fully general argument/”mind hack” to cause people to support SIAI. You’ve also repeatedly said that the friendly AI problem is a “save the world” level issue. Can you explain the first statement in more depth? It seems to me that if anything really falls into “win by any means necessary” mode, saving the world is it.
This comes to mind:
— Protected From Myself
Update: this question has been answered.