How have I not addressed the argument on its own terms? I agree with basically everything you said, except calling it a solution. You’ll run into non-trivial problems when you try to turn it into an algorithm.
For example, the case of there being an actual physical mugger is meant to be an example of the more general problem of programs with tiny priors predicting super-huge rewards. A strategy based on “probability of the mugger lying” has to be translated to the general case somehow. You have to prevent the AI from mugging itself.
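To make the general problem concrete, here is a toy sketch (my own illustrative numbers, not from the discussion) of why a naive expected-utility maximizer gets mugged: a hypothesis with a tiny prior but a super-huge predicted reward can still dominate the expected-value calculation.

```python
# Toy illustration: an expected-utility comparison between a mundane
# hypothesis and a tiny-prior hypothesis predicting a huge reward.
# All numbers are made up for illustration.
hypotheses = [
    ("mundane outcome",  0.999, 100),      # (name, prior, predicted reward)
    ("mugger's promise", 1e-9,  10**15),   # tiny prior, astronomical reward
]

def expected_value(prior, reward):
    return prior * reward

for name, prior, reward in hypotheses:
    print(f"{name}: {expected_value(prior, reward):.4g}")
```

The tiny-prior hypothesis wins (1e-9 × 10^15 = 10^6 versus ~100), which is the structure an algorithmic solution has to block in general, not just for the literal mugger.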