I’ll mention my own objections to IBP, and where I think the fatal issue lies.
The most fatal objection is, as you said, the monotonicity principle. I suspect this issue arises because IBP tries to unify capabilities and values/morals, when I think they are strictly separate kinds of things; in general, the unification heuristic is being taken too far here.
To be honest, if Vanessa focused on how capable the IBP agent is, without trying to shoehorn an alignment solution into it, I think the IBP model might actually work.
I disagree on whether maximization of values is advisable, but I agree that the monotonicity principle points to a fatal issue in IBP.
Another issue is that it’s trying to solve an impossible problem: preventing simulation hypotheses from taking hold when the AI already has a well-calibrated belief that we are being simulated by a superintelligence. But even under the most optimistic assumptions, if the AI is actually acausally cooperating with its simulator, we are no better equipped to fight it than we are to fight an alien invasion. In the worst case, it would be equivalent to fighting an omnipotent and omniscient god, which is pretty obviously unsolvable.