Not sure how to express this, but I’ll try:
One consideration is something like… would Bob be willing to be eaten (and raised in torturous conditions, if he's eating inhumanely raised cattle), if the benefit outweighed the cost? If yes, then maybe he doesn't need to be punished: his decision theory is to do the globally best thing, and that shouldn't be punished. Of course, in practice we can't determine this with certainty in every case, which leaves the door open for fraud ("oh yeah no yeah, I totally would want to be eaten in the unreal counterfactual world where that was most utilitous, definitely").
Generally, the tension here is that to prevent fraud we have to punish actions, not algorithms, whereas the decision-theoretic ideal is to punish algorithms, not actions.