Precommitment is an interesting aspect of game theory that ties in well with lukeprog’s “How to Beat Procrastination”.
Please cover the difference between precommitment and saying out loud (or even believing) “I precommit not to succumb to blackmail/let the AI out of the box”. This is one of the most common mistakes I’ve seen, even on LW.
I don’t think there is a difference in kind. It’s just that some commitments are stronger than others.
Yep. The most common model that yields a rational agent who will choose to restrict zir own future actions is beta-delta discounting, also known as time-inconsistent preferences. I’ve had problem sets with such questions, usually involving a student procrastinating on an assignment; I don’t think I can copy them, but let me know if you want me to sketch out how such a problem might look.
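For illustration, here is one way such a problem might be set up. This is a rough sketch, not any actual problem set. It assumes a student who must do an assignment in exactly one of four periods, an immediate cost that grows the longer the work is put off (the numbers 3, 5, 8, 13 are illustrative), and a present bias of beta = 0.5 with delta = 1. All function names are made up for this example.

```python
def naive_period(costs, beta, delta=1.0):
    """Period (0-indexed) in which a naive beta-delta agent does the task.

    The naive agent assumes its future selves will follow whatever plan
    looks best today, so it waits whenever some later period looks cheaper
    after discounting.
    """
    last = len(costs) - 1
    for t in range(last):
        best_later = min(
            beta * delta ** (s - t) * costs[s] for s in range(t + 1, last + 1)
        )
        if costs[t] <= best_later:
            return t
    return last  # deadline: the task can no longer be postponed


def sophisticated_period(costs, beta, delta=1.0):
    """Period in which a sophisticated agent acts, by backward induction.

    The sophisticate correctly predicts when its future selves would really
    do the task, and compares acting now against that prediction.
    """
    last = len(costs) - 1
    acts_at = last  # at the deadline the task must be done
    for t in range(last - 1, -1, -1):
        wait_cost = beta * delta ** (acts_at - t) * costs[acts_at]
        if costs[t] <= wait_cost:
            acts_at = t
    return acts_at


def planner_choice(costs, beta, delta=1.0):
    """Period a self standing one step before period 0 would commit to.

    Every period lies in this self's future, so each cost is scaled by beta.
    """
    discounted = [beta * delta ** (s + 1) * c for s, c in enumerate(costs)]
    return discounted.index(min(discounted))


costs = [3, 5, 8, 13]  # illustrative costs of doing the assignment in each period
print("time-consistent:", naive_period(costs, beta=1.0))
print("naive:", naive_period(costs, beta=0.5))
print("sophisticated:", sophisticated_period(costs, beta=0.5))
print("would commit to:", planner_choice(costs, beta=0.5))
```

With these numbers the time-consistent agent does the work right away, the sophisticate does it in the second period, and the naive agent puts it off until the deadline. The earlier self would pick the first period, which is why an agent who foresees zir own delay might rationally accept a device that removes the later options.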
Actually, maybe the most instrumental-rationality-enhancing topics to cover that have legitimate game-theoretic aspects are in behavioral economics. Perhaps you could construct examples where you contrast the behavior of an agent who interprets probabilities in a funny way, as in Prospect Theory, with an agent who obeys the vNM axioms.
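As a minimal sketch of such a contrast, the code below gives both agents the same concave value function and lets them differ only in how they treat probabilities. The functional forms and the parameters (alpha = 0.88, gamma = 0.61) are assumptions here, taken from the estimates usually quoted from Tversky and Kahneman's 1992 paper. The example is restricted to prospects with a single nonzero gain, and the function names are hypothetical.

```python
def value(x, alpha=0.88):
    """Concave value of a gain x >= 0 (assumed power form)."""
    return x ** alpha


def weight(p, gamma=0.61):
    """Inverse-S probability weighting: overweights small p, underweights large p."""
    return p ** gamma / (p ** gamma + (1 - p) ** gamma) ** (1 / gamma)


def eu_score(p, x):
    """Expected utility of 'win x with probability p, else nothing'.

    Probabilities enter linearly, as the vNM axioms require.
    """
    return p * value(x)


def pt_score(p, x):
    """Prospect Theory score of the same simple prospect.

    Identical to eu_score except that p is passed through weight().
    """
    return weight(p) * value(x)


lottery = (0.001, 5000)  # long shot with expected value 5
sure_thing = (1.0, 5)    # the same expected value, received for certain

for name, score in (("vNM agent", eu_score), ("PT agent", pt_score)):
    choice = "lottery" if score(*lottery) > score(*sure_thing) else "sure thing"
    print(f"{name} prefers the {choice}")
```

Under these assumptions the vNM agent takes the sure thing and the Prospect Theory agent buys the lottery ticket. Since the value function is shared, the disagreement comes entirely from the weighting of the 0.001 probability.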