I think this shows that the step-wise inaction penalty is time-inconsistent: https://www.lesswrong.com/posts/w8QBmgQwb83vDMXoz/dynamic-inconsistency-of-the-stepwise-inaction-baseline