Rafael Harth comments on Attainable Utility Preservation: Empirical Results

Rafael Harth 28 Jul 2020 14:03 UTC
3 points
Turns out you don’t need the normalization, per the linked SafeLife paper. I’d probably just take it out of the equations, looking back. Complication often isn’t worth it.
It’s also slightly confusing in this case because the post doesn’t explain it, which made me wonder, “am I supposed to understand what it’s for?” But it is explained in the conservative agency paper.
I think the n-step stepwise inaction baseline doesn’t fail at any of them?
Yeah, but the first one was “[comparing AU for aux. goal if I do this action to] AU for aux. goal if I do nothing”
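For concreteness, here is a minimal sketch of the comparison quoted above: the attainable utility (AU) for each auxiliary goal if the agent takes an action, versus the AU if it does nothing. Everything named here (`aup_penalty`, the `QTable` layout, the `"noop"` label) is a hypothetical illustration, not code from the post or the papers, and it assumes the auxiliary Q-values are already available as lookup tables. The normalization term is left out, in line with the remark that it turns out not to be needed.

```python
from typing import Dict, Hashable, Sequence, Tuple

# Hypothetical layout: one table per auxiliary goal, mapping
# (state, action) -> attainable utility (a Q-value) for that goal.
QTable = Dict[Tuple[Hashable, Hashable], float]


def aup_penalty(
    q_aux: Sequence[QTable],
    state: Hashable,
    action: Hashable,
    noop: Hashable = "noop",
) -> float:
    """Sum over auxiliary goals of |AU if I act - AU if I do nothing|.

    Unnormalized on purpose: no division by a scale term.
    """
    return sum(abs(q[(state, action)] - q[(state, noop)]) for q in q_aux)


# Toy values, made up purely for illustration.
q_tables = [
    {("s", "a"): 3.0, ("s", "noop"): 1.0},
    {("s", "a"): 0.5, ("s", "noop"): 1.0},
]
penalty_for_acting = aup_penalty(q_tables, "s", "a")
penalty_for_noop = aup_penalty(q_tables, "s", "noop")
```

Doing nothing always gets zero penalty under this definition, since it is being compared against itself.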