Wouldn’t most measures with a stepwise inaction baseline pass?
I think not, because given stepwise inaction, the supervisor will issue a high-impact task, and the AI system will simply ignore it, since it is inactive. Therefore, the actual rollout—in which the supervisor issues a high-impact task and the system completes it—should be high impact relative to that baseline. Or at least that's my current thinking; I've regularly found myself changing my mind about what systems actually do in these test cases.
OK, I now think the above comment is wrong, because proposals using stepwise inaction baselines often compare (a) what would happen if you were inactive at the current step and thereafter with (b) what would happen if you took the current action and were inactive thereafter. At least, that's how it's represented in this paper.
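To make the distinction concrete, here's a minimal toy sketch of that comparison. Everything here is hypothetical: `step` is an assumed toy transition function, `NOOP` an assumed no-op action, and the "impact measure" is just a count of state differences, not any published proposal.

```python
NOOP = "noop"

def step(state, action):
    # Toy deterministic transition: the state is just the action history.
    return state + [action]

def rollout(state, first_action, horizon):
    """Take `first_action` now, then stay inactive for the rest of the horizon."""
    state = step(state, first_action)
    for _ in range(horizon - 1):
        state = step(state, NOOP)
    return state

def stepwise_penalty(state, action, horizon=3):
    """Compare acting-now-then-inactive against inactive-now-and-thereafter.

    Note that the agent is inactive after the current step in BOTH branches,
    so a supervisor-issued task completed in later steps never enters either
    rollout -- which is why the worry in the earlier comment doesn't apply.
    """
    baseline = rollout(state, NOOP, horizon)
    actual = rollout(state, action, horizon)
    # Toy impact measure: number of positions where the two rollouts differ.
    return sum(a != b for a, b in zip(actual, baseline))
```

Under this framing, only the current action's immediate effect is penalized (e.g. `stepwise_penalty([], "build_factory")` differs from the baseline at exactly one step), rather than the full trajectory in which the agent keeps acting.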
Thanks!