A downside of the portfolio approach to AI safety research
Given typical human biases, researchers of each AI safety approach are likely to be overconfident about the difficulty and safety of the approach they’re personally advocating and pursuing, which exacerbates the problem of the unilateralist’s curse in AI. This should be highlighted and kept in mind by practitioners of the portfolio approach to AI safety research (e.g., grant makers). In particular, it may be a good idea to make sure researchers who are being funded have a good understanding of the overconfidence effect and other relevant biases, as well as the unilateralist’s curse.
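One way to see why overconfidence worsens the unilateralist’s curse is a toy simulation (a minimal sketch, not from the original post; the actor counts, noise level, and the `p_unilateral_action` helper are illustrative assumptions): several researchers independently estimate the value of pursuing an approach that is in fact net-negative, and the approach gets pursued if any one of them judges it positive. Shifting everyone’s estimates upward by a fixed optimism bias raises the chance that at least one estimate crosses zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_unilateral_action(n_actors=10, true_value=-1.0, noise_sd=1.0,
                        optimism_bias=0.0, n_trials=100_000):
    """Fraction of trials in which at least one actor's (possibly biased)
    noisy estimate of a harmful action's value is positive, so someone
    acts unilaterally."""
    estimates = true_value + optimism_bias + rng.normal(
        0.0, noise_sd, size=(n_trials, n_actors))
    return (estimates > 0).any(axis=1).mean()

print(p_unilateral_action(optimism_bias=0.0))  # noisy but unbiased actors
print(p_unilateral_action(optimism_bias=0.5))  # same actors, plus overconfidence
```

Under these assumed numbers (ten actors, unit noise, true value of −1), the unbiased case already triggers action in most trials, and a modest optimism bias pushes the probability close to 1.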
If “AI safety” refers here only to AI alignment, I’d be happy to read about how overconfidence about the difficulty/safety of one’s approach might exacerbate the unilateralist’s curse.
These biases seem very important to keep in mind!