I agree that you probably need ensembling in addition to these techniques.
At best this technique would produce a system which has a small probability of unacceptable behavior on any given input. You’d then need to combine several such systems to get one with negligible probability of unacceptable behavior.
I expect you often get this for free, since catastrophe either involves a bunch of different AI systems behaving unacceptably, or a single AI behaving consistently unacceptably across time.
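The quantitative intuition behind combining systems can be sketched as follows. This is my own illustrative sketch, not something from the comment: it assumes each ensemble member misbehaves independently with probability at most p, and that a catastrophe requires every member to fail (e.g. any one member can veto an action). Correlated failures would weaken the bound substantially.

```python
def ensemble_failure_bound(p: float, n: int) -> float:
    """Upper bound on the probability that all n members misbehave on one
    input, assuming each fails independently with probability at most p.
    (Illustrative assumption, not a claim from the original comment.)"""
    return p ** n

# Three systems, each misbehaving on 1% of inputs, gives roughly a
# one-in-a-million failure probability under the independence assumption.
print(ensemble_failure_bound(0.01, 3))
```

The same multiplication is why catastrophe "for free" arguments work across time: if unacceptable behavior must persist over many independent checks, the per-check failure probabilities compound in the same way.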