Note that ML systems are way more interpretable than humans, so if they are replacing humans then this shouldn’t make much of a difference.
I guess you mean here that activations and weights in NNs are more interpretable to us than neurological processes in the human brain, but if so, this comparison does not seem relevant to the text you quoted. Consider that it seems easier to understand why a newspaper editor placed some article on the front page than why FB’s algorithm showed some post to some user (especially if we get to ask the editor questions or consult with other editors).
Overall I’d guess that for WFLL1 it’s closer to “replacing humans” than “replacing institutions”.
Even if so (which I would expect to become uncompetitive with “replacing institutions” at some point), you may still get weird dynamics between AI systems within an institution and across institutions (e.g. between a CEO advisor AI and a regulator advisor AI). These dynamics may be very hard to interpret (and may not even involve recognizable communication channels).
I guess you mean here that activations and weights in NNs are more interpretable to us than neurological processes in the human brain, but if so, this comparison does not seem relevant to the text you quoted. Consider that it seems easier to understand why a newspaper editor placed some article on the front page than why FB’s algorithm showed some post to some user (especially if we get to ask the editor questions or consult with other editors).
Isn’t this what I said in the rest of that paragraph (although I didn’t have an example)?
which I would expect to become uncompetitive with “replacing institutions” at some point
I’m not claiming that replacing humans is more competitive than replacing institutions. I’m claiming that, if we’re considering the WFLL1 setting, and we’re considering the point at which we could have prevented failure, at that point I’d expect AI systems to be in the “replacing humans” category. By the time they’re in the “replacing institutions” category, we are probably far past the point where we could do anything about the future.
Separately, even in the long run, I expect modularity to be a key organizing principle for AI systems.
you may still get weird dynamics between AI systems within an institution and across institutions (e.g. between a CEO advisor AI and a regulator advisor AI). These dynamics may be very hard to interpret (and may not even involve recognizable communication channels).
I agree this is possible, but it doesn’t seem very likely to me, since we’ll very likely be training our AI systems to communicate in natural language, and those AI systems will likely be trained to behave in vaguely human-like ways.
Isn’t this what I said in the rest of that paragraph (although I didn’t have an example)?
I meant to say that even if we replace just a single person (like a newspaper editor) with an ML system, it may become much harder to understand why each decision was made.
I agree this is possible, but it doesn’t seem very likely to me, since we’ll very likely be training our AI systems to communicate in natural language, and those AI systems will likely be trained to behave in vaguely human-like ways.
The challenge here, it seems to me, is to train competitive models that behave in vaguely human-like ways for general real-world tasks (e.g. selecting content for a FB user’s feed or updating item prices at Walmart). In the business-as-usual scenario, we would need such systems to be competitive with systems that are optimized for business metrics (e.g. users’ time spent or profit).