This is great, but 4 and 5 seem to be aspects of the same problem to me (i.e., that humans aren’t safe agents) and I’m not sure how you’re proposing to draw the line between them. For example:
It’s also possible that we invent some technology which destroys us unexpectedly, either through unluckiness or carelessness.
If this were caused entirely by an AI pursuing an otherwise beneficial goal, it would certainly count as a failure of AI safety (and is currently studied under “safe exploration”), so it seems to make sense to call the analogous human problem “human safety”. Similarly, coordination between AIs is considered a safety problem and is studied under decision theory and game theory for AIs.
Can you explain a bit more the difference you see between 4 and 5?
To me the difference is that when I read 5 I’m thinking about people being careless or malevolent, in an everyday sense of those terms, whereas when I read 4 I’m thinking about how maybe there’s no such thing as a human who’s not careless or malevolent, if given enough power and presented with a weird enough situation.
I endorse ESRogs’ answer. If the world were a singleton under the control of a few particularly benevolent and wise humans, with an AGI that obeys the intention of practical commands (in a somewhat naive way, say, so it’d be unable to help them figure out ethics), then I think argument 5 would no longer apply, but argument 4 would. Or, more generally: argument 5 is about how humans might behave badly under current situations and governmental structures in the short term, but makes no claim that this will be a systemic problem in the long term (we could probably solve it using a singleton + mass surveillance); argument 4 is about how we don’t know of any governmental (or psychological?) structures which are very likely to work well in the long term.
Having said that, your ideas were the main (but not sole) inspiration for argument 4, so if this isn’t what you intended, then I may need to rethink its inclusion.
I think this division makes sense on a substantive level, and I guess I was confused by the naming and the ordering between 4 and 5. I would define “human safety problems” to include both short-term and long-term problems (just like “AI safety problems” includes both short-term and long-term problems), so I’d put both 4 and 5 under “human safety problems” instead of just 4. I guess in my posts I mostly focused on long-term problems, since short-term problems have already been widely recognized, but as far as naming goes, it seems strange to exclude short-term problems from “human safety problems”. Also, you wrote “They are listed roughly from most specific and actionable to most general”, and 4 feels like a more general problem than 5 to me, although perhaps that’s arguable.