Hmm, I was more pointing at the distinction where the first claim doesn’t need to argue for the subclaim “we will be able to get people to use mechanistic transparency” (it’s assumed away by “if I were in charge of the world”), while the second claim does have to argue for it.
I am mostly interested in allowing the developers of AI systems to determine whether their system has the cognitive ability to cause human extinction, and whether their system might try to cause human extinction.
The way I read this, if the research community enables the developers to determine these things at prohibitive cost, then we mostly haven’t “allowed” them to do it, but if the cost is manageable then we have. So I’d say my desiderata here (and also in my head) include the cost being manageable. If the cost of any such approach were necessarily prohibitive, I wouldn’t be very excited about it.