Thanks for posting this here! As you might expect, I disagree with you. I’d be interested to hear your positive account of why there isn’t x-risk from AI (excluding from misused instrumental intelligence). Your view seems to be that we may eventually build AGI, but that it’ll *be able to reason about goals, morality, etc.*, unlike the cognitively limited instrumental AIs you discuss, and therefore it won’t be a threat. Can you expand on the italicized bit? Is the idea that if it can reason about such things, it’s as likely as we humans are to come to the truth about them? (And that there is in fact a truth about them? Some philosophers would deny this about e.g. morality.) Or perhaps you would say it’s *more* likely than humans to come to the truth, since if it were merely as likely as humans, that would be pretty scary (humans come to the wrong conclusions all the time, and have done terrible things when granted absolute power).
We do not say that there is no XRisk or no XRisk from AI.
Yeah, sorry, I misspoke. You are critiquing one of the arguments for why there is XRisk from AI. One way to critique an argument is to dismiss it on “purely technical” grounds, e.g. “this argument equivocates between two different meanings of a term, therefore it is disqualified.” But usually when people critique arguments, even on technical grounds, they also have more “substantive” critiques in mind, e.g. “here is a possible world in which the premises are true and the conclusion false” (or in which both the conclusion and at least one premise are false). I was guessing that you had such a possible world in mind, and was trying to get a sense of what it looked like.
Thanks. We are actually more modest. We would like to see a sound argument for XRisk from AI and we investigate what we call ‘the standard argument’; we find it wanting and try to strengthen it, but we fail. So there is something amiss. In the conclusion we admit “we could well be wrong somewhere and the classical argument for existential risk from AI is actually sound, or there is another argument that we have not considered.”
I would say the challenge is to present a sound argument (valid + true premises) or at least a valid argument with decent inductive support for the premises. Oddly, we do not seem to have that.
Laying my cards on the table: I think there do exist valid arguments with plausible premises for x-risk from AI, and if you haven’t found them yet, you haven’t been looking hard enough or charitably enough. The stuff I was saying above is a suggestion for how you could proceed: if you can’t prove X, try to prove not-X for a bit; often you learn something that helps you prove X. So, I suggest you try to argue that there is no x-risk from AI (excluding the kinds you acknowledge, such as AI misused by humans) and see where that leads you. It sounds like you have the seeds of such an argument in your paper; I was trying to pull them together and flesh them out in the comment above.