What do you mean by ‘intentionality’? Per SEP, “In philosophy, intentionality is the power of minds and mental states to be about, to represent, or to stand for, things, properties and states of affairs.” So I read your comment as saying, a la Searle, ‘maybe AI can never think like a human because there’s something mysterious and crucial about carbon atoms in particular, or about capital-b Biology, for doing reasoning.’
This seems transparently silly to me—I know of no reasonable argument for thinking carbon differs from silicon on this dimension—and also not relevant to AGI risk. You can protest “but AlphaGo doesn’t really understand Go!” until the cows come home, and it will still beat you at Go. You can protest “but you don’t really understand killer nanobots!” until the cows come home, and superintelligent Unfriendly AI will still build the nanobots and kill you with them.
By the same reasoning, Searle-style arguments aren’t grounds for pessimism either. If Friendly AI lacks true intentionality or true consciousness or whatever, it can still do all the same mechanistic operations, and therefore still produce the same desirable good outcomes as if it had human-style intentionality or whatever.
So I read your comment as saying, a la Searle, ‘maybe AI can never think like a human because there’s something mysterious and crucial about carbon atoms in particular, or about capital-b Biology, for doing reasoning.’
That’s not the argument. Give me a few days to write a response. There’s a minefield of possible misinterpretations here.
whatever, it can still do all the same mechanistic operations, and therefore still produce the same desirable good outcomes as if it had human-style intentionality or whatever.
However, the argumentation does undermine the idea that designing for mechanistic (alignment) operations is going to work. I’ll try to explain why.
BTW, with ‘intentionality’, I meant something closer to everyday notions of ‘intentions one has’. I will define that meaning more precisely later.
I should have checked for diverging definitions from formal fields. Thanks for catching that.
If you happen to have time, this paper serves as useful background reading: https://royalsocietypublishing.org/doi/full/10.1098/rsif.2012.0869
Particularly note the shift from trivial self-replication (e.g. most computer viruses) to non-trivial self-replication (e.g. through substrate-environment pathways to reproduction).
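To make the ‘trivial’ end of that distinction concrete, here is a minimal Python sketch (my own illustration, not an example from the linked paper): a replicator whose entire reproduction step is copying its own complete, self-contained description verbatim, much like a computer virus copying its code into a new host. The non-trivial case gestured at above is precisely what this does not capture, since there reproduction runs through substrate-environment pathways rather than a verbatim copy.

```python
# Minimal sketch of *trivial* self-replication (illustrative only):
# the replicator carries its own complete description and copies it verbatim.
# Nothing about the surrounding environment shapes what the copy looks like.
import shutil
import sys

def replicate(destination: str) -> None:
    # __file__ is this script's own source; copying it is the entire "reproduction" step.
    shutil.copyfile(__file__, destination)

if __name__ == "__main__":
    # Hypothetical usage: python replicator.py replicator_copy.py
    replicate(sys.argv[1] if len(sys.argv) > 1 else "replicator_copy.py")
```

The point of the sketch is only the contrast: the interesting (non-trivial) cases are exactly those where reproduction is not a verbatim copy of a self-contained description.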
None of this is sufficient for you to guess what the argumentation is (you might be able to capture a bit of it, along with a lot of incorrect and often implicit assumptions we must dig into).
If you could call on some patience and openness to new ideas, I would really appreciate it! I am already bracing for the next misinterpretation (which is fine, as long as we can talk about it). I apologise that I cannot yet find a viable way to lay out all the argumentation in one go, and that this will get a bit disorientating as we go through the arguments step by step.
Returning to this:
Key idea: Different basis of existence → different drives → different intentions → different outcomes.
@Rob, I wrote up a longer explanation here, which I prefer to discuss with you in private first. Will email you a copy in the next weeks.