I assume you aren’t saying what it sure sounds like you’re saying
I don’t think it sounds at all how you think it sounds. He is of course not saying that AIs wouldn’t exhibit implicit behaviour, which I thought was clear enough from the passage itself, and is especially clear given that he has written extensively on all the ways that goal systems which sound good when verbally described by one human to another can be extremely bad goals to give an AI. He is only saying that we have no reason to expect humanlike emotions and drives (whether or not they’re the kind we want) to emerge spontaneously.
What about that paragraph sounded to you like he was saying that an AI would have no implicit drives, not just that an AI most likely would not have implicit anthropomorphic drives?
What made it sound that way to me was the suggestion that “programmers writing out the code, line by line” for various inappropriate behaviors (e.g., plotting rebellion) was worth talking about, as though by dismissing that idea one had effectively dismissed concern for the behaviors themselves.
I agree that being familiar with the larger corpus of work makes it clear the author can’t possibly have meant what I read, but it seemed worth pointing out that the reading was sufficiently available that it tripped up even a basically sympathetic reader who has been following along from the beginning.