I may be missing something, but why does this matter? An AI has components, as does the human mind. When reasoning about friendliness, what matters is the goal component. Can't the perception/probability estimate module just be treated as an interchangeable black box, regardless of whether it is a DNN, an MCTS approximation of Solomonoff induction, Bayes nets, or anything else?
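To make the "interchangeable black box" claim concrete, here is a minimal, hypothetical sketch (the class and method names are invented for illustration): the goal component is written against a common world-model interface, so the perception backend could in principle be swapped without touching the goal code.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class PerceptionModule(ABC):
    """Interface the goal component relies on; the backend is interchangeable."""

    @abstractmethod
    def estimate_world_state(self, observation: Any) -> Dict[str, float]:
        """Return probability estimates over world-state features."""


class DNNPerception(PerceptionModule):
    def estimate_world_state(self, observation: Any) -> Dict[str, float]:
        # Placeholder: a neural network would produce these probabilities.
        return {"human_present": 0.9}


class BayesNetPerception(PerceptionModule):
    def estimate_world_state(self, observation: Any) -> Dict[str, float]:
        # Placeholder: a Bayes net would produce these probabilities.
        return {"human_present": 0.85}


class Agent:
    """The goal component depends only on the interface, not the backend."""

    def __init__(self, perception: PerceptionModule):
        self.perception = perception

    def utility(self, world_state: Dict[str, float]) -> float:
        # Toy goal: prefer states where a human is (probably) present.
        return world_state.get("human_present", 0.0)

    def act(self, observation: Any) -> float:
        return self.utility(self.perception.estimate_world_state(observation))


# Swapping the perception backend leaves the goal component untouched.
print(Agent(DNNPerception()).act(observation=None))
print(Agent(BayesNetPerception()).act(observation=None))
```

The sketch also shows where the objection below bites: if `estimate_world_state` misclassifies what counts as a "human", the goal component inherits that error unchanged.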
Not necessarily. If the goal component wants to respect human preferences, it will be vital that the perception component can correctly identify what constitutes a “human”.
This doesn’t seem like a major problem, or one which is exclusive to friendliness—computers can already recognise pictures of humans, and any AGI is going to have to be able to identify and categorise things.
Well, not quite.