Then my disagreement is with the claim that the human regime is very special, or that there is any reason to attach much significance to human-level intelligence.
In essence, I agree with a weaker version of Quintin Pope’s comment here:
“AGI” is not the point at which the nascent “core of general intelligence” within the model “wakes up”, becomes an “I”, and starts planning to advance its own agenda. AGI is just shorthand for when we apply a sufficiently flexible and regularized function approximator to a dataset that covers a sufficiently wide range of useful behavioral patterns.
There are no “values”, “wants”, “hostility”, etc. outside of those encoded in the structure of the training data (and to a FAR lesser extent, the model/optimizer inductive biases). You can’t deduce an AGI’s behaviors from first principles without reference to that training data. If you don’t want an AGI capable and inclined to escape, don’t train it on data[1] that gives it the capabilities and inclination to escape.
Putting it another way, I suspect you're suffering from the fallacy of generalizing from fiction, since fictional portrayals, à la the Terminator, make AI development look far more discontinuous and misaligned than what actually happens in reality.
Quintin Pope's comment:
https://forum.effectivealtruism.org/posts/zd5inbT4kYKivincm/?commentId=Zyz9j9vW8Ai5eZiFb
Link to the post on generalizing from fiction:
https://www.lesswrong.com/posts/rHBdcHGLJ7KvLJQPk/the-logical-fallacy-of-generalization-from-fictional