See the clarifying note in the OP. I don’t think this is about imitating humans, per se.
Yes, I realized that after I wrote my original comment, so I added the “ETA” part.
I think this intuition has some validity, but it might also lead to a false sense of confidence that such systems are safe, when in fact they may end up behaving as if they do seek to influence the world, depending on the task they are trained on (ETA: and on other details of the learning algorithm, e.g. outer-loop optimization and model choice).
I think this makes sense, and at least some people have also realized this and reacted appropriately within their agendas (see the “ETA” part of my earlier comment). It also seems good that you’re calling it out as a general issue. I’d still suggest giving some examples of AI alignment proposals where people haven’t realized this, to help illustrate your point.