Okay, I find it weird that your proposed attack vector is “humans find some voices more persuasive than other voices” and not “AI can copy the voice of anyone you trust”. As far as I know, you can imitate almost anyone with just a few minutes of audio.
I don’t believe in a clear distinction between interactions that are attacks and interactions that are not attacks. When a politician asks people to vote for him so that he can have power, he’s not engaging in “attacks”, but if he wins he still gets power. Some of the things a politician says are simply more manipulative than others.
I think AGIs are more likely to be able to justify to themselves power-seeking behavior that amounts to simply being very persuasive than to justify misleading people by faking voices, so I’m more worried about them wielding power in ways that are easier to rationalize.