I guess the alignment problem is really "difference in power between agents is dangerous" rather than "AGI is dangerous".
Sketch of proof:
An agent is either optimizing for some utility function or not optimizing for any at all. The second case seems dangerous both for the agent itself and for the agents around it [proof needed].
A utility function can probably be represented as a vector in a basis such as "utility of other agents" × "power" × "time existing" × etc. More powerful agents move the world further along their utility vectors.
If the utility vectors of a powerful agent (an AGI, for example) and of humans differ, then at some level of power the effect of this difference (also a vector) becomes big enough that we consider the agent misaligned.
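To make the last two steps concrete, here is a minimal numerical sketch (my own toy model, with a made-up 3-dimensional basis and an arbitrary threshold, not anything standard): each agent pushes the world along its unit utility vector in proportion to its power, so the gap between outcomes is power × ||u_agi − u_human||, which crosses any fixed "misaligned" threshold once power is large enough.

```python
# Toy sketch of the argument above: utility functions as unit vectors in a
# made-up basis, and an agent with power p moves the world by p * u.
import numpy as np

# Hypothetical basis: ("utility of other agents", "power", "time existing")
u_human = np.array([0.8, 0.3, 0.52])
u_agi   = np.array([0.6, 0.6, 0.53])   # similar direction, but not identical
u_human /= np.linalg.norm(u_human)
u_agi   /= np.linalg.norm(u_agi)

# Arbitrary choice: how far apart outcomes must drift before we would
# call the agent "misaligned".
MISALIGNMENT_THRESHOLD = 5.0

for power in [1, 10, 100, 1000]:
    # Each agent pushes the world `power` units along its own utility vector.
    drift = np.linalg.norm(power * u_agi - power * u_human)
    status = "misaligned" if drift > MISALIGNMENT_THRESHOLD else "close enough"
    print(f"power={power:5d}  outcome gap={drift:8.2f}  -> {status}")

# The gap equals power * ||u_agi - u_human||, so for any fixed nonzero
# difference in utility vectors there is a power level beyond which the gap
# exceeds any fixed threshold: the danger scales with the power gap,
# not with "being an AGI" per se.
```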