My hangup is that it seems like a truly benevolent AI would share our goals.
In the way that a “truly benevolent” human would leave an unpolluted lake for fish to live in, instead of using it for their own purposes. The fish might think that humans share their goals, but human goals would be infinitely more complex than fish could understand.
...It sounds like you’re hinting that humans are not benevolent towards fish. If we are, then we do share the fish’s goals when it comes to outcomes for the fish; we just have other goals as well, which do not conflict. (I’m assuming the fish actually has clear preferences.) And a well-designed AI should not even have additional goals. The lack of understanding would “only” come in with the means, or with our poor understanding of our own preferences.