I believe I address most of that in the post. The “evolution selected for it” explanation doesn’t actually explain much about the mechanism by which power is correlated with corruption.
I suppose one additional “mechanism” is that if someone is immoral to begin with, acquiring power simply allows them to act on previously suppressed urges to hurt people.
But what if a person is selected not on the basis of power-hunger but on the basis of intellectual curiosity and commitment to abstract altruistic ideals, and then only has to “exercise” that power once, by writing some lines of code, while supervised by other people selected on the same basis of curiosity/abstract altruism? It seems plausible that this won’t involve any corrupting incentives.
I guess we could end up with several plausible targets for the AI, and then there’d be a disagreement over them, and the most vicious guy would win. E.g., the chosen target might be DWIM, which would make the paradigm shift slow enough to give the vicious guy time to feel threatened by the opposition, which would lead him to abuse the power and therefore be corrupted by it...
But, uh, that can happen with any organization tasked with deploying the AGI? Except that if it’s done “through the proper channels”, then on top of this problem, we also have the problem that the people nominally in charge of the AGI deployment were subjected to all the corrupting incentives.
There’s a kind of equivalence between being immoral to begin with but suppressing it all one’s life, and being moral to begin with but corrupted upon gaining power. To resolve ontological crises, I recommend https://arbital.com/p/rescue_utility. As far as I’m concerned, the instinct to identify with the past and future inhabitants of one’s body has the game-theoretic purpose of establishing mutual cooperation. This suggests identifying with whichever aspects of oneself would mutually cooperate.
Sure, one can avert corruption by never triggering the ancient condition of feeling in power. That’s perhaps a core purpose of democracy.
Got that non-genre-savvy-supervillain feeling? Check whether it was a case of “Honest Inside View vs. Unfortunate Deontological Injunction” or of “Ancient politics adaptation vs. Deontological Injunction working exactly as intended”.