Thanks for posting, but I think these arguments have major oversights; accounting for them leaves me more optimistic about the extent to which people will avoid and prevent the horrible misuse you describe.
First, this post seems to overstate the extent to which people tend to value and carry out extreme torture. Maximally cruel torture fortunately seems very rare.
The post asks, “How many people have you personally seen who insist on justifying some form of suffering for those they consider undesirable[?]” But “justifying some form of suffering” isn’t actually an example of justifying extreme torture.
The post asks, “What society hasn’t had some underclass it wanted to put down in the dirt just to lord power over them?” But that isn’t actually an example of people endorsing extreme torture.
The post asks, “How many ordinary, regular people throughout history have become the worst kind of sadist under the slightest excuse or social pressure to do so to their hated outgroup?” But has it really been as many as the post suggests? The historical and ongoing atrocities that come to mind were cases of serious suffering in the context of moderately strong social pressure/conditioning—not maximally cruel torture in the context of slight social pressure.
So history doesn’t actually give us strong reasons to expect maximally suffering-inducing torture at scale (edit: or at least, the arguments this post makes for that aren’t strong).
Second, this post seems to overlook a major force that often prevents torture (and which, I argue, will be increasingly able to succeed at doing so): many people disvalue torture and work collectively to prevent it.
Torture tends to be illegal and prosecuted. The trend here seems to be positive, with cruelty against children, animals, prisoners, and the mentally ill being increasingly stigmatized, criminalized, and prosecuted over the past few centuries.
We’re already seeing AI development become highly centralized, with the leading AI developers working to make their AI systems strike some balance of helpful and harmless, i.e. not just letting users carry out whatever misuse they want.
Today, the cruelest acts of torture seem to be small-scale acts carried out by not-very-powerful individuals, while (as mentioned above) powerful actors tend to disvalue and work to prevent torture. Most people will probably continue to support the prevention and prosecution of very cruel torture, both because that continues the long-run trend and because people want to ensure that they do not themselves end up as victims of horrible torture. In the future, people will be better equipped to enforce these prohibitions through improved monitoring technologies.
Third, this post seems to overlook arguments for why AI alignment may be worthwhile (or opposing it may be a bad idea), even if a world with aligned AI wouldn’t be worthwhile on its own. My understanding is that most people focused on preventing extreme suffering find such arguments compelling enough to avoid working against alignment, and sometimes even to work towards it.
Concern over s-risks will lose support and goodwill if adherents try to kill everyone, as the poster suggests they intend to do (“I will oppose any measure which makes the singularity more likely to be aligned with somebody’s values”). Then, if we do end up with aligned AI, it’ll be significantly less likely that powerful actors will work to stamp out extreme suffering.
The highest-leverage intervention for preventing suffering is arguably coordinating/trading with worlds where there is a lot of it, and humanity won’t be able to do that if we lose control of this world.
These oversights strike me as pretty reckless when the argument is for letting (or making) everyone die.
I first want to signal-boost Mauricio’s comment.
My experience reading the post was that I kinda nodded along without explicitly identifying and interrogating cruxes. I’m glad that Mauricio has pointed out the crux of “how likely is human civilization to value suffering/torture”. Other cruxes include “given some expectation of how much humans value suffering, how likely are we to get a world with lots of suffering if we get aligned AI”, “who is in control if we get aligned AI”, and “how good is the good that could come from aligned AI, and how likely is it”.
In effect, this post seems to argue: “because humans have a history of producing lots of suffering, getting an AI aligned to human intent would produce an immense amount of suffering, so much that rolling the dice is worse than extinction with certainty.”
It matters what the probabilities are, and it matters what the goods and bads are, but this post doesn’t seem to argue very convincingly that the extremely bad outcomes are all that likely (see Mauricio’s bullet points).
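To make that comparison concrete, here is a minimal expected-value sketch of the argument’s structure (the symbols are placeholders I’m introducing, not quantities from the post):

$$E[\text{aligned AI}] = p_{\text{good}}\,V_{\text{good}} + p_{\text{bad}}\,V_{\text{bad}}, \qquad E[\text{extinction}] = 0.$$

For the post’s conclusion to go through, we would need $E[\text{aligned AI}] < E[\text{extinction}]$, i.e.

$$p_{\text{bad}}\,\lvert V_{\text{bad}}\rvert > p_{\text{good}}\,V_{\text{good}}.$$

So the argument needs either a high probability of the extremely bad outcomes or an astronomically large ratio of $\lvert V_{\text{bad}}\rvert$ to $V_{\text{good}}$; the post leans heavily on the latter while, per the above, not really establishing the former.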
I’ll have to think about the things you say, particularly the part about support and goodwill. I am curious, though: what do you mean by trading with other worlds?
Ah sorry, I meant the ideas introduced in this post and this one (though I haven’t yet read either closely).