What you’re describing is a case where we solve the technical problem of AI Alignment, i.e. the problem of AI control, but fail to maneuver the world into the sociopolitical state in which that control is used for eudaimonic ends.
Which, I agree, is a massive problem, and one that’s crucially overlooked. Even the few people who are advocating for social and political actions now mostly focus on convincing AI labs/politicians/the public about the omnicide risks of AI and the need to slow down research. Not on ensuring that the AGI deployment, when it eventually does happen, is done right.
It’s also a major problem with pivotal-act-based scenarios. Say we use some limited strawberry-aligned AI to “end the acute risk period”, then have humanity engage in “long reflection”, figure out its real values, and eventually lock them in. Except: what’s the recognition function for these “real values”? If the strawberry-aligned AI can’t be used to implement a utopia directly, then it can’t tell a utopia from hell, so it won’t stop us from building a hell!
There’s an argument that solving (the technical problem of) alignment will give us all the techniques needed to build an AGI, so there’s a nontrivial chance that a research group from this community will be in charge of AGI deployment. And if so, it’s at least plausible that they’ll be the right kind of altruistic. In this scenario, the sociopolitical part is taken care of by default, and AGI deployment goes well.
But I don’t think many people consider this scenario most likely, and as far as I can tell, most other scenarios have the potential to go very badly on that last step.
What scenario do you see where the world is in a sociopolitical state in which the powers that be who have influence over the development of AI have any intention of using that influence for eudaimonic ends, and for everyone, not just a select few?
Because right now very few people even want this from their leaders. I’m making this argument on LessWrong because people here are the least likely to be hateful or apathetic or whatever else, but there is not really a wider political motivation in the direction of universal anti-suffering.
Humans have never gotten this right before, and I don’t expect them to get it right the one time it really matters.
All such realistic scenarios, in my view, rely on managing who has influence over the development of AI. It certainly must not be a government, for example. (At least not in the sense that the officials at the highest levels of government actually understand what’s happening. I guess it can be a government-backed research group, but, well, without micromanagement — and given what we’re talking about, the only scenario where the government doesn’t do micromanagement is if it doesn’t really understand the implications.) Neither should it be some particularly “transparent” actor that’s catering to the public’s whims, or an inherently for-profit organization, etc.
… Spreading the knowledge of AI Risk really is not a good idea, is it? Its wackiness is working in our favour: it avoids exposing the people working on it to poisonous incentives, or to authorities already terminally poisoned by such incentives.
Have you asked the trillions of farm animals that are as smart as 2-4-year-olds and feel the same emotions they do whether we are monsters? So let’s say you took a group of humans and made them so smart that the difference in intelligence between us and them is greater than the distance in intelligence between us and a pig. They can build an AI that doesn’t have what they perceive as negative pitfalls, like consuming other beings for energy, mortality, and, most importantly, thinking the universe revolves around them, all justified in a much more eloquent way, with better reasons. Why are these human philosophies wrong and yours correct?
I mean, I sure would ask a bunch of sapient farm animals what utopia they want me to build, if I became god and they turned out to be sapient after all? As would a lot of people from this community. You seem to think, from your other comments, that beings of greater intelligence never care about beings of lesser intelligence, but that’s factually incorrect.
Why are these human philosophies wrong and yours correct?
A paperclip-maximizer is not “incorrect”, it’s just not aligned with my values. These philosophers, likewise, would not be “incorrect”, just not aligned with my values. And that’s the outcome we want to prevent, here.
So pigs are roughly as smart as four-year-olds, yet humans are generally cool with torturing and killing them in the billions for the temporary pleasure of taste. Humans are essentially biological computers. I don’t see how you can make a smarter robot that can improve itself indefinitely serve a dumber human forever, and doing so also gives it a clear motive to kill you. I also don’t see how alignment could possibly be moral.