The main problem here is that this approach doesn’t solve alignment, but merely shifts it to another system. We know that human organizational systems also suffer from misalignment—they are intrinsically misaligned. Here are several types of human organizational misalignment:
- Dictatorship: exhibits non-corrigibility, with power becoming a convergent goal
- Goodharting: manifests the same way as in AI systems
- Corruption: acts as internal wireheading
- Absurd projects (pyramids, genocide): parallel AI's paperclip maximization
- Hansonian organizational rot: mirrors error accumulation in AI systems
- Aggression: parallels an AI's drive to dominate the world
All previous attempts to create a government without these issues have failed (Musk’s DOGE will likely be another such attempt).
Furthermore, this approach doesn’t prevent others from creating self-improving paperclippers.
The most important thing here is that with AI we can at least achieve an outcome equal to the one we would get without AI, and as far as I know, no other proposed system has that property.
The famous "list of lethalities" (https://www.lesswrong.com/posts/uMQ3cqWDPHhjtiesc/agi-ruin-a-list-of-lethalities) post would count that as a strong success.
I once wrote about the idea that we need to scan just one good person and make them a virtual king. That idea is a special case of your idea, in which several uploads form a good government.
I also spent the last year perfecting a model of my mind (a sideload) to be run by an LLM. I am likely now the closest person on Earth to being uploaded.
That's true; however, I don't think it's necessary that the person be good.
If there is one king-person, he needs to be good. If there are many, the organizational system needs to be good, like a virtual US Constitution.
Yes. But this is a very unusual arrangement.
If we have one good person, we could use copies of them many times in many roles, including high-speed assessment of the safety of AI outputs (a rough sketch of such an assessment loop is shown below).
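For concreteness, here is a minimal Python sketch of that assessment loop, under stated assumptions: `query_sideload` is a hypothetical stand-in for whatever interface actually runs the uploaded person's model (in practice, presumably a prompt to an LLM loaded with the sideload), and the copy count and vote threshold are arbitrary.

```python
# Sketch: fan each AI output out to N parallel copies of an uploaded
# evaluator and approve it only if enough copies agree it is safe.
# All names here are hypothetical illustrations, not a real API.
from concurrent.futures import ThreadPoolExecutor

N_COPIES = 5            # number of parallel copies of the uploaded evaluator
APPROVAL_THRESHOLD = 4  # how many copies must approve an output

def query_sideload(copy_id: int, ai_output: str) -> bool:
    """Placeholder for one upload copy judging one AI output.

    A real system would prompt an LLM running the person's mind-model
    and parse its verdict; here we just flag an obvious marker string.
    """
    return "unsafe" not in ai_output.lower()

def assess_output(ai_output: str) -> bool:
    """Send the same output to N_COPIES evaluators and take a vote."""
    with ThreadPoolExecutor(max_workers=N_COPIES) as pool:
        verdicts = list(pool.map(
            lambda i: query_sideload(i, ai_output), range(N_COPIES)))
    return sum(verdicts) >= APPROVAL_THRESHOLD

if __name__ == "__main__":
    print(assess_output("Here is a benign plan."))          # True
    print(assess_output("Here is an UNSAFE instruction."))  # False
```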
Current LLMs, by the way, have a good model of Gwern's mind (without any of his personal details).