Actually, another example of a pivotal act is “invent a method of mind uploading, upload some alignment researchers, and run them at 1000x speed until they solve the full alignment problem”. I’m sure that if you think hard enough, you can find other, even less dangerous pivotal acts, but you probably shouldn’t talk about them out loud.
Right, but how do you restrict them from “figuring out how to know themselves and figuring out how to self-improve themselves to become gods”? And I remember talking to Eliezer in … 2011 at his AGI-2011 poster and telling him, “but we can’t control a teenager, so why wouldn’t an AI rebel against your ‘provably safe’ technique, like a human teenager would”, and he answered, “that’s why it should not be human-like; a human-like one can’t be provably safe”.
Yes, I am always unsure what we can or can’t talk about out loud. Nothing effective seems to be safe to talk about; “effective” seems to always imply “powerful”. This is, of course, one of the key conundrums: how do we organize real discussions about these things?
Yep, corrigibility is unsolved! So we should try to solve it.
I wrote the following in 2012:
“The idea of trying to control or manipulate an entity which is much smarter than a human does not seem ethical, feasible, or wise. What we might try to aim for is a respectful interaction.”
I still think that this kind of more symmetric formulation is the best we can hope for, unless the AI we are dealing with is not “an entity with sentience and rights” but only a “smart instrument”. Even the LLM-produced simulations, in the sense of Janus’ Simulator theory, already seem to me to be much more than “merely smart instruments” in this sense. So if “smart superintelligent instruments” are possible at all, we are not moving in the right direction to obtain them; a different architecture and different training methods (or, perhaps, non-training synthesis methods) would be necessary for that, and those would also be difficult to talk about out loud, because they are very powerful too.