Right, but how do you restrict them from “figuring out how to know themselves and figuring out how to self-improve themselves to become gods”? And I remember talking to Eliezer in … 2011 at his AGI-2011 poster and telling him, “but we can’t control a teenager, so why would an AI not rebel against your ‘provably safe’ technique, like a human teenager would?”, and he answered, “that’s why it should not be human-like; a human-like one can’t be provably safe”.
Yes, I am always unsure what we can or can’t talk about out loud (nothing effective seems to be safe to talk about; “effective” seems to always imply “powerful”. This is, of course, one of the key conundrums: how do we organize real discussions about these things?)...
Yep, corrigibility is unsolved! So we should try to solve it.
I wrote the following in 2012:
“The idea of trying to control or manipulate an entity which is much smarter than a human does not seem ethical, feasible, or wise. What we might try to aim for is a respectful interaction.”
I still think that this kind of more symmetric formulation is the best we can hope for, unless the AI we are dealing with is not “an entity with sentience and rights” but only a “smart instrument”. (Even the LLM-produced simulations, in the sense of Janus’ Simulator theory, already seem to me to be much more than “merely smart instruments” in this sense; so if “smart superintelligent instruments” are possible at all, we are not moving in the right direction to obtain them. A different architecture and different training methods, or perhaps non-training synthesis methods, would be necessary for that, and that would also be something difficult to talk about out loud, because it is very powerful too.)