Yep, corrigibility is unsolved! So we should try to solve it.
I wrote the following in 2012:
“The idea of trying to control or manipulate an entity which is much smarter than a human does not seem ethical, feasible, or wise. What we might try to aim for is a respectful interaction.”
I still think that this kind of more symmetric formulation is the best we can hope for, unless the AI we are dealing with is not “an entity with sentience and rights” but only a “smart instrument”. Even the LLM-produced simulations, in the sense of Janus’ Simulator theory, already seem to me to be much more than “merely smart instruments” in this sense. So if “smart superintelligent instruments” are possible at all, we are not moving in the right direction to obtain them; that would require a different architecture and different training methods or, perhaps, non-training synthesis methods (and this would be something difficult to talk about out loud, because that’s very powerful too).