Melanie contended that a truly intelligent machine would understand what we really mean when we give it incomplete instructions, or else it wouldn't deserve the mantle of "truly intelligent".
This sounds pretty reasonable in itself: a generally capable AI has a good chance of being able to distinguish between what we say and what we mean, within the scope of its post-training instructions. But I get the impression that she then implicitly takes it a step further, assuming the AI would necessarily also reflect on its core programming/trained model to check for and patch up similar mismatches there. An AI could conceivably work that way, but it's by no means guaranteed, just as a person may discover that they want something different from what their parents wanted them to want, and yet stick with their own desire rather than conforming to their parents' wishes.