I think you are completely missing the entire point of the AI alignment problem.
The problem is how to make the AI tell good from evil, not whether, upon recognizing good, the AI should print “good” to output, or smile, or clap its hands. Any of these reactions is equally okay, and can be improved later. The important part is that the AI does not print “good” / smile / clap its hands when it figures out a course of action which would, as a side effect, destroy humankind, or do something otherwise horrible (the problem is to define what “otherwise horrible” exactly means). Actually it is more complicated than this, but you are already missing the very basics.