I tried arguing basically the same thing.
The most coherent reply I got was that an AI doesn’t follow verbal instructions and we can’t just order the AI to “make humans happy”, or even “make humans happy, in the way that I mean”. You can only tell the AI to make humans happy by writing a program that makes it do so. It doesn’t matter whether the AI grasps what you really want it to do: if there is a mismatch between the program and what you really want, it follows the program.
Obviously I don’t buy this. For one thing, you can always program it to obey verbal instructions, or you can talk to it and ask it how it will make people happy.
Jiro: Did you read my post? I discuss whether getting an AI to ‘obey verbal instructions’ is a trivial task in the first named section. I also link to section 2 of Yudkowsky’s reply to Holden, which addresses the question of whether ‘talk to it and ask it how it will make people happy’ is generally a safe way to interact with an Unfriendly Oracle.
I also specifically quote an argument you made in section 2 that I think reflects a common mistake in this whole family of misunderstandings of the problem — the conflation of the seed AI with the artificial superintelligence it produces. Do you agree this distinction helps clarify why the problem is one of coding the right values, and not of coding the right factual knowledge or intelligence-relevant capacities?