The “superintelligent tool” in the example you provided gave a blatantly incorrect answer by its own metric. If it counts suicide as a win, why did it say the disease would not be gotten rid of?
In the example, the “win” could be defined as an answer that is: a) technically correct, and b) relatively cheap among the technically correct answers.
This is (in my imagination) something that builders of the system could consider reasonable, if either they didn’t consider Friendliness or they believed that a “tool AI” which “only gives answers” is automatically safe.
The computer gives an answer which is technically correct (albeit a self-fulfilling prophecy) and cheap (in dollars spent on the cure). For the computer, this answer is a “win”—not because of the suicide, which is completely irrelevant to it, but because of the technical correctness and cheapness.