In the example, a “win” could be defined as an answer that is (a) technically correct and (b) relatively cheap among the technically correct answers.
This is (in my imagination) something that builders of the system could consider reasonable, if either they didn’t consider Friendliness or they believed that a “tool AI” which “only gives answers” is automatically safe.
The computer gives an answer which is technically correct (albeit a self-fulfilling prophecy) and cheap (in dollars spent on the cure). For the computer, this answer is a “win”. That is not because of the suicide, which is completely irrelevant to it, but because of the technical correctness and the cheapness.
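A toy sketch of that definition of “win” may make the point more concrete. Everything here is hypothetical: the `Answer` fields, the `pick_win` function, and the candidate answers with their dollar amounts are invented for illustration and are not taken from any real system. The only thing it is meant to show is that harm to the patient can be present in the data and still play no role in the choice.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Answer:
    text: str
    technically_correct: bool
    cost_dollars: float
    harms_patient: bool  # recorded, but never read by the objective below


def pick_win(candidates: Iterable[Answer]) -> Optional[Answer]:
    """Return the cheapest technically correct answer, or None if there is none."""
    correct = [a for a in candidates if a.technically_correct]
    # The objective looks only at correctness and cost; side effects are invisible to it.
    return min(correct, key=lambda a: a.cost_dollars, default=None)


# Made-up candidates; the costs are placeholders, not figures from the example.
candidates = [
    Answer("expensive treatment", True, 50_000.0, False),
    Answer("self-fulfilling prediction", True, 0.0, True),
    Answer("wrong guess", False, 10.0, False),
]

win = pick_win(candidates)
print(win.text)
```

Under these assumptions the self-fulfilling answer is selected simply because it is correct and costs the least. Adding or removing the `harms_patient` field would not change the result.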