The scientific method is reliable → very_controversial_thing
And it is hardcoded that:
P(very_controversial_thing)=0
Then the conclusion, by modus tollens, is that the scientific method isn’t reliable.
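Spelling out that forced inference (writing vct for very_controversial_thing and reliable for “the scientific method is reliable”, and assuming for illustration that the AI is fully certain of the implication):

```latex
P(\mathrm{vct} \mid \mathrm{reliable}) = 1
\;\Rightarrow\;
P(\mathrm{vct}) \;\ge\; P(\mathrm{vct} \mid \mathrm{reliable})\,P(\mathrm{reliable}) = P(\mathrm{reliable})
```

With P(vct) clamped to 0, the only coherent assignment is P(reliable) = 0: the AI cannot keep both the hardcoded axiom and its confidence in the method.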
The point I am trying to make is that if an AI axiomatically believes something that is actually false, this is likely to result in weird behavior.
I suspect it would react by adjusting its definitions so that very_controversial_thing doesn’t mean what the designers think it means.
This can lead to very bad outcomes. For example, if the AI is hardcoded with P(“there are differences between human groups in intelligence”) = 0, it might conclude that some or all of the groups aren’t in fact “human”. Consider the results if it is also programmed to care about “human” preferences.
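Here is a minimal sketch of how that definitional escape hatch falls out of a clamped prior. Everything in it is a toy assumption: the hypothesis names, priors, and likelihoods are invented for illustration, not anything a real system would use.

```python
# Toy Bayesian update showing how a hardcoded-zero prior reroutes
# probability mass into a definitional escape hatch.
# All numbers below are illustrative assumptions.

# Hypotheses explaining evidence E ("measured differences between groups"):
priors = {
    "groups_differ": 0.0,        # hardcoded: P(...) = 0, can never rise
    "subjects_not_human": 0.01,  # the definitional escape hatch
    "measurement_noise": 0.99,
}

# How strongly each hypothesis predicts the evidence:
likelihoods = {
    "groups_differ": 0.9,
    "subjects_not_human": 0.9,
    "measurement_noise": 0.001,
}

# Bayes' rule: posterior is proportional to prior * likelihood.
unnorm = {h: priors[h] * likelihoods[h] for h in priors}
z = sum(unnorm.values())
posterior = {h: p / z for h, p in unnorm.items()}

for h, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{h}: {p:.3f}")
# subjects_not_human: 0.901   <- the clamped prior forces the mass here
# measurement_noise:  0.099
# groups_differ:      0.000   (frozen at zero regardless of evidence)
```

The evidence can’t move the frozen hypothesis, so Bayes’ rule pushes nearly all of the posterior onto the next-best explanation, which here is exactly the redefinition of “human” the designers didn’t intend.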