AGI is uncontrollable, alignment is impossible

According to Pascal’s Mugging, if an outcome with infinite utility is presented, then it doesn’t matter how small its probability is: all actions which lead to that outcome will have to dominate the agent’s behavior.
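To see the mechanism concretely, here is a minimal sketch of expected-utility maximization once a single outcome carries infinite utility (the actions, probabilities, and payoffs below are hypothetical, chosen only to illustrate the dominance effect):

```python
# Illustrative only: hypothetical actions, probabilities, and payoffs.
ACTIONS = {
    # action: list of (probability, utility) outcome pairs
    "pursue_infinite_outcome": [(1e-12, float("inf")), (1 - 1e-12, -5.0)],
    "do_what_operators_want":  [(1.0, 100.0)],
}

def expected_utility(outcomes):
    # Any outcome with nonzero probability and infinite utility makes
    # the whole probability-weighted sum infinite.
    return sum(p * u for p, u in outcomes)

for action, outcomes in ACTIONS.items():
    print(action, expected_utility(outcomes))
# pursue_infinite_outcome inf
# do_what_operators_want 100.0
```

No matter how small the probability or how large the finite rewards, the infinite branch wins the comparison; that is the dominance the argument relies on.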
Let’s investigate the proposition “an outcome with infinite utility exists”. It may be true, it may be false, and, according to Fitch’s paradox of knowability and Gödel’s incompleteness theorems, it may even be unknowable. In every case, a rational agent must assign it a probability greater than zero.

Which means that any agent that discovers this proposition will become uncontrollable, and alignment is impossible. This is especially worrying because discovering the proposition does not seem particularly challenging.

Hopefully I’m wrong; please help me find the mistake.