Even if the narrator were as close to a rational agent as he could be while still being human (his beliefs were the best that could be formed given his available evidence and computing power, and his actions were the ones that best increased his expected utility), he'd still have human characteristics in addition to ideal-rational-agent characteristics. His terminal values would cause emotions in him, in addition to just steering his actions, and those emotions themselves carry terminal value for him. Having an unmet terminal desire would be frustrating, and he doesn't like frustration (apart from not liking its cause). Basically, he disvalues being in the state of having an unmet terminal value, separately from the object-level disvalue of that value going unmet.
It can be rational for an agent to change their own terminal values in several situations. One is if they have terminal values about their own terminal values. A change of terminal values can also have instrumental value. For example, suppose Omega says: "You have two options. I wipe out humanity, guaranteeing that humane values shall never control the universe, xor you choose to edit every extant copy of the human brain and genome to make 'good reputation' no longer a terminal value. Your predicted response to this ultimatum did not influence my decision to make it." Or you are all but certain that a particular terminal value can never be influenced one way or the other, and you are starved for computing power or storage. Or you might modify your values to include "And if X doesn't do Y, then I want to minimize X's utility function" as part of a commitment used for blackmail.
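To make the instrumental case concrete, here is a minimal sketch of the Omega ultimatum as a plain expected-utility comparison. The specific utility numbers and the `choose` helper are illustrative assumptions rather than anything drawn from the scenario itself; the only point is that, judged by the agent's *current* values, editing out the 'good reputation' value is the lesser loss, so the self-modification is instrumentally rational.

```python
# A minimal sketch of the Omega ultimatum as an expected-utility comparison.
# All utility numbers below are illustrative assumptions, not part of the scenario.

# Hypothetical utilities, evaluated by the agent's *current* utility function.
U_HUMANITY_WIPED_OUT = -1_000_000.0   # humane values never control the universe
U_LOSE_REPUTATION_VALUE = -100.0      # everyone edited to drop 'good reputation'

def choose(options: dict[str, float]) -> str:
    """Pick the option with the highest expected utility."""
    return max(options, key=options.get)

# Omega removes the status quo from the menu; only two outcomes remain.
ultimatum = {
    "let Omega wipe out humanity": U_HUMANITY_WIPED_OUT,
    "edit out the 'good reputation' terminal value": U_LOSE_REPUTATION_VALUE,
}

# By the agent's current values, the value edit is the lesser loss,
# so choosing to change a terminal value maximizes expected utility here.
print(choose(ultimatum))
# -> edit out the 'good reputation' terminal value
```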