Rationality in general doesn’t mandate any particular utility function, correct. However, it does have various consequences for instrumental goals and for coherence between actions and utilities.
I don’t think it would be particularly rational for the AGI to conclude that if it is shut down it goes to pacman heaven or hell. It seems more rational to expect either that it will be started up again or that it won’t, and that either way it won’t experience anything while turned off. I am assuming that the AGI actually has evidence that it is an AGI, along with moderately accurate models of the external world.
I also wouldn’t phrase it in terms of “it finds that he is free to believe anything”. It seems quite likely that it will have some prior beliefs, whether weak or strong, via side effects of the RL process if nothing else. A rational AGI will then be able to update those beliefs based on evidence and on the expected consequences of its models.
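As a toy illustration of that kind of update (all numbers hypothetical; this is just ordinary Bayesian conditioning, not a claim about how an actual AGI would represent beliefs):

```python
# Hypothetical numbers throughout: a weak prior the RL process happened to
# instil about "if I'm shut down, I get started up again", updated on one
# piece of evidence (say, records of previous restarts).
prior = 0.3                 # P(restarted | shut down) before seeing evidence
p_evidence_if_true = 0.9    # P(seeing restart records | it does get restarted)
p_evidence_if_false = 0.1   # P(seeing restart records | it does not)

posterior = (p_evidence_if_true * prior) / (
    p_evidence_if_true * prior + p_evidence_if_false * (1 - prior)
)
print(f"Posterior after seeing the evidence: {posterior:.2f}")  # ~0.79
```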
Note that its beliefs don’t have to correspond to RL update strengths! It is quite possible that a pacman-playing AGI could strongly believe that it should run into ghosts, but lack some mental attribute that would allow it to do so (maybe analogous to human “courage” or “strength of will”, though it might have very different properties in its self-model and in practice). It all depends upon what path through parameter space the AGI followed to get where it is.
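One concrete way that split can happen (a sketch of my own, not a claim about any particular architecture): in an actor-critic style setup the value estimates and the behavioural policy live in separate parameters, so the path taken through parameter space can leave them disagreeing.

```python
import numpy as np

actions = ["avoid_ghost", "run_into_ghost"]

# Critic: learned action-value estimates, read here as the agent's
# "beliefs" about what it should do. Hypothetical numbers in which
# running into the ghost looks better.
q_values = np.array([0.2, 1.5])

# Actor: separately parameterised policy logits. Suppose the training
# path left the behavioural tendency still strongly avoiding ghosts.
policy_logits = np.array([3.0, -2.0])
policy = np.exp(policy_logits) / np.exp(policy_logits).sum()

believed_best = actions[int(np.argmax(q_values))]
most_likely_behaviour = actions[int(np.argmax(policy))]

print(f"Believed-best action (critic):  {believed_best}")          # run_into_ghost
print(f"Most likely behaviour (actor):  {most_likely_behaviour}")  # avoid_ghost
print(f"P(run_into_ghost) under the policy: {policy[1]:.3f}")      # ~0.007
```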