A perfectly rational agent would scarcely ever want to lose the ability to like something, since that would always lower their utility.
What is a perfectly rational self-modifying agent? I don’t think anyone has an answer to that, although surely it is something that MIRI studies. The same argument that proves it is never rational to cease liking something also proves that it must always be rational to acquire a liking for anything. You end up with wireheading.