Slider comments on Torture and Dust Specks and Joy—Oh my! or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces

Slider 29 Aug 2019 18:30 UTC
1 point
Reinforcement learning with rewards or punishments that can have infinite magnitude would seem to make intuitive sense to me. The buck is then passed to the question of whether it is ever reasonable to give a sample a post-finite reward. Say that there are pictures labeled as either “woman”, “girl”, “boy” or “man”, and labeling a boy a man or a man a boy gets you a Small reward, while labeling a man a man gets you a Large reward, where Large is infinite with respect to Small. With a finite version, some “boy” vs “girl” weight could overcome a “man” vs “girl” weight, which might be undesirable behaviour (if you strictly care about gender discrimination with no tradeoff for age discrimination).
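
To make the Large vs Small idea concrete, here is a minimal sketch under my own assumptions: a reward is a pair (large, small) compared lexicographically, so no amount of Small ever adds up to one Large. The names `Reward`, `reward`, `total`, `finite_score` and the `GENDER` grouping are hypothetical and only for illustration; the finite weight of 100 is an arbitrary choice.

```python
from dataclasses import dataclass

# Assumed grouping of the four labels by gender (illustrative only).
GENDER = {"woman": "female", "girl": "female", "boy": "male", "man": "male"}


@dataclass(frozen=True, order=True)
class Reward:
    # order=True compares fields in declaration order, i.e. lexicographically:
    # any difference in `large` decides before `small` is even looked at.
    large: int = 0
    small: int = 0

    def __add__(self, other: "Reward") -> "Reward":
        return Reward(self.large + other.large, self.small + other.small)


def reward(true_label: str, predicted: str) -> Reward:
    """Exact label earns one Large; right gender, wrong age earns one Small."""
    if predicted == true_label:
        return Reward(large=1)
    if GENDER[predicted] == GENDER[true_label]:
        return Reward(small=1)
    return Reward()


def total(pairs) -> Reward:
    """Sum the rewards over (true_label, predicted) pairs."""
    result = Reward()
    for true_label, predicted in pairs:
        result = result + reward(true_label, predicted)
    return result


def finite_score(r: Reward, large_weight: float) -> float:
    """The finite version: Large is just a big number of Smalls."""
    return large_weight * r.large + r.small


one_exact = total([("man", "man")])          # a single Large
many_near = total([("boy", "man")] * 1000)   # a thousand Smalls

print(one_exact > many_near)  # lexicographic comparison
print(finite_score(one_exact, 100) > finite_score(many_near, 100))  # finite comparison
```

With the lexicographic pair the single exact label beats any number of near misses, while in the finite version enough near misses outweigh it, which is the tradeoff I would want to be able to rule out.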