What’s wrong with calling the “short-term utility function” a “reward function”?
“Reward function” is a much more general term, which IMO has been overused to the point where it arguably doesn’t even have a clear meaning. “Utility function” is less general: it always connotes an optimization objective, something which is being optimized for directly. And that basically matches the usage here.