You might also be interested in the question of whether the costs of the agent's own thinking can be included in an RL environment. Suppose I have a finite electricity budget and thinking harder uses electricity. As a human I seem flexible enough to adjust my thinking style to some degree in response to a constraint like this, but what happens to typical RL agents if they're given negative reward for running out of electricity?
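To make the question concrete, here is a minimal sketch of what such an environment might look like. Everything in it (the class, the THINK/ACT actions, the cost and penalty values) is hypothetical, just one way to charge an agent for its own computation:

```python
# Toy environment where thinking itself costs energy. All names here
# (EnergyBudgetEnv, THINK, ACT) are illustrative, not from any library.

class EnergyBudgetEnv:
    """Two actions: THINK refines an internal answer at an energy cost;
    ACT commits to the answer and ends the episode. Running out of
    energy mid-thought ends the episode with a penalty."""

    THINK, ACT = 0, 1

    def __init__(self, budget=10.0, think_cost=1.0, out_of_energy_penalty=-10.0):
        self.budget = budget
        self.think_cost = think_cost
        self.penalty = out_of_energy_penalty

    def reset(self):
        self.energy = self.budget
        self.quality = 0.0  # answer quality improves with each THINK step
        return (self.energy, self.quality)

    def step(self, action):
        # Returns (observation, reward, done, info), gym-style.
        if action == self.THINK:
            self.energy -= self.think_cost   # thinking consumes electricity...
            self.quality += 1.0              # ...but improves the eventual answer
            if self.energy <= 0:             # ran out of electricity mid-thought
                return (self.energy, self.quality), self.penalty, True, {}
            return (self.energy, self.quality), 0.0, False, {}
        else:  # ACT: reward is whatever quality was accumulated so far
            return (self.energy, self.quality), self.quality, True, {}
```

By construction, every THINK step defers reward and risks the terminal penalty, so any agent that learns a decent policy here is implicitly pricing its own computation. Whether it adjusts its "thinking style" as flexibly as a human would, rather than just learning a fixed stopping point, is exactly the open part of the question.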