I feel like I once saw RL agents trained with and without energy costs, where the agents trained with energy costs acted a lot less jittery. But I can’t remember where I saw it.
I feel like I once saw RL agents trained with and without energy costs, where the agents trained with energy costs acted a lot less jittery. But I can’t remember where I saw it.