For a RL agent, the “opioid addiction” thing could be as simple as increasing the portion of the loss proportional to weight norm. You’d expect that to cause the agent to lobotomize itself into only fulfilling the newly unlocked goal.
For a RL agent, the “opioid addiction” thing could be as simple as increasing the portion of the loss proportional to weight norm. You’d expect that to cause the agent to lobotomize itself into only fulfilling the newly unlocked goal.