Note that you could provide exact “negative” gradients by focusing on what you should have done instead. Whether this transduces into an internal positive reward event (an “exact gradient”) is unclear to me, though. It seems like that still “feels bad” in ways similar to unconcentrated negative reward events.