Martin Vlach comments on Martin Vlach’s Shortform

Martin Vlach 5 Oct 2022 9:52 UTC
DeepMind researcher Hado mentions here that an RL reward can be defined to contain a risk component. That seems close to genius to me, and promising as a simple, generic policy for RL development. I would love to learn (and teach) more of the practical details!
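
A minimal sketch of one way a reward with a risk component could look, assuming a mean-variance penalty. This is only an illustration and not necessarily the formulation Hado has in mind; the function name `risk_adjusted_reward` and the parameter `risk_weight` are hypothetical.

```python
from statistics import mean, pvariance


def risk_adjusted_reward(reward_samples, risk_weight=0.5):
    """Mean reward minus a variance penalty (assumed mean-variance trade-off).

    reward_samples: observed rewards for one action or policy.
    risk_weight: hypothetical knob; 0 ignores risk, larger values punish spread.
    """
    return mean(reward_samples) - risk_weight * pvariance(reward_samples)


# Two options with the same average reward but different spread.
safe = [1.0, 1.0, 1.0, 1.0]
risky = [4.0, -2.0, 4.0, -2.0]

print(risk_adjusted_reward(safe))   # no spread, so no penalty
print(risk_adjusted_reward(risky))  # same mean, penalized for its variance
```

With `risk_weight=0` both options score the same; any positive weight makes the agent prefer the steadier one.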