Yes, that is a thing you can do with decision transforms too. I was referring to variant of the decision transformer (see link in original short form) where the AI samples the reward it’s aiming for.
Yes, that is a thing you can do with decision transforms too. I was referring to variant of the decision transformer (see link in original short form) where the AI samples the reward it’s aiming for.