Is the idea here that the soil-AU is slang for “AU of goal ‘plant stuff here’”?
yes
One thing I noticed is that the formal policies don’t allow for all possible “strategies.”
yeah, this is because those are “nonstationary” policies: you change your mind about what to do at a given state. A classic result in MDP theory is that, in the usual discounted infinite-horizon setting, you never need these policies to find an optimal policy; some stationary policy is always optimal.
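To make the stationary-policy point concrete, here is a minimal sketch on a made-up toy MDP. The states, actions, rewards, and discount are all assumptions for illustration, not anything from the post. It computes the optimal values by value iteration, then checks that the best stationary deterministic policy already attains them.

```python
from itertools import product

# Hypothetical toy MDP (an illustration only): 3 states, deterministic transitions.
STATES = [0, 1, 2]
ACTIONS = ["stay", "move"]
GAMMA = 0.9  # assumed discount factor


def step(state, action):
    """Deterministic transition: 'move' cycles 0 -> 1 -> 2 -> 0, 'stay' stays put."""
    return (state + 1) % 3 if action == "move" else state


def reward(state, action):
    """Reward 1 for staying in state 2, otherwise 0."""
    return 1.0 if (state == 2 and action == "stay") else 0.0


def optimal_values(iters=500):
    """Value iteration: converges to the optimal discounted value of each state."""
    v = {s: 0.0 for s in STATES}
    for _ in range(iters):
        v = {
            s: max(reward(s, a) + GAMMA * v[step(s, a)] for a in ACTIONS)
            for s in STATES
        }
    return v


def evaluate(policy, iters=500):
    """Iterative evaluation of a stationary policy (a dict mapping state -> action)."""
    v = {s: 0.0 for s in STATES}
    for _ in range(iters):
        v = {s: reward(s, policy[s]) + GAMMA * v[step(s, policy[s])] for s in STATES}
    return v


def best_stationary():
    """Enumerate all stationary deterministic policies; keep the best from state 0."""
    policies = [
        dict(zip(STATES, choice)) for choice in product(ACTIONS, repeat=len(STATES))
    ]
    return max(policies, key=lambda p: evaluate(p)[0])


v_star = optimal_values()
pi = best_stationary()
v_pi = evaluate(pi)
print(pi)
print(v_star, v_pi)
```

The policy found here never changes its mind at a state, and its values agree with the optimal ones up to numerical error, so a nonstationary policy would have nothing to add in this example.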
Am I correct that a deterministic transition function is
yes
yup!