We then loop over each action a and take the action with the highest expected answer.
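
As a minimal sketch of that loop, assume the predictor is exposed as a function that maps an action to a number. The names `select_action` and `expected_answer` are hypothetical and only serve the illustration; nothing here is tied to a particular system.

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")


def select_action(actions: Sequence[A], expected_answer: Callable[[A], float]) -> A:
    """Return the action whose expected answer is highest.

    `expected_answer` is a hypothetical stand-in for whatever model
    produces the expectation; ties go to the earliest action.
    """
    if not actions:
        raise ValueError("need at least one action to choose from")

    best_action = actions[0]
    best_value = expected_answer(best_action)

    # Loop over the remaining actions and keep the best one seen so far.
    for a in actions[1:]:
        value = expected_answer(a)
        if value > best_value:
            best_action, best_value = a, value

    return best_action
```

This is just an argmax: for a non-empty list it returns the same action as `max(actions, key=expected_answer)`. The explicit loop only makes the selection step visible.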
Wasn’t the whole point that we want to avoid such goal-direction?