See also Evolution of Modularity. Using your quantilizer to choose training examples from which to learn appears to be a very simple, natural way of accomplishing modularly varying reward functions. (I left a comment there too.)
See also Evolution of Modularity. Using your quantilizer to choose training examples from which to learn appears to be a very simple, natural way of accomplishing modularly varying reward functions. (I left a comment there too.)