I suspect we’re using SGD in different ways, because everything we’ve talked about seems like it could be implemented with SGD. Do you agree that letting the Predict-O-Matic predict the future and rewarding it for being right, RL-style, would lead to it finding fixed points? Because you can definitely use SGD to do RL (first google result).
Fair enough, I was thinking about supervised learning.
Fair enough, I was thinking about supervised learning.