This doesn’t strike directly at the sampling question, but it is related to several of your ideas about incorporating the differentiable function: Neural Ordinary Differential Equations.
This is being exploited most heavily in the Julia community. The broader pitch is that they have formalized the relationship between differential equations and neural networks. This allows things like (a minimal code sketch follows the list):
applying differential equation tricks to computing the outputs of neural networks
using neural networks to solve pieces of differential equations
using differential equations to specify the weighting of information
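To make the first item concrete, here is a minimal sketch of the core Neural ODE idea. I'm using Python with PyTorch and the torchdiffeq package purely for illustration (not the Julia libraries), and all the names here are my own: a network defines the time derivative of a hidden state, and the forward pass is just an ODE solve.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

class Dynamics(nn.Module):
    """The network does not map input to output directly; it outputs dh/dt."""
    def __init__(self, dim=2, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.Tanh(), nn.Linear(hidden, dim))

    def forward(self, t, h):
        return self.net(h)

f = Dynamics()
h0 = torch.randn(8, 2)            # a batch of initial hidden states
t = torch.linspace(0.0, 1.0, 20)  # "depth" becomes integration time
trajectory = odeint(f, h0, t)     # the forward pass is an ODE solve; shape (20, 8, 2)
output = trajectory[-1]           # the final state plays the role of the last layer

# The "differential equation tricks" show up here as solver choices, e.g. adaptive
# step sizes, or adjoint-based gradients (odeint_adjoint) that give constant-memory
# backpropagation through the solve.
```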
The last item, using differential equations to specify the weighting of information, is the most intriguing to me, mostly because it solves the problem of machine learning models having to start from scratch even when structural information about the environment is already known. For example, you can provide the model with Maxwell’s Equations and then it “knows” electromagnetism.
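As a toy version of how I read the “provide it with Maxwell’s Equations” idea (this is my own sketch, not the paper’s implementation): write the right-hand side of the ODE as known physics plus a neural-net correction, so the model starts out already “knowing” the hard-coded structure and only has to learn the residual dynamics. Same illustrative PyTorch + torchdiffeq setup as above.

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint

class HybridRHS(nn.Module):
    """du/dt = known_physics(u) + correction(u)."""
    def __init__(self):
        super().__init__()
        self.correction = nn.Sequential(
            nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 2))

    @staticmethod
    def known_physics(u):
        # The structure we are sure of: an undamped harmonic oscillator,
        #   dx/dt = v,  dv/dt = -x
        x, v = u[..., 0:1], u[..., 1:2]
        return torch.cat([v, -x], dim=-1)

    def forward(self, t, u):
        return self.known_physics(u) + self.correction(u)

model = HybridRHS()
t = torch.linspace(0.0, 10.0, 100)
u0 = torch.tensor([[1.0, 0.0]])
trajectory = odeint(model, u0, t)  # shape (100, 1, 2)

# Training works exactly as in the previous sketch (differentiate through the
# solver); the network only has to account for whatever the known part misses,
# e.g. a damping term the hard-coded oscillator doesn't include.
```

The known part here is a two-line oscillator rather than Maxwell’s Equations, but the mechanism is the same: the equations are baked into the right-hand side instead of being learned.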
There is a blog post about the paper and using it with the DifferentialEquations.jl and Flux.jl libraries. There is also a good talk by Christopher Rackauckas about the approach.
The talk is mostly about using ML in the physical sciences, an area which seems to be going by the name scientific machine learning (SciML) now.
This is really neat; thanks for the pointer!
I don’t know what the procedure for this is, but it occurs to me that if we can specify information about an environment via differential equations inside the neural network, then we can also compare that network’s output to the output of a network that isn’t given the same information.
In the name of learning more about how to interpret the models, we could try something like the following (a rough code scaffold is sketched below):
1) Construct an artificial environment which we can completely specify via a set of differential equations.
2) Train a neural network to learn that environment once for every combination (subset) of those differential equations provided as known structure.
3) Compare all of these to several control cases in which no differential equations are provided.
It seems like the way the control case differs from each of the cases with structural information should tell us something about how the network learns the environmental structure.
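Here is a rough scaffold of what I have in mind for steps 1 to 3, again in illustrative PyTorch + torchdiffeq (the term names, the tiny training budget, and the hybrid-model setup are all my own assumptions, not an established protocol): the environment is written as a set of named ODE terms, one hybrid model is trained per subset of terms provided as known structure, and the empty subset is the no-structure control.

```python
import itertools
import torch
import torch.nn as nn
from torchdiffeq import odeint

# 1) An artificial environment completely specified by a set of ODE terms
#    (a damped oscillator), so "every combination of those differential
#    equations" is just every subset of this dictionary.
TERMS = {
    "kinematics": lambda x, v: torch.cat([v, torch.zeros_like(v)], dim=-1),
    "spring":     lambda x, v: torch.cat([torch.zeros_like(x), -x], dim=-1),
    "damping":    lambda x, v: torch.cat([torch.zeros_like(x), -0.3 * v], dim=-1),
}

def rhs_from_terms(names):
    def rhs(t, u):
        x, v = u[..., 0:1], u[..., 1:2]
        out = torch.zeros_like(u)
        for name in names:
            out = out + TERMS[name](x, v)
        return out
    return rhs

class HybridModel(nn.Module):
    """Known terms (fixed) plus a neural net for whatever is left unspecified."""
    def __init__(self, known_names):
        super().__init__()
        self.known = rhs_from_terms(known_names)
        self.net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 2))

    def forward(self, t, u):
        return self.known(t, u) + self.net(u)

def train_and_eval(known_names, t, u0, target, steps=150):
    model = HybridModel(known_names)
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        loss = torch.mean((odeint(model, u0, t) - target) ** 2)
        loss.backward()
        opt.step()
    return loss.item()

# Ground-truth trajectories from the full environment.
t = torch.linspace(0.0, 6.0, 60)
u0 = torch.tensor([[1.0, 0.0]])
with torch.no_grad():
    target = odeint(rhs_from_terms(TERMS.keys()), u0, t)

# 2) and 3): every subset of terms, with the empty subset as the control.
for r in range(len(TERMS) + 1):
    for subset in itertools.combinations(TERMS, r):
        err = train_and_eval(subset, t, u0, target)
        label = ", ".join(subset) if subset else "none (control)"
        print(f"known terms: {label:25s} final training MSE: {err:.4f}")
```

The printed errors (and, more interestingly, what the correction networks end up learning in each case) are the comparison described above; evaluating on held-out initial conditions would be the obvious next refinement.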