This is a great post! Thank you in particular for the Stephen Boyd recommendation.
~~~~
In terms of frameworks and ML, you might be interested in this blog post announcing the release of the DiffEqFlux.jl package in the Julia programming language. I was reminded of this because it seems to be an example of merging already-mature frameworks, specifically a machine learning framework (Flux) and a differential equation framework (DifferentialEquations.jl).
A recent paper called Neural Ordinary Differential Equations explored the problem of parameterizing a term in an ODE instead of specifying a sequence of layers, using ML techniques. The benefit of the combined framework is that it allows us to incorporate what we already know about the structure of the problem via the ODE, and apply neural networks to the parts we don’t.
I’m not deeply familiar with either framework, but I have been following DifferentialEquations.jl for a while because I had hoped it would let me tackle a few problems I have had my eye on.
~~~~
I feel like Geometric Algebra is pointing in a similar direction to what you describe. It is not directly an applied framework, but rather a kind of ur-framework for teaching and developing other frameworks on top of it. The idea is that they make geometric objects elements of the algebra, which really boils down to making the math match the structure of space and in this way preserve geometric intuitions throughout. Its advocates hope it will serve as a unified framework for physics and engineering applications. The primary theorist is David Hestenes, and his argument for why this was necessary is found here; there are surveys/tutorials from other people who have done work in the field here and here.
Geometric Algebra and its extension Geometric Calculus are both still pretty firmly in the ‘reorganizing math’ kind of endeavor, with lots of priority put on proving results and developing algorithms. But there are a few areas where they are growing into an applied framework of the kind you describe, specifically robotics, computer graphics, and computer vision.
It was through the robotics applications that I found it; they promised an intuitive explanation of quaternions, which otherwise were just this scary way to calculate motion.
Geometric algebra is really neat, thanks for the links. I’ve been looking for something like that since I first encountered Pauli matrices back in quantum. I would describe it as an improved language for talking about lots of physical phenomena; that makes it a component of a potentially-better interface layer for many different mathematical frameworks. That’s really the key to a successful declarative framework: having an interface layer/language which makes it easy to recognize and precisely formulate the kinds of problems the framework can handle.
I’m generally suspicious of anything combining neural nets and nonlinear DEs. As James Mickens would say, putting a numerical solver for a chaotic system in the middle of another already-notoriously-finicky-and-often-unstable-system is like asking Godzilla to prevent Mega-Godzilla from terrorizing Japan. This does not lead to rising property values in Tokyo! That said, it does seem like something along those lines will have to work inside learning frameworks sooner or later, so it’s cool to see a nice implementation that puts everything under one roof.
It’s interesting that you mention that; I had never even considered applying the combination blindly to a new problem. I figured it would only really be useful in cases where you knew what kind of physical problem you were working with already.
I can think of two examples for which ML is currently not very valuable but this new trick opens up. First, the blog post gave the example of Maxwell’s Equations—in EE there has been some exploration of things like genetic algorithms for circuit and antenna design, which boils down to a different form of searching the problem space for a solution. The hitch is, we can specify exactly what things to vary and what to keep constant; if we were to use ML it would always be starting from scratch, which is very inefficient. Until now, anyway.
The second one is also from engineering, but on the applied math side. The field of Continuum Mechanics has a certain generalization to thermodynamics called Rational Thermodynamics, which uses a method of moments. The idea there is simply to add another differential term to the field equation for every behavior you want to describe (heat, stress, deformation, etc). As of the 1980’s they had proved both that Navier Stokes is a special case and also that including many new terms does not significantly outperform Navier Stokes. The guess at that time was that it might take hundreds of thousands of terms to get a much better description, which of course was infeasible with analysis by hand. But having a lot of differentials to figure out seems exactly like what this would be good for.
Those are great examples! That’s exactly the sort of thing I see the tools currently associated with neural nets being most useful for long term—applications which aren’t really neural nets at all. Automated differentiation and optimization aren’t specific to neural nets, they’re generic mathematical tools. The neural network community just happens to be the main group developing them.
I really look forward to the day when I can bust out a standard DE solver, use it to estimate the frequency of some stable nonlinear oscillator, and then compute the sensitivity of that frequency to each of the DE’s parameters with an extra two lines of code.
This is a great post! Thank you in particular for the Stephen Boyd recommendation.
~~~~
In terms of frameworks and ML, you might be interested in this blog post announcing the release of the DiffEqFlux.jl package in the Julia programming language. I was reminded of this because it seems to be an example of merging already-mature frameworks, specifically a machine learning framework (Flux) and a differential equation framework (DifferentialEquations.jl).
A recent paper called Neural Ordinary Differential Equations explored the problem of parameterizing a term in an ODE instead of specifying a sequence of layers, using ML techniques. The benefit of the combined framework is that it allows us to incorporate what we already know about the structure of the problem via the ODE, and apply neural networks to the parts we don’t.
I’m not deeply familiar with either framework, but I have been following DifferentialEquations.jl for a while because I had hoped it would let me tackle a few problems I have had my eye on.
~~~~
I feel like Geometric Algebra is pointing in a similar direction to what you describe. It is not directly an applied framework, but rather a kind of ur-framework for teaching and developing other frameworks on top of it. The idea is that they make geometric objects elements of the algebra, which really boils down to making the math match the structure of space and in this way preserve geometric intuitions throughout. Its advocates hope it will serve as a unified framework for physics and engineering applications. The primary theorist is David Hestenes, and his argument for why this was necessary is found here; there are surveys/tutorials from other people who have done work in the field here and here.
Geometric Algebra and its extension Geometric Calculus are both still pretty firmly in the ‘reorganizing math’ kind of endeavor, with lots of priority put on proving results and developing algorithms. But there are a few areas where they are growing into an applied framework of the kind you describe, specifically robotics, computer graphics, and computer vision.
It was through the robotics applications that I found it; they promised an intuitive explanation of quaternions, which otherwise were just this scary way to calculate motion.
Geometric algebra is really neat, thanks for the links. I’ve been looking for something like that since I first encountered Pauli matrices back in quantum. I would describe it as an improved language for talking about lots of physical phenomena; that makes it a component of a potentially-better interface layer for many different mathematical frameworks. That’s really the key to a successful declarative framework: having an interface layer/language which makes it easy to recognize and precisely formulate the kinds of problems the framework can handle.
I’m generally suspicious of anything combining neural nets and nonlinear DEs. As James Mickens would say, putting a numerical solver for a chaotic system in the middle of another already-notoriously-finicky-and-often-unstable-system is like asking Godzilla to prevent Mega-Godzilla from terrorizing Japan. This does not lead to rising property values in Tokyo! That said, it does seem like something along those lines will have to work inside learning frameworks sooner or later, so it’s cool to see a nice implementation that puts everything under one roof.
It’s interesting that you mention that; I had never even considered applying the combination blindly to a new problem. I figured it would only really be useful in cases where you knew what kind of physical problem you were working with already.
I can think of two examples for which ML is currently not very valuable but this new trick opens up. First, the blog post gave the example of Maxwell’s Equations—in EE there has been some exploration of things like genetic algorithms for circuit and antenna design, which boils down to a different form of searching the problem space for a solution. The hitch is, we can specify exactly what things to vary and what to keep constant; if we were to use ML it would always be starting from scratch, which is very inefficient. Until now, anyway.
The second one is also from engineering, but on the applied math side. The field of Continuum Mechanics has a certain generalization to thermodynamics called Rational Thermodynamics, which uses a method of moments. The idea there is simply to add another differential term to the field equation for every behavior you want to describe (heat, stress, deformation, etc). As of the 1980’s they had proved both that Navier Stokes is a special case and also that including many new terms does not significantly outperform Navier Stokes. The guess at that time was that it might take hundreds of thousands of terms to get a much better description, which of course was infeasible with analysis by hand. But having a lot of differentials to figure out seems exactly like what this would be good for.
Those are great examples! That’s exactly the sort of thing I see the tools currently associated with neural nets being most useful for long term—applications which aren’t really neural nets at all. Automated differentiation and optimization aren’t specific to neural nets, they’re generic mathematical tools. The neural network community just happens to be the main group developing them.
I really look forward to the day when I can bust out a standard DE solver, use it to estimate the frequency of some stable nonlinear oscillator, and then compute the sensitivity of that frequency to each of the DE’s parameters with an extra two lines of code.