Writing AIs is not running them; proving what they would do is. But we need not have the math engine design an AI and prove it safe. We need it to babble about agent foundations, in the same way that it would presumably be inspiring to hear Ramanujan talk in his sleep.
The math engine I’m looking for would be able to intuit not only a lemma that helps prove a theorem, but also a conjecture, which is just a lemma for a theorem you don’t yet know. Or a definition, which is to a conjecture as sets are to truth values. A human who has proven many theorems sometimes becomes able to write new ones in turn; why should language models be any different?
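To make the definition / conjecture / lemma distinction concrete, here is a toy Lean sketch of my own (the statements are illustrative, not anything an engine produced): a definition introduces an object to talk about, a lemma is a statement we can already prove, and a conjecture is the same kind of statement with the proof still missing.

```lean
-- Illustrative only: the three kinds of output we would want the engine to babble.

-- Definition: introduces a new notion.
def isInvolution {α : Type} (f : α → α) : Prop :=
  ∀ x, f (f x) = x

-- Lemma: a statement we can already prove.
theorem id_isInvolution {α : Type} : isInvolution (id : α → α) := by
  intro x
  rfl

-- Conjecture: syntactically just another statement, but the proof is left as
-- `sorry` -- exactly the kind of thing a statement-generating engine would
-- hand back for humans (or a prover) to attack later.
theorem involution_injective {α : Type} (f : α → α) (hf : isInvolution f) :
    ∀ x y, f x = f y → x = y := by
  sorry
```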
I can sense some of the math we need: an AI is more interpretable if the task of interpreting it can be decomposed into interpreting its parts; we want the assembly of descriptions to be associative; an AI design tolerates more mistakes if its behavior is more continuous in its parts than a maximizer’s is in its utility function. Category Theory formalizes such intuitions, and even a tool that rewrites all our math in its terms would help a lot, let alone one that invents a mathematical language even better at Category Theory’s job of making the short sentences the useful ones.
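As a toy illustration of the “assembly of descriptions should be associative” intuition, here is a minimal Lean sketch, entirely my own formalisation and not a proposed framework: a description of a component is modelled as a map from its behaviour to a summary, assembling components composes their descriptions, and associativity is then a one-line proof.

```lean
-- Toy model: a "description" maps a component's behaviour type to a summary
-- type; assembling two components composes their descriptions.
structure Description (α β : Type) where
  explain : α → β

def Description.comp {α β γ : Type}
    (f : Description α β) (g : Description β γ) : Description α γ :=
  ⟨fun a => g.explain (f.explain a)⟩

-- The property asked for above: assembly is associative, so interpreting a
-- whole decomposes into interpreting its parts in any grouping.
theorem Description.comp_assoc {α β γ δ : Type}
    (f : Description α β) (g : Description β γ) (h : Description γ δ) :
    (f.comp g).comp h = f.comp (g.comp h) :=
  rfl
```

In a real interpretability setting the summary would of course carry much more structure than a bare function, but the associativity requirement would look the same.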
On the usefulness of proving theorems vs. writing them down: I think there’s more of a back and forth. See for instance the Nature paper on how DeepMind used an AI to guide mathematicians’ intuition (https://www.nature.com/articles/s41586-021-04086-x).
To me, the main value is that it will help humans “babble” more in math via some extension, like a GitHub Copilot but for math. And I feel that, overall, increasing math generation is more “positive for alignment in EV” than generating code, especially when considering agent foundations.
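Concretely, the interaction I imagine looks like writing a statement in a proof assistant and letting the model propose the next step or the whole proof, the way Copilot proposes the next lines of code. A hypothetical session (the tool and its suggestion are imagined; the Lean itself is real and kernel-checkable):

```lean
-- Human writes the statement they care about:
theorem add_comm_three (a b c : Nat) : a + b + c = c + b + a := by
  -- The tool fills in a candidate proof. The kernel, not the model, is what we
  -- trust, so a hallucinated suggestion is rejected immediately.
  rw [Nat.add_comm a b, Nat.add_comm (b + a) c, Nat.add_assoc]
```

The asymmetry with Copilot for code is that a wrong proof simply fails to check, whereas wrong code can ship.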
Besides, in ML the correct proofs can come many years after the algorithms are already used by everyone in the community (e.g. Adam’s convergence proof was shown to be flawed in 2018). Having a way to get a more grounded understanding of these things could help both with interpretability and with obtaining safety guarantees.