Garrett Baker comments on Open Thread Spring 2024

Garrett Baker 16 Apr 2024 23:44 UTC
5 points
0
A new update
Hi John,
Thank you for sharing the job postings. We’re starting something really exciting, and as research leads on the team, we (Paul Lessard and Bruno Gavranović) thought we’d provide some clarifications.
Symbolica was not started to improve ML using category theory. Instead, Symbolica was founded ~2 years ago, with its 2M seed funding round aimed at tackling the problem of symbolic reasoning, but at the time, its path to getting there wasn’t via categorical deep learning (CDL). The original plan was to use hypergraph rewriting as a means of doing learning more efficiently. That approach, however, was eventually shown to be unviable.
Symbolica’s pivot to CDL started about five months ago. Bruno had just finished his Ph.D. thesis laying the foundations for the topic, and we reoriented much of the organization towards this research direction. In particular, we began: a) refining a roadmap to develop and apply CDL, and b) writing a position paper, in collaboration with researchers at Google DeepMind, which you’ve cited below.
Over these last few months, it has become clear that our hunches about applicability are actually exciting and viable research directions. We’ve made fantastic progress, even doing some of the research we planned to advocate for in the aforementioned position paper. Really, we discovered just how much Taking Categories Seriously gives you in the field of Deep Learning.
Many advances in DL are about creating models which identify robust and general patterns in data (see the Transformers/Attention mechanism, for instance). In many ways this is exactly what CT is about: it is an indispensable tool for many scientists, including ourselves, to understand the world around us: to find robust patterns in data, but also to communicate, verify, and explain our reasoning.
At the same time, the research engineering team of Symbolica has made significant, independent, and concrete progress implementing a particular deep learning model that operates on text data, but not in an autoregressive manner as most GPT-style models do.
These developments were key signals to Vinod and other investors, leading to the closing of the 31M funding round.
We are now developing a research programme merging the two, leveraging insights from theories of structure, e.g. categorical algebra, as a means of formalising the process by which we find structure in data. This has a twofold consequence: pushing models to identify patterns in data that are not only more robust, but also interpretable and verifiable.
In summary:
a) The push to apply category theory was not based on a singular whim, as the post might suggest,
but instead
b) Symbolica is developing a serious research programme devoted to applying category theory to deep learning, not merely hiring category theorists.
All of this is to add extra context for evaluating the company, its team, and our direction, which does not come across in the recently published tech articles.
We strongly encourage interested parties to look at all of the job ads, which we’ve tailored to particular roles. Roughly, in the CDL team, we’re looking for either
1) expertise in category theory, and a strong interest in deep learning, or
2) expertise in deep learning, and a strong interest in category theory.
at all levels of seniority.
Happy to answer any other questions/thoughts.
Bruno Gavranović,
Paul Lessard