I research ways in which the tools and perspectives of theoretical physics can be applied to artificial intelligence. In particular, I co-authored The Principles of Deep Learning Theory with Sho Yaida (also based on research in collaboration with Boris Hanin), which will be published by Cambridge University Press in early 2022. (https://deeplearningtheory.com)
I am a Research Affiliate at the Center for Theoretical Physics at MIT and also a Principal Researcher at Salesforce, having arrived via acquisition of Diffeo where I was Co-Founder and Chief Technology Officer. I am also an Affiliate of the NSF AI Institute for Artificial Intelligence and Fundamental Interactions (IAIFI).
Thanks for your summary of the book!
I think the post and its analysis are some evidence that it may be tractable to apply tools from the book directly to transformer architectures and LLMs.