That’s a good point about motivated reasoning. I should distinguish arguments that the lazy approach is better for people in general from arguments that it’s better for me. Whether it’s better for people more generally depends on the reference class we’re talking about. I will assume that people who are interested in the foundations of mathematics as a hobby, outside of AI safety, should take my advice less seriously.
However, I still think it’s not clear that the foundational route is actually that useful on a per-unit-time basis. The model I proposed wasn’t as simple as “learn the formal math” versus “think more intuitively.” It was specifically a question of whether we should learn the math on an as-needed basis. For that reason, I’m still skeptical that reading textbooks on subjects only vaguely related to current machine learning work is valuable for the vast majority of people who want to go into AI safety as quickly as possible.
Sidenote: I think there’s a failure mode of not adequately optimizing one’s time, of being insensitive to time constraints. Learning an entire field of math from scratch takes a lot of time, even for the brightest people alive. I’m worried that “Well, you never know if subject X might be useful” is sometimes used as a fully general counterargument. The question is not “Might this be useful?” The question is “Is this the most useful thing I could learn in the next time interval?”
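To make the “most useful thing in the next time interval” framing concrete, here’s a toy expected-value-per-hour sketch. Every subject name, hour count, probability, and payoff below is invented purely for illustration; the only point is that ranking by expected usefulness per hour, under a fixed budget, can give a different answer than asking “might this be useful?” in isolation.

```python
# Toy opportunity-cost model. All numbers are hypothetical.
HOURS_AVAILABLE = 200

# (name, hours required, probability the subject turns out to matter,
#  payoff if it does matter) -- all made-up values.
subjects = [
    ("type theory textbook", 120, 0.15, 100),
    ("recent ML papers",      60, 0.80,  60),
    ("linear algebra review", 40, 0.60,  50),
]

def ev_per_hour(hours, p_useful, payoff):
    """Expected payoff per hour spent on a subject."""
    return p_useful * payoff / hours

# Rank subjects by expected value per hour, best first.
ranked = sorted(subjects, key=lambda s: ev_per_hour(*s[1:]), reverse=True)

# Greedily fill the time budget with the highest-value subjects first.
plan, remaining = [], HOURS_AVAILABLE
for name, hours, p, payoff in ranked:
    if hours <= remaining:
        plan.append(name)
        remaining -= hours

print(plan)  # the type theory textbook loses out despite its large payoff
```

With these made-up numbers, the textbook has the largest payoff if it matters, but its low probability of mattering and large time cost push it below both alternatives, so it never makes the plan. The “fully general counterargument” corresponds to looking only at the payoff column.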
A lot depends on your model of progress, and whether you’ll be able to predict/recognize what’s important to understand, and how deeply one must understand it for the project at hand.
Perhaps you shouldn’t frame it as “study early” vs. “study late,” but as “study X” vs. “study Y.” If you don’t go deep on the math foundations behind ML and decision theory, what are you going deep on instead? It seems very unlikely that you’ll have significant research impact without being a near-expert in at least one relevant topic.
I don’t want to imply that this is the only route to impact, just the only route to impactful research.
You can have significant non-research impact by being good at almost anything—accounting, management, prototype construction, data handling, etc.
I don’t want to imply that this is the only route to impact, just the only route to impactful research.

“Only” seems a little strong, no? To me, the argument is better expressed as: if you want to build on existing work, where there’s unlikely to be low-hanging fruit, you should be an expert. But what if there’s a new problem, or one that’s been framed incorrectly? Why should we think there’s no low-hanging conceptual fruit, or no problems exploitable by those with moderate experience?
I like your phrasing better than mine. “Only” is definitely too strong. Perhaps “most likely path to”?
Perhaps you shouldn’t frame it as “study early” vs “study late”, but “study X” vs “study Y”.

My point was that these are separate questions. If you begin to suspect that understanding ML research requires an understanding of type theory, then you can start learning type theory. Alternatively, you can learn type theory before researching machine learning (i.e., reading machine learning papers) in the hopes that it builds useful groundwork.
But what you can’t do is learn type theory and read machine learning research papers at the same time. You must make tradeoffs. Each minute you spend learning type theory is a minute you could have spent reading more machine learning research.
The model I was trying to draw was not one where I said, “Don’t learn math.” I explicitly said it was a model where you learn math as needed.
My point was not intended to be about my abilities. That’s a valid concern, but it wasn’t my primary argument. Even conditioning on having outstanding abilities to learn every subject, I still think my argument (weakly) holds.
Note: I’m also somewhat confused, because I suspect there’s an implicit assumption that reading machine learning research is inherently easier than learning math. I side with the intuition that math isn’t inherently difficult; it just requires memorizing a lot of things and practicing. The same is true for reading ML papers, which makes me confused about why this is being framed as a debate over whether people have certain abilities to learn and do research.