(Personal bias is heavily towards the upskilling side of the scale)
There are three big advantages to “problem first, fundamentals later”:
1. You get experience doing research directly
2. You save time
3. Anytime you go to learn something for the problem, you will always have the context of “what does this mean in my case?”
Point 3 is a mixed bag: sometimes it is useful because it brings together ideas from far apart in idea-space; other times it makes it harder for you to learn things on their own terms. You may end up basing your understanding on non-central examples of a concept and have trouble placing it in its own context, which makes it harder to turn into a central node in your knowledge-web, and in turn harder to use and less cognitively available.
By contrast, the advantages of going hard for context-learning:
1. Learning bottom-up helps resolve this “tools in-context” problem.
2. I’d bet that if you focus on learning this way, you will feel less of the “why can’t I just get this and move past it already” pressure, which imo is highly likely to lead to poor learning overall.
3. Studying things “in order” will give you a lot more knowledge of how a field progresses, giving a feel for what moving things forward should “feel like from the inside”.
4. Spaced repetition basically solves the “how do I remember basic isolated facts” problem; an integrated “bottom-up” approach is better suited to building a web of knowledge, which will then give you affordances as to when to use particular tools.
The “bottom-up” approach has the risk of making you learn only central examples of concepts; this is best mitigated by taking the time and effort to be playful with whatever you’re learning. Having your “Hamming problem” in mind will also help; it doesn’t need to be a dichotomy between bottom-up and top-down, in that regard.
My recommendation would be to split the difference and try both: for alignment, there’s clearly a “base” of linalg, probability, etc. that in my estimation is best consumed in its own context, while much of the rest of the work in the field is conceptual enough that mentally tagging what the theories are about (“natural abstractions” or “myopia of LLMs”) is probably sufficient for you to know what you’ll need and when, and thus good to index as needed.
Thank you, this makes sense for now!
(Right now I’m on Pearl’s Causality)