Nanashi comments on The most important meta-skill

Nanashi 27 May 2015 21:19 UTC
3 points
I took your advice as well as estimator’s into account and added two paragraphs at the beginning to offer 1. Some research showing that many systems follow a distribution where a small portion of work accounts for a large portion of results, and 2. and explanation as to why it’s generalizable.
- estimator 27 May 2015 23:06 UTC
  1 point
  Parent
  Also, I’d like to compare your system against common sense reasoning baseline. What do you think are the main differences between your approach and usual approaches to skill learning? What will be the difference in actions?
  
  I’m asking that because that your guide contains quite long a list of recommendations/actions, while many of them are used (probably intuitively/implicitly) by almost any sensible person. Also, some of the recommendations clearly have more impact than others. So, what happens if we apply the Pareto principle to your learning system? Which 20% are the most important? What is at the core of your approach?
  - Nanashi 28 May 2015 14:21 UTC
    2 points
    Parent
    As I mentioned in another comment, the difference between this and the “common sense” approach is in what this system does not do.
    
    As for what the 20% of this system that gives you the most bang for your buck? That’s a good question. Right now my “safe” answer is that it’s dependent on the type of skill you’re trying to learn. The trouble is that the common threads among all the skills (“Find the 20% of the skill that yields 80% of the results”) doesn’t have a lot of practical value. Like telling someone that all they need to do to lose weight is eat less and exercise more.
    
    Let me think about it some more and I’ll get back to you.
    - Nanashi 28 May 2015 14:55 UTC
      2 points
      Parent
      So, after some cursory thought, naturally the part of the system that gives you the most bang for your buck are the first 4 steps. The last 3 steps are designed to help you improve, which is a much slower process than just learning the basics.
      
      So, now to figure out how to recursively apply the the skill of learning a skill quickly to the skill “learning skills quickly”.
      - Nanashi 28 May 2015 17:49 UTC
        2 points
        Parent
        Okay, so I made a significant revision of the post. The original ideas are all there, just written in a much less obtuse manner.
        
        A much more logical argument is presented at the beginning, along with constraints.
        “Archetypes” and “Processes” have been replaced by sub-skills and trivial sub-skills.
        The lengthy discourse on strategy has been replaced by simply sorting your list of trivial sub-skills, which accomplishes the same effect.
        The “improvement” has been streamlined greatly.
        Meta-analysis has been removed because it’s really a separate subject.
  - btrettel 29 May 2015 15:26 UTC
    0 points
    Parent
    One piece of information you can use to determine what is most important is the number of other skills which require a certain skill as a prerequisite. Prerequisites should obviously be learned first, and it makes sense to learn them in order of how many doors they open. This is how I prioritize at the moment if I’m not considering subjective measures of “usefulness”.
    
    For my learning goals, I’ve started making concept maps, partly as it helps me understand a subject by understanding how concepts are related, and partly to identify what to learn next as described above. It becomes fairly obvious that I should learn X if I want to learn Y and Z and X is a prerequisite for both.
    - estimator 29 May 2015 18:18 UTC
      0 points
      Parent
      In my experience, in math/science prerequisites often can (and should) be ignored, and learned as you actually need them. People who thoroughly follow all the prerequisites often end up bogged down in numerous science fields which have actually weak connection to what they wanted to learn initially, and then get demotivated and drop out of their endeavor. This is a common failure mode.
      
      Like, you need probability theory to do machine learning, but some you are unlikely to encounter some parts of it, and also there are parts of ML which require very little of it. It totally makes sense to start with them.
      - btrettel 29 May 2015 20:34 UTC
        0 points
        Parent
        I’m thinking more specifically than you are. Rather than learning probability theory to understand ML, learn only what you determine to be necessary for what ML applications you are interested in. The concept maps I use are very specific, and they avoid the weak connection problem you mention. (It’s worth noting that I develop these as an autodidact, so I don’t have to take an entire class to just get a few facts I’m interested in.)
        Nanashi 29 May 2015 20:40 UTC
        1 point
        Parent
        It sounds like both you and estimator are actually both on the same page: estimator seems to be talking about the “prerequisite” in the sense of, “systematic prerequisite”, as in, people say that you should learn X before you learn Y. You seem to be talking about “prerequisite” in the sense that, “skill X is a necessary component of skill Y”
        
        Both of you, however, seem to agree that you should ignore the stuff that is irrelevant to what you are actually trying to accomplish.
        btrettel 29 May 2015 23:36 UTC
        0 points
        Parent
        This is a good way to put it. I may not have been clear.
        
        To use an example, I have a concept map about fluid dynamics that I used in a class I took on turbulence recently. There were a few concepts that I did not understand well at some point, and I wanted to figure out which ones. To be more specific, isotropic tensors are often used in turbulence theory and modeling, but I didn’t really understand how to construct isotropic tensors algebraically. It became pretty clear this was something I should learn given the number of links isotropic tensors had to other concepts.
      - Nornagest 29 May 2015 19:29 UTC
        0 points
        Parent
        On the other hand, if you don’t have a solid grasp of linear algebra, your ability to do most types of machine learning is seriously impaired. You can learn techniques like e.g. matrix inversions as needed to implement the algorithms you’re learning, but if you don’t understand how those techniques work in their original context, they become very hard to debug or optimize. Similarly for e.g. cryptography and basic information theory.
        
        That’s probably more the exception than the rule, though; I sense that the point of most prerequisites in a traditional science curriculum is less to provide skills to build on and more to build habits of rigorous thinking.
        estimator 29 May 2015 20:57 UTC
        0 points
        Parent
        Read what is a matrix, how to add, multiply and invert them, what is a determinant and what is an eigenvector and that’s enough to get you started. There are many algorithms in ML where vectors/matrices are used mostly as a handy notation.
        
        Yes, you will be unable to understand some parts of ML which substantially require linear algebra; yes, understanding ML without linear algebra is harder; yes, you need linear algebra for almost any kind of serious ML research—but it doesn’t mean that you have to spend a few years studying arcane math before you can open a ML textbook.
        Nornagest 29 May 2015 21:06 UTC
        0 points
        Parent
        Who said anything about a few years? If you paid attention in high school, the linear algebra background you need is at most a few months’ worth of work. I was providing a single counterexample, not saying that the full prerequisite list (which, if memory serves, is most of a CS curriculum for your average ML class) is always necessary.
        Lumifer 29 May 2015 19:42 UTC
        0 points
        Parent
        
        if you don’t have a solid grasp of linear algebra, your ability to do most types of machine learning is seriously impaired
        
        That depends on whether you’re doing research or purely applied stuff. For applied use, domain expertise trumps knowing the internal details of the algorithms which you usually just call as pre-build functions—as long as you understand what do they do and where the limits (and the traps) are.
        
        Not many people can invert matrices by hand any more and that’s not a problem for a higher-level understanding of linear algebra. Similarly, you don’t necessarily need to understand, say, how singular value decomposition works in order to do successful higher-level modeling of some domain.
        Nornagest 29 May 2015 19:54 UTC
        0 points
        Parent
        I wasn’t pointing strictly to research, but I was pointing to low-level implementation. It now occurs to me that I might be unusual in this respect—much of my ML experience is in the context of a rather weird environment that didn’t have any existing libraries, leaving me to cut a lot of code myself.
        
        So I might have to back off from “ability to do machine learning”. You can, in retrospect, use ML perfectly competently in a lot of settings even if the closest you’ve ever gotten to a simulated annealing algorithm is plugging the cost function into a Python library; but I have a hard time calling someone an expert if they’ve never written anything lower-level, just as I’d expect a competent software engineer to be able to write a hash table by hand even if every environment they’re likely to encounter will have built-in implementations or at least efficient libraries for it.
        Lumifer 29 May 2015 20:13 UTC
        4 points
        Parent
        
        just as I’d expect a competent software engineer to be able to write a hash table by hand even if every environment they’re likely to encounter will have built-in implementations or at least efficient libraries for it.
        
        I have a feeling that’s a bit of a relic.
        
        Long time ago programming environments were primitive and Real Men wrote their own sorts and hash tables (there is a canonical story from even more Ancient Times). But times have changed. I can’t imagine a serious situation (as opposed to e.g. a programming contest) where I would have to write my own sort routine from scratch—similarly to how I can’t imagine needing to construct a washing machine out of a tub, an electric motor, pulleys, and belts.
        
        I certainly still care about performance properties of various sorts, but I don’t care about their internal details as long as the performance properties hold. I suspect that the interview questions of the “implement a bubble sort on this piece of paper” variety if anything aim more at “have you been paying attention during your CS classes” and less at “do you have a deep understanding of what your program is doing”.
        
        The capacity of human minds is limited and I’ll accept climbing up higher in abstraction levels at the price of forgetting how the lower-level gears turn.
        Nornagest 29 May 2015 21:01 UTC
        0 points
        Parent
        
        I can’t imagine a serious situation (as opposed to e.g. a programming contest) where I would have to write my own sort routine from scratch
        
        You can’t? I’ve had to do that several times. The usual scenario is that there are search/sort routines, but they have inconvenient properties—either they don’t perform well in the specific problem domain I’m dealing with (happens a lot in simulation; functions for efficiently doing certain types of categorization on spatially arranged data are rare outside graphics libraries), or they don’t work on the data types I need and a reduction is impractical for one reason or another, or they exist but can’t be used for legal reasons. Unless you always situate yourself in the most popular subfields, which I frankly find boring, you can’t always count on there being a library that does exactly what you want—all the more so in a still-emerging space like ML.
        
        (I’ve never had to build a washing machine, incidentally, but I’ve had to fix washing machines—twice this year for two different machines, in fact. I could have hired a mechanic or bought a new machine, but either one would have cost me hundreds of dollars.)
        Lumifer 1 Jun 2015 15:25 UTC
        0 points
        Parent
        
        You can’t? I’ve had to do that several times.
        
        Well, I was talking about standard sort routines—the ones where you have a vector of values and a comparator function. Now, search is quite a different beast altogether.
        
        The thing is, most sorting is brute-force where you just sort without taking into account the specific structure of your data. That approach works well with sorting—but it doesn’t work well with search. The obvious problem is that we are interested in searching very large search spaces where brute force is nowhere close to practical. The salvation comes from the particular structure of the space which allows us to be much more efficient that brute-force, but the same structure forces us into custom solutions.
        
        Because the structures of search spaces can be very different, there is a LOT of search algorithms and frequently enough you have to make bespoke versions of them to fit your particular problem. That’s entirely normal. Plus, of course, optimization is a subtype of search and customizing optimizers is also quite common.
        
        but I’ve had to fix washing machines
        
        Sure, so have I. In fact, I probably would be able to construct a washing machine out of a tub, an electric motor, and some parts. It will take a lot of time and will look ugly, but I think it’ll work. That doesn’t mean I’ll feel a need to do this :-)
        Nanashi 29 May 2015 20:44 UTC
        0 points
        Parent
        Yes, this this this this this this this. “The capacity of human minds is limited and I’ll accept climbing up higher in abstraction levels at the price of forgetting how the lower-level gears turn.” If I could upvote this multiple times, I would.
        
        This is the crux of this entire approach. Learn the higher level, applied abstractions. And learn the very basic fundamentals. Forget learning how the lower-level gears turn: just learn the fundamental laws of physics. If you ever need to figure out a lower-level gear, you can just derive it from your knowledge of the fundamentals, combined with your big-picture knowledge of how that gear fits into the overall system.
        estimator 29 May 2015 21:09 UTC
        0 points
        Parent
        That only works if there are few levels of abstraction; I doubt that you can derive how do programs work at the machine codes level based of your knowledge of physics and high-level programming. Sometimes, gears are so small that you can’t even see them on your top level big picture, and sometimes just climbing up one level of abstraction takes enormous effort if you don’t know in advance how to do it.
        
        I think that you should understand, at least once, how the system works on each level and refresh/deepen that knowledge when you need it.
        Nanashi 29 May 2015 21:35 UTC
        0 points
        Parent
        The definition of “fundamentals” differs though, depending on how abstract you get. The more layers of abstraction, the more abstract the fundamentals. If my goal is high-level programming, I don’t need to know how to write code on bare metal.
        
        That’s why I advocate breaking things down until you reach the level of triviality for you personally. Most people will find, “writing a for-loop” to be trivial, without having to go farther down the rabbit hole. At a certain point, breaking things down too far actually makes things less trivial.
      - [deleted] 29 May 2015 18:26 UTC
        0 points
        Parent
        Can I give a counterexample? I think that way of learning things might help if you only need to apply the higher-level skills as you learned them, but if you need to develop or research those fields yourself, I’ve found you really do need the background.
        
        As in, I have been bitten on the ass by my own choice not to double-major in mathematics in undergrad, thus resulting in my having to start climbing the towers of continuous probability and statistics/ML, abstract algebra, logic, real analysis, category theory, and topology in and after my MSc.
        Nanashi 29 May 2015 20:08 UTC
        2 points
        Parent
        There’s a big difference between the fundamentals, and the low-level practical applications. I think the latter is what estimator is referring to. You can’t really make a breakthrough or do real research without a firm grasp of the fundamentals. But you definitely can make a breakthrough in, say, physics, without knowing the exact tensile strength of wood vs. steel. And yet, that type of “Applied Physics” was a pre-requisite at my school for the more advanced fields of physics that I was actually interested in.
        [deleted] 30 May 2015 0:40 UTC
        0 points
        Parent
        
        And yet, that type of “Applied Physics” was a pre-requisite at my school for the more advanced fields of physics that I was actually interested in.
        
        Oh. Really? Dang.
        estimator 29 May 2015 18:45 UTC
        0 points
        Parent
        You’re right; you have to learn solid background for research. But still, it often makes sense to learn in the reversed order.
- estimator 27 May 2015 21:56 UTC
  1 point
  Parent
  Nice, but beware reasoning after you’ve written the bottom line.
  
  As for the actual content, I basically fail to see its area of applicability. For sufficiently complex skills, like say, math, languages or football decision-trees & howto-guides approach will likely fail as too shallow; for isolated skills like changing a tire complex learning approaches are an overkill—just google it and follow the instructions. Can you elaborate languages example further? Because, you know, learning a bunch of phrases from phrasebook to be able to say a few words in a foreign country is a non-issue. Actually learning language is. How would you apply your system to achieve intermediate-level language knowledge? Any other non-trivial skills learning example would also suffice. What skills have you trained by using your learning system, and how?
  - Nanashi 27 May 2015 22:29 UTC
    2 points
    Parent
    Also, when you say “intermediate level language knowledge”, what exactly do you mean? One of the key steps is defining exactly what you want to accomplish and why. I don’t want to create a whole write-up, only to realize that you and I have two different definitions of “intermediate level language knowledge”.
    
    So if you’d tell me the “what” and the “why”, I’ll do the rest.
    - estimator 27 May 2015 22:47 UTC
      0 points
      Parent
      I meant something like this.
      
      … take part in routine conversations; write & understand simple written text; make notes & understand most of the general meaning of lectures, meetings, TV programmes and extract basic information from a written document.
      - Nanashi 27 May 2015 23:27 UTC
        2 points
        Parent
        I’ll give a more in depth breakdown soon but for now, I’d probably take a similar approach that I took to learning to read Japanese : learn basic sentence structure, learn top 150ish vocabulary words, avoid books written in non-romaji. Practice hearing spoken word by listening to speeches and following their transcriptions. My exception protocol for unrecognized words was to look them up. And for irregular sentence structure, to guess based on context. It worked for watching movies and reading, mostly but as you can tell, yoi kakikomu koto ga dekimasen*. I’d have to do some thinking on the writing part, it would most likely involve sticking to simple sentences.
        
        *thats terrible Japanese for “I cannot write well”. I think. I hope.
        estimator 27 May 2015 23:50 UTC
        0 points
        Parent
        But these are the things pretty much everybody does while learning languages.
        Nanashi 28 May 2015 12:40 UTC
        4 points
        Parent
        Well of course they do. Because these things are necessary to learning a language. This is the 20% that’s most efficient. By definition someone who puts in 100% of the effort will be doing what I did.
        
        The efficiency of this approach revolves around what you don’t do. You’re excising the 80%. I didn’t spend long hours learning katakana, hiragana and kanji. I didn’t learn the more complex tenses and conjugations. I didn’t spend time on vocabulary words that are highly situational. Contrast this to a typical Japanese textbook.
        [deleted] 28 May 2015 11:02 UTC
        3 points
        Parent
        There seem to be two major approaches to learning language.
        
        One is to go a language school / course where the teachers, in my experience, teach it like an academic discipline + the usual guess-my-password bullshit, so you get tested and graded on things like grammar, like a test where you need to fill in conjugations / declinations into holes in a text. (Obviously I am talking about languages that have those kinds of things, like Germanic or Romance ones). Case in point: part of my B2 level German exam at the University of Vienna was exactly that kind of hole-filling and it felt really wrong as it has not much to do with commuication, it is a more academic approach.
        
        The other approach is to do something like this for a while, but when you get to that basic point where you can say “Jack would have ordered a beer yesterday if he had money on him” ditch it and pretty much learn from immersion. Screw grammar, just read a lot of books, figure out words from the context, and conduct imaginary or real conversations no matter how bad the grammar is. Real people prefer to communicate with people who talk fast, not correct. Talking with someone saying at a normal speed who is talking like “me no want buy house, me want rent house now” is far better than someone who is like “I no… (long pause) do not? want … (long pause) want to? buy a house, rather… (long pause)… instead? I want to rent it… (long pause) rent one”. I used to be that second guy in 2 languages and it sucked.
        
        (Now of course you may think “but everybody knows immersion is better it is not even new” yeah apparently that everybody does not include the huge European language school chains like Berlitz and their who knows how many students… )
  - Nanashi 27 May 2015 22:26 UTC
    2 points
    Parent
    Basketball is an example. I’m about to head home so I’ll do the ultra-abbreviated TL;DR version:
    
    Goals: Score points, prevent opponent from scoring points.
    Archetypes: Offense (2-point), Offense (3-point), Defense
    Process How-To: Googled “how to layup”, “how to shoot a 3-pointer”, and “how to steal a ball” 3a. Process Failure Points: Missing a shot, getting the ball stolen, missing a pass. 3b. Process Difficulties: Anything involving ball handling or dribbling. Defense.
    Exception Protocol: On offense: Pass the ball to a better player than myself, or set a pick. On defense: play very close to my opponent. 5a. Avoid anything involving dribbling but not scoring. 5b. Prepare and practice two-point shots. 5c. Focus on getting open for a 3-point shot. Practice consistently shooting from 3-point line.
    Get better by playing.
    
    I would say basketball is fairly complex. One thing I didn’t mention in the original post (mainly because it starts to get into the “how do individual people learn”) but for me, I don’t get good at a competitive skill by competing against people who also suck. By getting good enough to be able to play with people who are actually good, it made it easier for me to learn the advanced part of the game faster.
    
    Also, this post has a list of (at least what I think to be) fairly non-trivial skills that I have trained using this method.