I’ve been more consciously thinking about / playing with the recommended approach in this comment since you made it (I think it’s literally impossible to think “without intuition”, but treating intuition as a necessary prerequisite was new for me), and it’s been helpful, especially in helping me notice the difference between questions where I can “read off” the answer vs. ones where I draw a blank.
However, I’ve also noticed that it’s definitely a sort of “hard mode”, in the sense that the more I rely on it, the more it forces me to develop intuitions about everything I’m learning before I can think with it effectively. To give an example, I’ve been learning more statistics, and there are a bunch of concepts about which I currently have no intuition, e.g. the Glivenko–Cantelli theorem. Historically, I would have just filed it away in my head as “thing that is used to prove the variant of Hoeffding that involves a supremum” or, honestly, forgotten about it. But since I’ve been trying to consciously practice developing intuitions, I end up spending a bunch of time thinking about what it’s really saying, because I no longer trust myself to use it without understanding it.
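For concreteness, the statement I ended up unpacking (in standard notation, with F_n the empirical CDF built from n i.i.d. samples and F the true CDF):

```latex
% Glivenko–Cantelli: the empirical CDF converges to the true CDF
% uniformly over the whole real line, almost surely.
\[
\sup_{x \in \mathbb{R}} \big| F_n(x) - F(x) \big| \xrightarrow{\text{a.s.}} 0 .
\]
```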
Now, I suspect many people, in particular people from the LW population, would have a response along the lines of, “that’s good, you’re forcing yourself to deeply understand everything you’re learning”. And that’s partly true! On the other hand, I do think there’s something to be said for knowing when to / when not to spend time deeply grokking things versus just using them as tools, and by forcing myself to rely heavily on intuitions, it becomes harder to do that.
Related to that, I’d be interested in hearing how you and others go about developing such intuitions. (I’ve been compiling my own list.)
My main response here is declarative mathematical frameworks, although I don’t think that post is actually a very good explanation, so I’ll give some examples.
First, the analogy: let’s say I’m writing a Python script, and I want it to pull some data over the internet from some API. The various layers of protocols used by the internet (IP, TCP, HTTP, etc.) are specifically designed so that we can move data around without having to think about what’s going on under the hood. Our intuition can operate at a high level of abstraction; we can intuitively “know the answer” without having to worry about the low-level details. If we try to pull www.google.com, and we get back a string that doesn’t look like HTML, then something went wrong—that’s part of our intuition for what the answer should look like.
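As a minimal sketch of that (using the third-party `requests` library, and Google’s homepage purely as an example):

```python
# The whole protocol stack below this call (HTTP, TCP, IP, ...) is
# abstracted away; we just ask for the page.
import requests

response = requests.get("https://www.google.com")

# Our high-level intuition about "what the answer should look like":
# a successful fetch of a web page should yield an HTML document.
body = response.text
if response.status_code != 200 or "<html" not in body.lower():
    print("Something went wrong under the hood.")
else:
    print("Got back something that looks like HTML, as expected.")
```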
A couple more mathy examples...
If you look at the way physicists and engineers use delta functions or differentials, it clearly works—they build up an intuition for which operations/expressions are “allowed”, and which are not. There’s a “calculus of differentials” and a “calculus of delta functions”, which define what things we can and cannot do with these objects. It is possible to put those calculi on solid foundations—e.g. nonstandard analysis or distribution theory—but we can apparently intuit the rules pretty well even without studying those underlying details. We “know what the answer should look like”. (Personally, I think we could teach the rules better, rather than making physics & engineering students figure them out on the fly, but that still wouldn’t require studying nonstandard analysis etc.)
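For instance, here are a few of the standard “allowed moves” in the calculus of delta functions, which one can use freely without ever touching distribution theory:

```latex
% Sifting, scaling, and the derivative rule for delta functions:
\[
\int_{-\infty}^{\infty} f(x)\,\delta(x - a)\,dx = f(a), \qquad
\delta(ax) = \frac{\delta(x)}{|a|} \quad (a \ne 0), \qquad
\int_{-\infty}^{\infty} f(x)\,\delta'(x - a)\,dx = -f'(a).
\]
```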
Different example: information theory offers some really handy declarative interfaces for certain kinds of problems. One great example I’ve used a lot lately is the data processing inequality (DPI): we have some random variable X which contains some information about Y. We compute some function f(X). The DPI then says that the information in f(X) about Y is no greater than the information in X about Y: processing data cannot add useful information. Garbage in, garbage out. It’s extremely intuitive. If I’m working on a problem, and I see a place where I think “hmm, this is just computation on X, it shouldn’t add any new info” then I can immediately apply the DPI. I don’t have to worry about the details of how the DPI is proven while using it; I can intuit what the answer should look like.
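Stated formally (with I denoting mutual information):

```latex
% Data processing inequality: for any (deterministic or randomized)
% function f applied to X alone,
\[
I\big(Y; f(X)\big) \;\le\; I\big(Y; X\big),
\]
% with equality exactly when f(X) retains everything X says about Y,
% i.e. I(Y; X \mid f(X)) = 0.
```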
That’s typically how declarative frameworks show up: there’s a theorem whose statement is very intuitively understandable, though the proof may be quite involved. That’s when it makes sense to hang intuition on the theorem, without necessarily needing to think about the proof (although of course we do need to understand the theorem enough to make sure we’re formalizing our intuition in a valid way!). One could even argue that this is what makes a “good” theorem in the first place.