It’s interesting that the comments on this post are split on whether they take the focus to be on “math” or on “deeply.” It’s also worth noting that “deeply” has many different connotations. Stephen Chew, whom you link to, uses “deep learning” to mean learning something by pondering its meaning and associations. But it’s quite possible for an unsophisticated learner to learn something deeply in the Chewish sense without acquiring a conceptual understanding of it that has transferable value. For instance, one might “deep learn” the product rule for differentiation:
(fg)′ = f′g + fg′
by saying “each function gets its turn at being differentiated, and then we add up the products.” This is reasonably generalizable (for instance, it generalizes to products of more than two functions, and also to product-like settings in multivariable calculus), but it doesn’t necessarily help with a deep conceptual understanding of the rule. On the other hand, the somewhat deeper understanding of the product rule that comes from the chain rule for partial differentiation (see here) actually provides a deep sense of why the result is true.
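To make the chain-rule argument concrete, here is a rough sketch of how I understand it. The names m and h are my own labels, not notation from the linked explanation: m is the two-variable multiplication map, and h is the product fg viewed as a composite.

```latex
\begin{align*}
m(u, v) &= uv, \qquad h(x) = m\bigl(f(x), g(x)\bigr) = f(x)\,g(x), \\
h'(x) &= \frac{\partial m}{\partial u}\,f'(x) + \frac{\partial m}{\partial v}\,g'(x)
        \qquad \text{(partials evaluated at } (f(x), g(x))\text{)} \\
      &= g(x)\,f'(x) + f(x)\,g'(x).
\end{align*}
```

Each term comes from letting one input of m vary while the other is held fixed, which is the precise version of “each function gets its turn.”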
Now, my example above in some sense disproves my claim, because the Chewish deep learning of the product rule (“each function gets its chance at being differentiated, and then we add up”) is actually not that far off from the conceptually enlightening explanation based on the chain rule for partial differentiation. So perhaps it is true that attempting Chewish deep learning, even without a deep conceptual understanding, generally gets one quite close to the correct conceptual understanding.