
Grokking (ML)

Tag. Last edit: 29 Feb 2024 5:58 UTC by Morpheus

A phenomenon in machine learning in which a model generalizes to a test set only long after it has reached near-zero loss on the training set.
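One rough way to make "long after" concrete is to compare when each curve first crosses a threshold. The sketch below is illustrative only: it uses accuracy rather than loss, the 0.99 threshold is an arbitrary assumption, the helper names `first_step_reaching` and `grokking_delay` are hypothetical, and the logged values are made up rather than taken from any of the posts listed here.

```python
from typing import Optional, Sequence


def first_step_reaching(values: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first logged value >= threshold, or None if never reached."""
    for step, value in enumerate(values):
        if value >= threshold:
            return step
    return None


def grokking_delay(
    train_acc: Sequence[float],
    test_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cutoff for "fits the data"; not from the source
) -> Optional[int]:
    """Number of evaluation steps between fitting the training set and generalizing.

    Returns None if either curve never reaches the threshold.
    """
    train_step = first_step_reaching(train_acc, threshold)
    test_step = first_step_reaching(test_acc, threshold)
    if train_step is None or test_step is None:
        return None
    return test_step - train_step


# Hypothetical hand-written logs, one entry per evaluation (not real results).
train_acc = [0.20, 0.70, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
test_acc = [0.10, 0.12, 0.15, 0.15, 0.20, 0.60, 1.00, 1.00]

delay = grokking_delay(train_acc, test_acc)
print(delay)
```

A large positive delay relative to the time taken to fit the training set is the signature described above; a delay near zero corresponds to ordinary training, where train and test performance improve together.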

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan · 25 Apr 2024 19:10 UTC
20 points
1 comment · 63 min read · LW link

Grokking, memorization, and generalization: a discussion

29 Oct 2023 23:17 UTC
75 points
11 comments · 23 min read · LW link

Mesa-Optimizers via Grokking

orthonormal · 6 Dec 2022 20:05 UTC
36 points
4 comments · 6 min read · LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope · 23 Jul 2023 20:14 UTC
114 points
15 comments · 9 min read · LW link

Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA

Marius Hobbhahn · 4 Oct 2022 7:22 UTC
46 points
11 comments · 1 min read · LW link
(arxiv.org)

An interactive introduction to grokking and mechanistic interpretability

7 Aug 2023 19:09 UTC
23 points
3 comments · 1 min read · LW link
(pair.withgoogle.com)

A Mechanistic Interpretability Analysis of Grokking

15 Aug 2022 2:41 UTC
373 points
47 comments · 36 min read · LW link · 1 review
(colab.research.google.com)

Grokking Beyond Neural Networks

Jack Miller · 30 Oct 2023 17:28 UTC
10 points
0 comments · 2 min read · LW link
(arxiv.org)

A short project on Mamba: grokking & interpretability

Alejandro Tlaie · 18 Oct 2024 16:59 UTC
21 points
0 comments · 6 min read · LW link

Explaining grokking through circuit efficiency

8 Sep 2023 14:39 UTC
101 points
11 comments · 3 min read · LW link
(arxiv.org)

The slingshot helps with learning

Wilson Wu · 31 Oct 2024 23:18 UTC
31 points
0 comments · 8 min read · LW link