Grokking (ML)

Tag
Last edit: Feb 29, 2024, 5:58 AM by Morpheus

A phenomenon in machine learning in which a model generalizes to a test set only long after it has reached near-perfect performance (near-zero loss) on the training set.
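As a rough illustration of the definition above, the sketch below measures how many logged steps pass between a model fitting its training set and generalizing to the test set. The function names, the 0.99 threshold, the use of accuracy rather than loss, and the curves themselves are all assumptions made up for this example; they are not taken from any of the posts listed here.

```python
from typing import Optional, Sequence


def first_step_at_or_above(curve: Sequence[float], threshold: float) -> Optional[int]:
    """Return the first index where the curve reaches the threshold, or None."""
    for step, value in enumerate(curve):
        if value >= threshold:
            return step
    return None


def grokking_delay(
    train_acc: Sequence[float],
    test_acc: Sequence[float],
    threshold: float = 0.99,  # assumed cutoff for "fits the data"
) -> Optional[int]:
    """Logged steps between fitting the training set and generalizing to the test set.

    Returns None if either curve never reaches the threshold.
    """
    train_step = first_step_at_or_above(train_acc, threshold)
    test_step = first_step_at_or_above(test_acc, threshold)
    if train_step is None or test_step is None:
        return None
    return test_step - train_step


# Hypothetical accuracy curves, one value per logged step (not real data).
train_acc = [0.20, 0.70, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
test_acc = [0.10, 0.15, 0.20, 0.20, 0.25, 0.30, 0.90, 1.00]

delay = grokking_delay(train_acc, test_acc)
print(delay)
```

A large positive delay relative to the time needed to fit the training set is the pattern the definition describes; a delay near zero corresponds to ordinary training, where both curves rise together.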

Explaining grokking through circuit efficiency

Sep 8, 2023, 2:39 PM
101 points
11 comments
3 min read
LW link
(arxiv.org)

Grokking, memorization, and generalization — a discussion

Oct 29, 2023, 11:17 PM
75 points
11 comments
23 min read
LW link

Mesa-Optimizers via Grokking

orthonormal
Dec 6, 2022, 8:05 PM
36 points
4 comments
6 min read
LW link

QAPR 5: grokking is maybe not *that* big a deal?

Quintin Pope
Jul 23, 2023, 8:14 PM
114 points
15 comments
9 min read
LW link

Paper+Summary: OMNIGROK: GROKKING BEYOND ALGORITHMIC DATA

Marius Hobbhahn
Oct 4, 2022, 7:22 AM
46 points
11 comments
1 min read
LW link
(arxiv.org)

An interactive introduction to grokking and mechanistic interpretability

Aug 7, 2023, 7:09 PM
23 points
3 comments
1 min read
LW link
(pair.withgoogle.com)

A Mechanistic Interpretability Analysis of Grokking

Aug 15, 2022, 2:41 AM
373 points
48 comments
36 min read
LW link
1 review
(colab.research.google.com)

Grokking Beyond Neural Networks

Jack Miller
Oct 30, 2023, 5:28 PM
10 points
0 comments
2 min read
LW link
(arxiv.org)

AXRP Episode 29 - Science of Deep Learning with Vikrant Varma

DanielFilan
Apr 25, 2024, 7:10 PM
20 points
1 comment
63 min read
LW link

Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions

Rareș Baron
Feb 26, 2025, 11:35 AM
3 points
13 comments
4 min read
LW link

A short project on Mamba: grokking & interpretability

Alejandro Tlaie
Oct 18, 2024, 4:59 PM
21 points
0 comments
6 min read
LW link

The slingshot helps with learning

Wilson Wu
Oct 31, 2024, 11:18 PM
33 points
0 comments
8 min read
LW link

Ambiguous out-of-distribution generalization on an algorithmic task

Feb 13, 2025, 6:24 PM
82 points
6 comments
11 min read
LW link