
Lottery Ticket Hypothesis

Last edit: Nov 26, 2021, 2:08 PM by Multicore

The Lottery Ticket Hypothesis claims that neural networks used in machine learning get most of their performance from sparse sub-networks ("winning tickets") that are already present at initialization and approximate the final trained network. Under this model, training works largely by increasing the weights of the winning-ticket sub-network and decreasing the weights of the rest of the network.

The hypothesis was proposed by Jonathan Frankle and Michael Carbin of MIT CSAIL in "The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks" (2019).
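Frankle and Carbin identify winning tickets by magnitude pruning: train the network, prune the smallest-magnitude weights, rewind the surviving weights to their initial values, and retrain the sparse sub-network. A minimal NumPy sketch of one prune-and-rewind step (the toy 4×4 matrix and single-shot 80% prune rate are illustrative assumptions, not the paper's iterative setup):

```python
import numpy as np

def winning_ticket_mask(trained_weights, prune_fraction):
    """Return a binary mask keeping the largest-magnitude trained weights.

    In the lottery-ticket procedure, the surviving weights are then
    rewound to their *initial* values and the sparse network is retrained.
    """
    flat = np.abs(trained_weights).ravel()
    k = int(prune_fraction * flat.size)
    # Magnitude threshold: everything below the k-th smallest |w| is pruned.
    threshold = np.partition(flat, k)[k] if k > 0 else -np.inf
    return (np.abs(trained_weights) >= threshold).astype(trained_weights.dtype)

# Toy example: prune 80% of a random "trained" weight matrix,
# then rewind the survivors to their initialization.
rng = np.random.default_rng(0)
w_init = rng.normal(size=(4, 4))     # weights at initialization
w_trained = rng.normal(size=(4, 4))  # stand-in for weights after training
mask = winning_ticket_mask(w_trained, prune_fraction=0.8)
winning_ticket = mask * w_init       # sparse sub-network at its original init
```

The key design choice the hypothesis motivates is the rewind: the pruned sub-network is retrained from its original initialization rather than from the trained weights, and (per the paper) still reaches comparable accuracy.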

A Mechanistic Interpretability Analysis of Grokking

Aug 15, 2022, 2:41 AM
373 points
47 comments · 36 min read · LW link · 1 review
(colab.research.google.com)

Gradations of Inner Alignment Obstacles

abramdemski · Apr 20, 2021, 10:18 PM
84 points
22 comments · 9 min read · LW link

Updating the Lottery Ticket Hypothesis

johnswentworth · Apr 18, 2021, 9:45 PM
73 points
41 comments · 2 min read · LW link

[Question] Does the lottery ticket hypothesis suggest the scaling hypothesis?

Daniel Kokotajlo · Jul 28, 2020, 7:52 PM
14 points
17 comments · 1 min read · LW link

Understanding "Deep Double Descent"

evhub · Dec 6, 2019, 12:00 AM
150 points
51 comments · 5 min read · LW link · 4 reviews

[Question] What happens to variance as neural network training is scaled? What does it imply about "lottery tickets"?

abramdemski · Jul 28, 2020, 8:22 PM
25 points
4 comments · 1 min read · LW link

Understanding the Lottery Ticket Hypothesis

Alex Flint · May 14, 2021, 12:25 AM
50 points
9 comments · 8 min read · LW link

Exploring the Lottery Ticket Hypothesis

Rauno Arike · Apr 25, 2023, 8:06 PM
55 points
3 comments · 11 min read · LW link

Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian

Joar Skalse · Dec 29, 2020, 1:33 PM UTC
75 points
58 comments · 1 min read · LW link · 1 review