
Lottery Ticket Hypothesis

Tag. Last edit: 26 Nov 2021 14:08 UTC by Multicore

The Lottery Ticket Hypothesis claims that neural networks used in machine learning get most of their performance from sub-networks (“winning tickets”) that are already present at initialization and that approximate the final policy. Under this model, training works by increasing the weight on the winning-ticket sub-network and reducing the weight on the rest of the network.

The hypothesis was proposed in a paper by Jonathan Frankle and Michael Carbin of MIT CSAIL.
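
To make the idea of a sub-network "present at initialization" concrete, here is a minimal sketch of one way a candidate winning ticket can be extracted: keep the weights that end up with the largest magnitude after training, then reset the survivors to their initial values. This is only an illustration under stated assumptions. The function names `magnitude_mask` and `extract_ticket` are hypothetical, the "training" step is a random perturbation standing in for a real optimizer, and the pruning criterion is an assumption rather than something this page specifies.

```python
import numpy as np

def magnitude_mask(trained_weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude trained weights."""
    flat = np.abs(trained_weights).ravel()
    n_keep = max(1, int(round(keep_fraction * flat.size)))
    # The threshold is the n_keep-th largest magnitude.
    threshold = np.sort(flat)[-n_keep]
    return (np.abs(trained_weights) >= threshold).astype(trained_weights.dtype)

def extract_ticket(initial_weights, trained_weights, keep_fraction):
    """Candidate ticket: surviving weights reset to their initial values."""
    mask = magnitude_mask(trained_weights, keep_fraction)
    return mask, mask * initial_weights

rng = np.random.default_rng(0)
initial = rng.normal(size=(4, 4))
# Hypothetical stand-in for training: perturb the initial weights.
trained = initial + 0.5 * rng.normal(size=(4, 4))

mask, ticket = extract_ticket(initial, trained, keep_fraction=0.25)
print(int(mask.sum()), "of", mask.size, "weights kept")
```

In a real experiment the masked network would then be retrained from `ticket` and compared against the full network. The sketch stops at the extraction step.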

A Mechanistic Interpretability Analysis of Grokking

15 Aug 2022 2:41 UTC
373 points
47 comments
36 min read
LW link
1 review
(colab.research.google.com)

Gradations of Inner Alignment Obstacles

abramdemski
20 Apr 2021 22:18 UTC
81 points
22 comments
9 min read
LW link

Updating the Lottery Ticket Hypothesis

johnswentworth
18 Apr 2021 21:45 UTC
73 points
41 comments
2 min read
LW link

[Question] Does the lottery ticket hypothesis suggest the scaling hypothesis?

Daniel Kokotajlo
28 Jul 2020 19:52 UTC
14 points
17 comments
1 min read
LW link

Understanding “Deep Double Descent”

evhub
6 Dec 2019 0:00 UTC
150 points
51 comments
5 min read
LW link
4 reviews

[Question] What happens to variance as neural network training is scaled? What does it imply about “lottery tickets”?

abramdemski
28 Jul 2020 20:22 UTC
25 points
4 comments
1 min read
LW link

Understanding the Lottery Ticket Hypothesis

Alex Flint
14 May 2021 0:25 UTC
50 points
9 comments
8 min read
LW link

Exploring the Lottery Ticket Hypothesis

Rauno Arike
25 Apr 2023 20:06 UTC
54 points
3 comments
11 min read
LW link

Why Neural Networks Generalise, and Why They Are (Kind of) Bayesian

Joar Skalse
29 Dec 2020 13:33 UTC
75 points
58 comments
1 min read
LW link
1 review