Kaarel

Karma: 899

kaarelh AT gmail DOT com

personal website

An Advent of Thought

Kaarel · Mar 17, 2025, 2:21 PM
42 points
8 comments · 48 min read

Deep Learning is cheap Solomonoff induction?

Dec 7, 2024, 11:00 AM
44 points
1 comment · 17 min read

Finding the estimate of the value of a state in RL agents

Jun 3, 2024, 8:26 PM
8 points
4 comments · 4 min read

Interpretability: Integrated Gradients is a decent attribution method

May 20, 2024, 5:55 PM
23 points
7 comments · 6 min read

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

May 20, 2024, 5:53 PM
105 points
4 comments · 3 min read

A starting point for making sense of task structure (in machine learning)

Feb 24, 2024, 1:51 AM
45 points
2 comments · 12 min read

Toward A Mathematical Framework for Computation in Superposition

Jan 18, 2024, 9:06 PM
204 points
18 comments · 63 min read

Grokking, memorization, and generalization — a discussion

Oct 29, 2023, 11:17 PM
75 points
11 comments · 23 min read

Crystal Healing — or the Origins of Expected Utility Maximizers

Jun 25, 2023, 3:18 AM
50 points
11 comments · 5 min read

Searching for a model’s concepts by their shape – a theoretical framework

Feb 23, 2023, 8:14 PM
51 points
0 comments · 19 min read

[RFC] Possible ways to expand on “Discovering Latent Knowledge in Language Models Without Supervision”.

Jan 25, 2023, 7:03 PM
48 points
6 comments · 12 min read

A gentle primer on caring, including in strange senses, with applications

Kaarel · Aug 30, 2022, 8:05 AM
10 points
4 comments · 18 min read

kh’s Shortform

Kaarel · Jul 6, 2022, 9:48 PM
2 points
10 comments

[Question] Transferring credence without transferring evidence?

Kaarel · Feb 4, 2022, 8:11 AM
11 points
6 comments · 3 min read