Logan Riggs

Karma: 2,982

Veo-2 Can Produce Realistic Ads

Logan Riggs
21 Jan 2025 19:13 UTC
14 points
0 comments
1 min read

[Exercise] Four Examples of Noticing Confusion

Logan Riggs
18 Jan 2025 15:29 UTC
8 points
8 comments
3 min read

How do you deal w/ Super Stimuli?

Logan Riggs
14 Jan 2025 15:14 UTC
95 points
25 comments
3 min read

When AI 10x’s AI R&D, What Do We Do?

Logan Riggs
21 Dec 2024 23:56 UTC
72 points
16 comments
4 min read

Logan Riggs’s Shortform

Logan Riggs
4 Dec 2024 14:52 UTC
7 points
5 comments
1 min read

Book a Time to Chat about Interp Research

Logan Riggs
3 Dec 2024 17:27 UTC
47 points
3 comments
1 min read

Evaluating Sparse Autoencoders with Board Game Models

2 Aug 2024 19:50 UTC
38 points
1 comment
9 min read

Interpreting Preference Models w/ Sparse Autoencoders

1 Jul 2024 21:35 UTC
74 points
12 comments
9 min read

Was Releasing Claude-3 Net-Negative?

Logan Riggs
27 Mar 2024 17:41 UTC
52 points
5 comments
4 min read

Improving SAE’s by Sqrt()-ing L1 & Removing Lowest Activating Features

15 Mar 2024 16:30 UTC
26 points
5 comments
4 min read

Finding Sparse Linear Connections between Features in LLMs

9 Dec 2023 2:27 UTC
69 points
5 comments
10 min read

Sparse Autoencoders: Future Work

21 Sep 2023 15:30 UTC
35 points
5 comments
6 min read

Sparse Autoencoders Find Highly Interpretable Directions in Language Models

21 Sep 2023 15:30 UTC
159 points
8 comments
5 min read

Really Strong Features Found in Residual Stream

Logan Riggs
8 Jul 2023 19:40 UTC
69 points
6 comments
2 min read

(tentatively) Found 600+ Monosemantic Features in a Small LM Using Sparse Autoencoders

Logan Riggs
5 Jul 2023 16:49 UTC
60 points
1 comment
7 min read

[Replication] Conjecture’s Sparse Coding in Small Transformers

16 Jun 2023 18:02 UTC
52 points
0 comments
5 min read

[Replication] Conjecture’s Sparse Coding in Toy Models

2 Jun 2023 17:34 UTC
24 points
0 comments
1 min read

[Simulators seminar sequence] #2 Semiotic physics - revamped

27 Feb 2023 0:25 UTC
24 points
23 comments
13 min read

Making Implied Standards Explicit

Logan Riggs
25 Feb 2023 20:02 UTC
22 points
0 comments
4 min read

Proposal for Inducing Steganography in LMs

Logan Riggs
12 Jan 2023 22:15 UTC
22 points
3 comments
2 min read