Lucius Bushnaq

Karma: 3,828

AI notkilleveryoneism researcher, focused on interpretability.

Personal account, opinions are my own.

I have signed no contracts or agreements whose existence I cannot mention.

Proof idea: SLT to AIT

Lucius Bushnaq, Feb 10, 2025, 11:14 PM
40 points
15 comments, 6 min read

[Question] Can we infer the search space of a local optimiser?

Lucius Bushnaq, Feb 3, 2025, 10:17 AM
25 points
5 comments, 3 min read

Attribution-based parameter decomposition

Jan 25, 2025, 1:12 PM
107 points
21 comments, 4 min read
(publications.apolloresearch.ai)

Activation space interpretability may be doomed

Jan 8, 2025, 12:49 PM
145 points
32 comments, 8 min read

Intricacies of Feature Geometry in Large Language Models

Dec 7, 2024, 6:10 PM
68 points
0 comments, 12 min read

Deep Learning is cheap Solomonoff induction?

Dec 7, 2024, 11:00 AM
44 points
1 comment, 17 min read

Circuits in Superposition: Compressing many small neural networks into one

Oct 14, 2024, 1:06 PM
130 points
9 comments, 13 min read

The Hessian rank bounds the learning coefficient

Lucius Bushnaq, Aug 8, 2024, 8:55 PM
68 points
10 comments, 4 min read

A List of 45+ Mech Interp Project Ideas from Apollo Research’s Interpretability Team

Jul 18, 2024, 2:15 PM
120 points
18 comments, 18 min read

Lucius Bushnaq’s Shortform

Lucius Bushnaq, Jul 6, 2024, 9:08 AM
6 points
76 comments, 1 min read

Apollo Research 1-year update

May 29, 2024, 5:44 PM
93 points
0 comments, 7 min read

Interpretability: Integrated Gradients is a decent attribution method

May 20, 2024, 5:55 PM
23 points
7 comments, 6 min read

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks

May 20, 2024, 5:53 PM
105 points
4 comments, 3 min read

Charbel-Raphaël and Lucius discuss interpretability

Oct 30, 2023, 5:50 AM
111 points
7 comments, 21 min read

Announcing Apollo Research

May 30, 2023, 4:17 PM
217 points
11 comments, 8 min read

Basin broadness depends on the size and number of orthogonal features

Aug 27, 2022, 5:29 PM
36 points
21 comments, 6 min read

What Is The True Name of Modularity?

Jul 1, 2022, 2:55 PM
39 points
10 comments, 12 min read

Ten experiments in modularity, which we’d like you to run!

Jun 16, 2022, 9:17 AM
62 points
3 comments, 9 min read

Project Intro: Selection Theorems for Modularity

Apr 4, 2022, 12:59 PM
73 points
20 comments, 16 min read

Theories of Modularity in the Biological Literature

Apr 4, 2022, 12:48 PM
51 points
13 comments, 7 min read