Alexandre Variengien
Karma: 634
My guess at Conjecture’s vision: triggering a narrative bifurcation
Alexandre Variengien
6 Feb 2024 19:10 UTC, 75 points, 12 comments, 16 min read
The case for training frontier AIs on Sumerian-only corpus
Alexandre Variengien, Charbel-Raphaël and Jonathan Claybrough
15 Jan 2024 16:40 UTC, 130 points, 15 comments, 3 min read
A Universal Emergent Decomposition of Retrieval Tasks in Language Models
Alexandre Variengien and Eric Winsor
19 Dec 2023 11:52 UTC, 84 points, 3 comments, 10 min read (arxiv.org)
Capture the Flag Mechanistic Interpretability Challenges
Alejandro Acelas and Alexandre Variengien
8 Sep 2023 23:00 UTC, 24 points, 0 comments, 7 min read
Input Swap Graphs: Discovering the role of neural network components at scale
Alexandre Variengien
12 May 2023 9:41 UTC, 92 points, 0 comments, 33 min read
An introduction to language model interpretability
Alexandre Variengien
20 Apr 2023 22:22 UTC, 14 points, 0 comments, 9 min read
Some common confusion about induction heads
Alexandre Variengien
28 Mar 2023 21:51 UTC, 64 points, 4 comments, 5 min read
Gliders in Language Models
Alexandre Variengien
25 Nov 2022 0:38 UTC, 30 points, 11 comments, 10 min read
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
KevinRoWang, Alexandre Variengien, Arthur Conmy, Buck and jsteinhardt
28 Oct 2022 23:55 UTC, 101 points, 9 comments, 9 min read, 2 reviews (arxiv.org)
Apply to the Machine Learning For Good bootcamp in France
Alexandre Variengien
17 Jun 2022 7:32 UTC, 10 points, 0 comments, 1 min read
Croesus, Cerberus, and the magpies: a gentle introduction to Eliciting Latent Knowledge
Alexandre Variengien
27 May 2022 17:58 UTC, 17 points, 0 comments, 16 min read