Guillaume Corlouer
Karma: 116
An information-theoretic study of lying in LLMs
Annah and Guillaume Corlouer
2 Aug 2024 10:06 UTC
16 points, 0 comments, 4 min read
Degeneracies are sticky for SGD
Guillaume Corlouer and Nicolas Macé
16 Jun 2024 21:19 UTC
56 points, 1 comment, 16 min read
Understanding mesa-optimization using toy models
tilmanr, rusheb, Guillaume Corlouer, Dan Valentine, afspies, mivanitskiy and Can
7 May 2023 17:00 UTC
43 points, 2 comments, 10 min read
Metalignment: Deconfusing metaethics for AI alignment.
Guillaume Corlouer
23 Aug 2019 10:25 UTC
13 points, 7 comments, 3 min read