Yeu-Tong Lau
Karma: 52
Understanding Positional Features in Layer 0 SAEs
bilalchughtai and Yeu-Tong Lau
29 Jul 2024 9:36 UTC, 43 points, 0 comments, 5 min read
An adversarial example for Direct Logit Attribution: memory management in gelu-4l
Can, Yeu-Tong Lau, James Dao and Jett Janiak
30 Aug 2023 17:36 UTC, 17 points, 0 comments, 8 min read (arxiv.org)