cmathw · Karma: 81
Gated Attention Blocks: Preliminary Progress toward Removing Attention Head Superposition
cmathw, Dennis Akar and Lee Sharkey · 8 Apr 2024 11:14 UTC · 42 points · 4 comments · 15 min read · LW link
Polysemantic Attention Head in a 4-Layer Transformer
Jett Janiak, cmathw and StefanHex · 9 Nov 2023 16:16 UTC · 51 points · 0 comments · 6 min read · LW link