dkirmani comments on
Real-Time Research Recording: Can a Transformer Re-Derive Positional Info?
dkirmani
8 Nov 2022 11:07 UTC
I'm guessing this wouldn’t work without causal attention masking.
Neel Nanda
8 Nov 2022 12:32 UTC
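Yeah, I think without causal masking the model is purely symmetric across positions, so it would have nothing to re-derive positional info from.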
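
For anyone who wants to check the symmetry point numerically, here is a minimal sketch in plain Python. It assumes a single attention head with identity query/key/value projections and no positional embeddings; those simplifications, and the helper names `attention` and `equivariance_gap`, are mine for illustration and not taken from the post. The idea: permute the input tokens and see whether the outputs are permuted the same way. Without a mask they should be (up to floating-point error), and with a causal mask they generally should not be.

```python
import math
import random

def attention(x, causal):
    """Single-head self-attention with identity Q/K/V (a simplifying assumption)."""
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # Query i sees every position, or only positions <= i under a causal mask.
        keys = range(i + 1) if causal else range(n)
        scores = [sum(a * b for a, b in zip(x[i], x[j])) for j in keys]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([
            sum(w * x[j][k] for w, j in zip(weights, keys)) / total
            for k in range(d)
        ])
    return out

def max_abs_diff(a, b):
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

random.seed(0)
x = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]  # 5 tokens, dim 4
perm = [4, 2, 0, 3, 1]
x_perm = [x[p] for p in perm]

def equivariance_gap(causal):
    out = attention(x, causal)
    out_perm = attention(x_perm, causal)
    # Permutation-equivariant means out_perm[i] equals out[perm[i]] for every i.
    return max_abs_diff(out_perm, [out[p] for p in perm])

gap_unmasked = equivariance_gap(causal=False)
gap_causal = equivariance_gap(causal=True)
print("unmasked gap:", gap_unmasked)
print("causal gap:  ", gap_causal)
```

In the unmasked case each token's output depends only on the set of tokens, not their order, so the gap should be at floating-point noise level. The causal mask breaks that: the first token can only attend to itself, the second to two tokens, and so on, which is a position-dependent signal a model could in principle exploit.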