Jessica Rumbelow
Karma: 1,137
AI researcher
Jessica Rumbelow’s Shortform
Jessica Rumbelow, 9 Aug 2024 16:27 UTC, 6 points, 1 comment, 1 min read
Why did ChatGPT say that? Prompt engineering and more, with PIZZA.
Jessica Rumbelow, 3 Aug 2024 12:07 UTC, 40 points, 2 comments, 4 min read
Introducing Leap Labs, an AI interpretability startup
Jessica Rumbelow, 6 Mar 2023 16:16 UTC, 103 points, 12 comments, 1 min read
SolidGoldMagikarp III: Glitch token archaeology
mwatkins and Jessica Rumbelow, 14 Feb 2023 10:17 UTC, 91 points, 35 comments, 16 min read
SolidGoldMagikarp II: technical details and more recent findings
mwatkins and Jessica Rumbelow, 6 Feb 2023 19:09 UTC, 113 points, 45 comments, 13 min read
SolidGoldMagikarp (plus, prompt generation)
Jessica Rumbelow and mwatkins, 5 Feb 2023 22:02 UTC, 679 points, 206 comments, 12 min read, 1 review
Guardian AI (Misaligned systems are all around us.)
Jessica Rumbelow, 25 Nov 2022 15:55 UTC, 15 points, 6 comments, 2 min read
The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard)
Jessica Rumbelow, 17 Nov 2022 11:06 UTC, 27 points, 2 comments, 2 min read
Why I’m Working On Model Agnostic Interpretability
Jessica Rumbelow, 11 Nov 2022 9:24 UTC, 27 points, 9 comments, 2 min read