Luke Bailey
Karma: 96
Stanford PhD Student
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Scott Emmons, Luke Bailey and Euan Ong
20 Sep 2023 15:23 UTC, 58 points, 9 comments, 1 min read, LW link (arxiv.org)
Tensor Trust: An online game to uncover prompt injection vulnerabilities
Luke Bailey and qxcv
1 Sep 2023 19:31 UTC, 30 points, 0 comments, 5 min read, LW link (tensortrust.ai)
Examples of Prompts that Make GPT-4 Output Falsehoods
scasper and Luke Bailey
22 Jul 2023 20:21 UTC, 21 points, 5 comments, 6 min read, LW link