Luke Bailey
Karma: 96
Stanford PhD Student
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Scott Emmons, Luke Bailey and Euan Ong
20 Sep 2023 15:23 UTC
58 points, 9 comments, 1 min read, LW link (arxiv.org)
Tensor Trust: An online game to uncover prompt injection vulnerabilities
Luke Bailey and qxcv
1 Sep 2023 19:31 UTC
30 points, 0 comments, 5 min read, LW link (tensortrust.ai)
Examples of Prompts that Make GPT-4 Output Falsehoods
scasper and Luke Bailey
22 Jul 2023 20:21 UTC
21 points, 5 comments, 6 min read, LW link