Ollie J

Karma: 186

Ollie J Jun 13, 2024, 12:15 PM
2 points
0
in reply to: gw’s comment on: [Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Fixed, thanks for flagging

[Paper] AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij, Felix Hofstätter, Ollie J, Sam F. Brown and Francis Rhys Ward

Jun 13, 2024, 10:04 AM

84 points

10 comments, 2 min read, LW link

(arxiv.org)

Tall Tales at Different Scales: Evaluating Scaling Trends For Deception In Language Models

Felix Hofstätter, Francis Rhys Ward, HarrietW, LAThomson, Ollie J, Patrik Bartak and Sam F. Brown

Nov 8, 2023, 11:37 AM

49 points

0 comments, 18 min read, LW link

ChatGPT banned in Italy over privacy concerns

Ollie J Mar 31, 2023, 5:33 PM

18 points

4 comments, 1 min read, LW link

(www.bbc.co.uk)

Ollie J Feb 24, 2023, 8:36 PM
1 point
0
on: Meta “open sources” LMs competitive with Chinchilla, PaLM, and code-davinci-002 (Paper)
The link to the GitHub repo is broken: it includes the trailing comma.

Whisper’s Wild Implications

Ollie J Jan 3, 2023, 12:17 PM

19 points

6 comments, 5 min read, LW link

Ollie J Nov 23, 2022, 1:25 PM
9 points
2
on: Human-level Diplomacy was my fire alarm
I wonder how it would update its strategies if you negotiated in an unorthodox way:
- “If you help me win, I will donate £5000 across various high-impact charities”
- “If you don’t help me win, I will kill somebody”

Ollie J Jun 16, 2022, 2:14 PM
29 points
on: Contra Hofstadter on GPT-3 Nonsense
Many articles like this are littered throughout the internet: the authors perform a surface-level analysis, ask GPT-3 some question (usually basic arithmetic), then point at the wrong answer and draw a conclusion (“GPT-3 is clueless”). They almost never state the parameters of the model used or give the whole input prompt.
GPT-3 is very capable of saying “I don’t know” (or “yo be real”), but due to its training dataset it likely won’t say so of its own accord.
GPT-3 is not an oracle or some other kind of agent. GPT-3 is a simulator of such agents. To get GPT-3 to act as a truthful oracle, explicit instruction must be given in the input prompt to do so.
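A minimal sketch of what such an explicit instruction could look like, assuming a plain text-completion interface. The function name, the instruction wording, and the Q/A layout are all illustrative assumptions, not something taken from the original post; the sketch only builds the prompt string and does not call any model.

```python
from typing import List, Optional, Tuple

# Hypothetical instruction text: it tells the model which "agent" to simulate,
# namely one that answers truthfully and is allowed to decline.
INSTRUCTION = (
    "Answer each question truthfully. "
    'If the question is nonsense or the answer is unknown, reply "I don\'t know".'
)


def build_oracle_prompt(
    question: str,
    examples: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Return a completion prompt that starts with the explicit instruction,
    followed by optional worked Q/A examples and the new question."""
    lines = [INSTRUCTION, ""]
    for example_question, example_answer in examples or []:
        lines += [f"Q: {example_question}", f"A: {example_answer}", ""]
    # End on "A:" so the completion continues as the answer.
    lines += [f"Q: {question.strip()}", "A:"]
    return "\n".join(lines)


prompt = build_oracle_prompt(
    "How many eyes does the sun have?",
    examples=[("What is 2 + 2?", "4")],
)
```

Reporting the full string returned here, together with the model parameters, is the kind of detail the articles criticised above tend to leave out.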

Ollie J Mar 31, 2022, 8:46 AM
6 points
on: Meta wants to use AI to write Wikipedia articles; I am Nervous™
I’m positive that as these language models become more accessible and powerful, their misuse will grow massively. However, I believe open sourcing is the best option here; having access to such models allows us to create accurate automatic classifiers that detect their outputs. Media websites (e.g. Wikipedia, Twitter) could include such a classifier in their pipeline for submitting new media.
Making such technologies closed source leaves researchers in the dark; due to the scaling-transformer hype, only a tiny fraction of the world’s population has the financial means to train a SOTA transformer model.
