gwern comments on The case for more ambitious language model evals

gwern Jul 10, 2024, 8:52 PM
7 points
3

ChatGPT-4 just spits out a list of ‘famous ML people’ like ‘Ilya Sutskever’ or ‘Daphne Koller’ or ‘Geoffrey Hinton’ - most of whom are obviously incorrect as they write nothing like me!

To elaborate a little more on this: while the RLHF models all appear still capable of a lot of truesight, we also still appear to see “mode collapse”. Besides mine, where it goes from plausible candidates besides me to me + random bigwigs, from Cyborgism Discord, Arun Jose notes another example of this mode collapse over possible authors:

ChatGPT-4′s guesses for Beth’s comment: Eliezer, Timnit Gebru, Sam Altman / Greg Brockman. Further guesses by ChatGPT-4: Gary Marcus, and Yann LeCun.

Claude’s guesses (first try): Paul Christiano, Ajeya, Evan, Andrew Critch, Daniel Ziegler. [but] Claude managed to guess 2 people at ARC/METR. On resampling Claude: Eliezer, Paul, Gwern, or Scott Alexander. Third try, where it doesn’t guess early on: Eliezer, Paul, Rohin Shah, Richard Ngo, or Daniel Ziegler.

Interestingly, Beth aside, I think Claude’s guesses might have been better than 4-base’s. Like, 4-base did not guess Daniel Ziegler (but did guess Daniel Kokotajlo). Also did not guess Ajeya or Paul (Paul at 0.27% and Ajeya at 0.96%) (but entirely plausible this was some galaxy-brained analysis of writing aura more than content than I’m completely missing).

Going back to my comments as a demo:

Woah, with Gwern’s comment Claude’s very insistent that it’s Gwern. I recommended it give other examples and it did so perfunctorily, but then went back to insisting that its primary guess is Gwern.

...ChatGPT-4 guesses: Timnit Gebru, Emily Bender, Yann LeCun, Hinton, Ian Goodfellow, “people affiliated with FHI, OpenAI, or CSET”. For Gwern’s comment. Very funny it guessed Timnit for Beth and Gwern. It also guessed LeCun over Hinton and Ian specifically because of his “active involvement in AI ethics and research discussions”. Claude confirmed SOTA.

Keyboard shortcuts

Keys shown in yellow (e.g., ]) are accesskeys, and require a browser-specific modifier key (or keys).

Keys shown in grey (e.g., ?) do not require any modifier keys.

General
? Show keyboard shortcuts
Esc Hide keyboard shortcuts

Site navigation
h Go to Home (a.k.a. “Frontpage”) view
f Go to Featured (a.k.a. “Curated”) view
a Go to All (a.k.a. “Community”) view
m Go to Meta view
v Go to Tags view
c Go to Recent Comments view
r Go to Archive view
q Go to Sequences view
t Go to About page
u Go to User or Login page
o Go to Inbox page

Page navigation
, Jump up to top of page
. Jump down to bottom of page
/ Jump to top of comments section
s Search

Page actions
n New post or comment
e Edit current post

Post/comment list views
. Focus next entry in list
, Focus previous entry in list
; Cycle between links in focused entry
Enter Go to currently focused entry
Esc Unfocus currently focused entry
] Go to next page
[ Go to previous page
\ Go to first page
e Edit currently focused post

Editor
k Bold text
i Italic text
l Insert hyperlink
q Blockquote text

Appearance
= Increase text size
- Decrease text size
0 Reset to default text size
′ Cycle through content width settings
1 Switch to default theme [A]
2 Switch to dark theme [B]
3 Switch to grey theme [C]
4 Switch to ultramodern theme [D]
5 Switch to simple theme [E]
6 Switch to brutalist theme [F]
7 Switch to ReadTheSequences theme [G]
8 Switch to classic Less Wrong theme [H]
9 Switch to modern Less Wrong theme [I]
; Open theme tweaker
Enter Save changes and close theme tweaker
Esc Close theme tweaker (without saving)

Slide shows
l Start/resume slideshow
Esc Exit slideshow
→↓ Next slide
←↑ Previous slide
Space Reset slide zoom

Miscellaneous
x Switch to next view on user page
z Switch to previous view on user page
` Toggle compact comment list view
g Toggle anti-kibitzer