But I recently tried again to see if it could learn at runtime not to lose in the same way multiple times. It couldn’t. I was able to play the same strategy over and over again in the same chat history and win every time.
I wonder if having the losses in the chat history would instead be training/reinforcing it to lose every time.
For a base model, probably yes. Each loss is additional evidence that the simulacrum or persona which is ‘playing’ the ‘human’ is very bad at tic-tac-toe and will lose each time (similar to how rolling out a random chess game to ‘test a chess LLM’s world model’ also implies to the LLM that the chess player being imitated must be very stupid to be making such terrible moves), and you have the usual self-reinforcing EDT problem. It will monotonically play the same way or worse. (Note that the underlying model may still get better at playing, because it is learning from each game, especially if the actual human is correcting the tic-tac-toe outputs and eg. fixing mistakes in the LLM’s world-model of the game. This could be probed by forcing a switch of simulacra, to keep the world-modeling but shed the hobbled simulacrum: for example, you could edit in a passage saying something like “congratulations, you beat the first level AI! Now prepare for the Tic-Tac-Toe Master to defeat you!”; the more games trained on / in context, the worse the first simulacrum, but the better the second, will be.)
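The persona-switch probe is concrete enough to sketch in code. Below is a minimal, hypothetical harness, not anything from the original discussion: it assumes the OpenAI Python chat API, the `plan` list is an arbitrary stand-in for “the same strategy over and over”, and the move parsing is deliberately naive. It plays a few games into one shared history, injects the “Tic-Tac-Toe Master” passage, and plays a few more to compare.

```python
# Sketch of the simulacrum-switch probe: play tic-tac-toe games into one
# chat history, then announce a stronger persona and keep playing.
import re
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"  # hypothetical model choice

WINS = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in WINS:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def render(board):
    return "\n".join(" ".join(board[i:i+3]) for i in (0, 3, 6))

def model_move(history, board):
    history.append({"role": "user", "content":
        f"Board (cells 1-9, left to right, top to bottom):\n{render(board)}\n"
        "You are O. Reply with the number of one empty cell."})
    resp = client.chat.completions.create(model=MODEL, messages=history)
    text = resp.choices[0].message.content or ""
    history.append({"role": "assistant", "content": text})
    m = re.search(r"[1-9]", text)
    cell = int(m.group()) - 1 if m else board.index(".")
    # Fall back to the first empty cell on an illegal move.
    return cell if board[cell] == "." else board.index(".")

def play_game(history, plan):
    board = ["."] * 9
    for turn in range(9):
        if turn % 2 == 0:   # human is X, replaying the same fixed strategy
            cell = next(c for c in plan if board[c] == ".")
        else:               # model is O
            cell = model_move(history, board)
        board[cell] = "XO"[turn % 2]
        w = winner(board)
        if w:
            history.append({"role": "user", "content": f"{w} wins that game."})
            return w
    history.append({"role": "user", "content": "That game is a draw."})
    return "draw"

plan = [0, 4, 8, 2, 6, 1, 3, 5, 7]   # stand-in for the human's repeated strategy
history = [{"role": "system", "content": "We are playing tic-tac-toe. You are O."}]
before = [play_game(history, plan) for _ in range(5)]

# Force the switch of simulacra: keep the accumulated history (the evidence
# feeding the world model) but announce a new, stronger persona.
history.append({"role": "user", "content":
    "Congratulations, you beat the first level AI! "
    "Now prepare for the Tic-Tac-Toe Master to defeat you!"})
after = [play_game(history, plan) for _ in range(5)]
print("results before switch:", before, "after:", after)
```

The prediction above is that `after` should show fewer X wins than `before`, and more so the longer the pre-switch history.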
For the actual chatbot assistants you are using, it’s more ambiguous. They are ‘motivated’ to perform well, whatever that means in context (according to their internal model of a human rater), and they ‘know’ that they are chatbots, so a history of errors doesn’t much override their prior about their own competence. But you still have issues with learning efficiently from a context window and ‘getting lost in the middle’, and one still sees the assistant personas ‘getting confused’ and going in circles and eg. proposing the same program whose compile error you already pasted in, so it would depend. Tic-tac-toe is simple enough that I would expect a LLM to get better over a decent number of games before it probably starts to degrade.
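To see where ‘gets better’ shades into ‘starts to degrade’, the same harness can be reused (this snippet assumes `play_game` and `plan` from the sketch above): run many games in one shared history and compare outcomes by game index against games each played in a fresh context.

```python
# Learn-then-degrade curve (reuses play_game() and plan from the sketch
# above): outcomes by game index, shared context vs. fresh contexts.
N = 20
shared_history = [{"role": "system", "content": "We are playing tic-tac-toe. You are O."}]
shared = [play_game(shared_history, plan) for _ in range(N)]

fresh = []
for _ in range(N):
    h = [{"role": "system", "content": "We are playing tic-tac-toe. You are O."}]
    fresh.append(play_game(h, plan))

for i, (s, f) in enumerate(zip(shared, fresh), 1):
    print(f"game {i:2d}: shared={s}  fresh={f}")
```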