npostavs comments on RohanS’s Shortform

npostavs 5 Jan 2025 20:00 UTC
1 point

but I recently tried again to see if it could learn at runtime not to lose in the same way multiple times. It couldn’t. I was able to play the same strategy over and over again in the same chat history and win every time.

I wonder if having the losses in the chat history would instead act as in-context examples, training/reinforcing it to lose every time.