A task like this, at which the AI is lousy but not hopeless, is an excellent feedback signal for RL. It’s also an excellent feedback signal for “grad student descent”: have a human add mechanisms, and see if Claude gets better. This is a very good sign for capabilities, unfortunately.
It’s quite easy to use Pokémon playing as a feedback signal for getting better at playing Pokémon. If you do that naively, the AI would learn how to solve that particular game but wouldn’t necessarily develop executive function.
A task like computer programming, where you have to find many different solutions, likely provides a better feedback signal for RL.
True. I was generalizing it to a system that tries to solve lots of Pokémon-like tasks in various artificial worlds, rather than just expecting it to solve Pokémon over and over. But I didn’t say that; I just imagined it and assumed everyone else would too. Thank you for making it explicit!
It depends on how many Pokémon-like tasks are available. Given that a lot of capital goes into creating each Pokémon game, there aren’t that many Pokémon games. I would expect the number of games that are very Pokémon-like to also be limited.
When I say Pokémon-type games, I don’t mean games recounting the adventures of Ash Ketchum and Pikachu. I mean games with a series of obstacles set in a large semi-open world, with things you can carry, a small set of available actions at each point, and a goal of progressing past the obstacles. Such games can be manufactured in unlimited quantities by a program. They can also be “peopled” by simple LLMs, for increased complexity. They don’t actually have to be fun to play or look at, so the design requirements are loose.
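To make this concrete, here is a minimal sketch of what a procedural generator for such games could look like. The world layout, item names, and action set below are all assumptions invented for illustration, not a description of any existing system:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Obstacle:
    """A gate the player must pass; opened by carrying the right item."""
    position: tuple
    required_item: str

@dataclass
class World:
    size: int
    obstacles: list = field(default_factory=list)
    items: dict = field(default_factory=dict)   # position -> item name
    goal: tuple = (0, 0)

def generate_world(size=20, n_obstacles=5, seed=None):
    """Procedurally generate one Pokémon-like world: a series of gates,
    each unlocked by an item hidden elsewhere on the map."""
    rng = random.Random(seed)
    world = World(size=size, goal=(size - 1, size - 1))
    for i in range(n_obstacles):
        item = f"key_{i}"
        item_pos = (rng.randrange(size), rng.randrange(size))
        gate_pos = (rng.randrange(size), rng.randrange(size))
        world.items[item_pos] = item
        world.obstacles.append(Obstacle(position=gate_pos, required_item=item))
    return world

# A small fixed action set at every step, as described above.
ACTIONS = ["up", "down", "left", "right", "pick_up", "use_item"]

if __name__ == "__main__":
    # Generate as many distinct worlds as you like by varying the seed.
    worlds = [generate_world(seed=s) for s in range(1000)]
    print(len(worlds), "worlds generated")
```

Since these worlds never need to be fun or pretty, the only real constraint on the generator is that each obstacle remains passable, which is easy to guarantee by construction.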
There have been attempts at reinforcement learning using unlimited computer-generated games. They haven’t worked that well. I think the key feature that favors Pokémon-like games is that when the player dies or gets stuck, they can go back to the beginning and try again. This rewards trial-and-error learning to get past obstacles, keeping a long-term memory, and re-planning your approach when something doesn’t work. These are capabilities in which current LLMs are notably lacking.
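As a rough illustration of that reset-and-retry structure, here is a sketch of an outer loop in which the world resets on every attempt while the agent’s notes persist across attempts. The `env` and `agent` interfaces are hypothetical placeholders, not a real API:

```python
def run_with_retries(env, agent, max_attempts=50):
    """The world resets each attempt, but the agent's notes (long-term
    memory) carry over, so lessons from failed attempts inform the next plan."""
    notes = []  # long-term memory carried across attempts
    for attempt in range(max_attempts):
        obs = env.reset()                    # back to the beginning after death/getting stuck
        plan = agent.make_plan(obs, notes)   # re-plan using everything learned so far
        done, progress = False, 0
        while not done:
            action = agent.act(obs, plan, notes)
            obs, reward, done, info = env.step(action)
            progress += reward
        notes.append(agent.summarize_attempt(attempt, plan, progress))
        if info.get("goal_reached"):
            return attempt, notes
    return None, notes
```

The point of the structure is that the only thing the agent gets to keep between attempts is its memory, so progress depends entirely on how well it records, summarizes, and revises what it learned.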
Another way of putting Claude’s missing skill: managing long-term memory. You need to remember the important stuff, forget the minor stuff, summarize things, and recognize when a conclusion in your memory is wrong and needs correction.
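Here is a minimal sketch of what those memory-management operations might look like as a data structure, with an importance score for forgetting and a revision hook for wrong conclusions. The class and field names are invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class MemoryEntry:
    text: str
    importance: float   # how much it matters for future decisions

@dataclass
class LongTermMemory:
    """The operations listed above: remember important things,
    forget minor ones, summarize, and revise wrong conclusions."""
    entries: list = field(default_factory=list)
    capacity: int = 100

    def remember(self, text, importance):
        self.entries.append(MemoryEntry(text, importance))
        if len(self.entries) > self.capacity:
            self.forget_minor()

    def forget_minor(self):
        # Drop the least important entries to stay under capacity.
        self.entries.sort(key=lambda e: e.importance, reverse=True)
        del self.entries[self.capacity:]

    def summarize(self, summarizer):
        # Collapse many entries into a shorter digest (e.g. via an LLM call).
        return summarizer(" ".join(e.text for e in self.entries))

    def revise(self, index, corrected_text):
        # Replace a conclusion that new evidence shows was wrong.
        self.entries[index] = MemoryEntry(corrected_text, self.entries[index].importance)
```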