I agree: if you’ve ever played any of the Pokemon games, it’s clear that a true uniform distribution over actions would not finish in any time a human could ever observe; the time required would have to be galactic. There are just way too many bottlenecks, long trajectories, and reset points, including various ways to near-guarantee (or guarantee?) failure, like discarding items or Pokemon. If you’ve looked at any Pokemon AI projects or even just Twitch Plays Pokemon, this becomes apparent: they struggle to get out of Pallet Town in a reasonable time, never mind the absurdity of playing through the rest of the game and beating the Elite Four etc., and that’s with much smarter move selection than pure random.
various ways to near-guarantee (or guarantee?) failure
Yep, you can guarantee failure by ending up in a softlocked state. One example of this is the Lorelei softlock where you’re locked into a move that will never run out, and the opposing Pokemon always heals itself long before you knock it out[1]. There are many, many ways you can do this, especially in generation 1.
Thanks for the correction! I’ve added the following footnote:
Actually, it turns out this hasn’t been done, sorry! A couple of RNG attempts were completed, but they involved some human direction/cheating. The point still stands only in the sense that, if Claude took more random/exploratory actions rather than carefully-reasoned shortsighted actions, he’d do better.
This is not true. It would take an absurd amount of time.
https://www.reddit.com/r/twitchplayspokemon/comments/1y7o7b/rng_plays_pokemon_is_a_joke_heres_why/
https://www.youtube.com/@winningsequence
You can get out of it, but with an absurdly low chance of ~1 in 68 quindecillion.
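To give a rough sense of scale for odds like these, here is a minimal back-of-the-envelope sketch. It is not how the ~1 in 68 quindecillion figure was derived. It only converts that figure into two easier-to-picture quantities, under assumptions that are mine rather than the thread's: 8 equally likely buttons (A, B, Start, Select, and four D-pad directions), and one independent escape attempt per frame at 60 attempts per second.

```python
import math

# "~1 in 68 quindecillion", using the short-scale quindecillion (10**48).
ODDS_AGAINST = 68 * 10**48

# Assumptions for illustration only (not from the thread):
BUTTONS = 8               # A, B, Start, Select, and four D-pad directions
ATTEMPTS_PER_SECOND = 60  # one independent attempt per frame

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def equivalent_presses(odds_against, buttons=BUTTONS):
    """Length of a fixed button sequence that uniform random input
    would hit with the same probability (1 / odds_against)."""
    return math.log(odds_against) / math.log(buttons)


def expected_years(odds_against, attempts_per_second=ATTEMPTS_PER_SECOND):
    """Expected waiting time for one success, treating attempts as
    independent trials with success probability 1 / odds_against.
    The mean of that geometric distribution is odds_against attempts."""
    return odds_against / attempts_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"equivalent exact presses: {equivalent_presses(ODDS_AGAINST):.1f}")
    print(f"expected years at 60/s:   {expected_years(ODDS_AGAINST):.2e}")
```

Under those assumptions, the odds are comparable to a uniformly random player guessing a specific sequence of roughly 55 button presses on the first try, and the expected wait is on the order of 10^40 years.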