Chris_Leong comments on Pacing Outside the Box: RNNs Learn to Plan in Sokoban

Chris_Leong 26 Jul 2024 16:14 UTC
LW: 2 AF: 1
I would love to hear what shard theorists make of this.

We could describe this AI as having learned a meta-shard: pace around at the start so that you have time to plan.

But at the point where we’ve allowed meta-shards, maybe we’ve already undermined the main claims of shard theory?
- Adrià Garriga-alonso 26 Jul 2024 21:07 UTC
  LW: 1 AF: 1
  Maybe in this case it’s a “confusion” shard? While it seems to plan and to produce optimizing behavior, it’s not clear that it will behave as a utility maximizer.