I would love to hear what shard theorists make of this.
We could describe this AI as having learned a meta-shard—pace around at the start so that you have time to plan.
But at the point where we’ve allowed meta-shards, maybe we’ve already undermined the main claims of shard theory?
Maybe in this case it’s a “confusion” shard? While it seems to plan and to produce optimizing behavior, it’s not clear that it will behave as a utility maximizer.