To be fairer here, part of the problem is that a boundary, in common usage, is non-arbitrary only to the extent that someone is limited in how much they can shift or control it, and thus type 1 and type 2 essentially merge.
Indeed, one of the most central points of embedded agency/physical universality is precisely that boundaries are in principle arbitrarily shiftable, so the current boundary has no ontological or otherwise special meaning. That is a big part of why I think the boundaries program isn’t a useful safety target: it’s too easy to change the boundary, as EAs have done, and there are other, better safety targets that don’t rely on the assumption of an unchanging boundary.