Thane Ruthenis comments on Value Formation: An Overarching Model

Thane Ruthenis 4 Jan 2023 19:29 UTC
2 points
… there seems to be a selection effect where GPS-instances that don’t care about preserving future API-call-context get removed, leaving only subagent-y GPS-instances over time.
You’re not taking into account larger selection effects on agents, which select against agents that purge all those “myopic” GPS-instances. The advantage of shards and other quick-and-dirty heuristics is that they’re fast — they’re what you’re using in a fight, or when making quick logical leaps, etc. Agents which purge all of them, and keep only slow deliberative reasoning, don’t live long. Or, rather, agents which are dominated by strong deliberative reasoning tend not to do that to begin with, because they recognize the value of said quick heuristics.
In other words: not all shards/subagents are completely selfish and sociopathic; some, or even most, want select others kept around. So even those that don’t “defend themselves” can be protected by others, or never be targeted to begin with.
Examples:
- A “chips-are-tasty” shard is probably not so advanced as to have reflective capabilities, and e.g. a more powerful “health” shard might want it removed. But if you have some even more powerful preferences for “getting to enjoy things”, or a dislike of erasing your preferences for practical reasons, the health-shard’s attempts might be suppressed.
- A shard which implements a bunch of highly effective heuristics for escaping certain death is probably not one that any other shard/GPS instance would want removed.