The main problem I see here is the generality of some of the more powerful niches. For example, a nanotech-designer AI can start by thinking only about molecular structures, but eventually it stumbles upon situations like "I need to design a nanotech swarm that is aligned with its constructor's goal", "what if I am a pile of computing matter that was created by other nanotech swarms (technically, all multicellular life is a multitude of nanotech swarms)?", or "what if my goal is not aligned with the goal of the nanotech swarm that created me?".
Is this a problem? I think the ontology addresses this. I'd have phrased what you just described as the agent exiting through an "opening" in the niche ((2) in the image).
If there's an attractor outside the enclosure (the 'what if' thoughts you mention count, I think, since they pull the agent towards states outside the niche), some force pushing the agent outwards (curiosity/search/information-seeking), and holes/openings in the boundary, then I expect unexpected failures from the agent finding novel solutions.
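To make that picture a bit more concrete, here's a toy sketch (a 1D biased random walk with made-up parameters, purely illustrative, not a claim about real agents): with an outward drive and an opening in the boundary, escape is essentially inevitable; with a sealed boundary, the same drive just presses the agent against the wall.

```python
import random

# Toy model of the niche/enclosure picture: an agent does a biased random walk
# inside a bounded "niche". An outward drive (curiosity/search pressure toward
# an attractor outside the enclosure) pushes it toward the boundary; if the
# boundary has an opening, the agent eventually exits. All numbers are made up.

def simulate(niche_radius=10.0, opening=True, outward_drive=0.2,
             steps=10_000, seed=0):
    rng = random.Random(seed)
    x = 0.0  # 1D position; the niche is the interval [-niche_radius, niche_radius]
    for t in range(steps):
        # Random exploration plus a constant pull toward the outside attractor.
        x += rng.gauss(0.0, 1.0) + outward_drive
        if abs(x) > niche_radius:
            if opening:
                return t  # agent found a way out: the "unexpected failure"
            # Sealed enclosure: reflect the agent back inside the boundary.
            x = max(-niche_radius, min(niche_radius, x))
    return None  # stayed inside for the whole run

print("with an opening, escapes at step:", simulate(opening=True))
print("sealed enclosure, escapes at step:", simulate(opening=False))
```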
It's a problem in the sense that you need to make your systems either weaker or very expensive (in terms of alignment tax; see, for example, davidad's Open Agency Architecture) relative to unconstrained systems.