Just in case you haven’t seen them: «Boundaries/Membranes» and AI safety compilation, «Boundaries» for formalizing a bare-bones morality. But you seem to be talking about this as a membrane insulating cognition, which is something I haven’t thought of before… It’s an interesting idea, I think; I don’t know what to make of it. Do let me know if you get more thoughts on it :)
Cognition is the membrane: its sanity and alignment insulate the physical world, and its capability provides the option of having scarier things pass through. It’s an example of the membrane vs. boundary distinction, because the membrane is a physical machine, the AI, not some line in the sand. And if it lets through what it shouldn’t, the world dies (metaphorically for the world, literally for the people in it), so there is reason to maintain it in good condition. But it’s a weird example, because the other side of the membrane looks into the platonic realm, not into another physical location, and it selectively lets through ideas/designs/behaviors, not physical compounds. An analogous example would be a radio, a device made out of atoms that selectively listens to electromagnetic signals.
The proposed alignment technique is guarding against hallucinations at the level of the chatbot’s personality rather than only of the facts it voices, avoiding masks that have fictional personalities with fictional values. Not making up values strengthens the prior towards human values.
Oh, huh. I’m not sure that’s in the scope I mean with «membranes/boundaries».