Good stuff.
What’s “piercing”?
1. Is it piercing a membrane if I speak and it distracts you, but I don’t touch you otherwise?
2. What about if I destroy all your food sources but don’t touch your body?
3. What if you’re dying and I have a cure but don’t share it?
4. What if I enclose your house completely with concrete while you’re in it?
5. How about if I give you food you would have chosen to buy anyway, but I give it to you for free?
6. What about if I offer you a bad trade I know you’ll choose to make because of an ad you just saw?
7. What about if I’m the one showing you an ad rather than simply being in the right place at the right time to take advantage of someone else’s ad?
These are very good questions. First, two general clarifications:
A. «Boundaries» are not partitions of physical space; they are partitions of a causal graphical model that is an abstraction over the concrete physical world-model.
B. To “pierce” a «boundary» is to counterfactually (with respect to the concrete physical world-model) cause the abstract model that represents the boundary to increase in prediction error (relative to the best augmented abstraction that uses the same state-space factorization but permits arbitrary causal dependencies crossing the boundary).
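As a rough illustration of clarification B, here is a minimal toy sketch in Python. Everything in it is an assumption made for illustration: the linear dynamics, the coefficients, the names `simulate`, `excess_error` and `pierces`, and the tolerance are not from the comment above. The two models are also written by hand rather than fitted, so the "best augmented abstraction" is stood in for by the known true dynamics. The internal state is driven by a sensory signal `s` (a channel the abstraction represents) and by a direct push `p` (a channel that exists in the concrete world but not in the abstraction).

```python
import random

def simulate(steps, direct_push, speech_gain=1.0, seed=0):
    """Toy concrete world (assumed dynamics). Returns rows (i, s, p, i_next)."""
    rng = random.Random(seed)
    i = 0.0
    rows = []
    for _ in range(steps):
        s = speech_gain * rng.gauss(0.0, 1.0)  # information arriving through the senses
        p = direct_push * rng.gauss(0.0, 1.0)  # influence that bypasses the senses
        i_next = 0.5 * i + 0.3 * s + p + rng.gauss(0.0, 0.1)
        rows.append((i, s, p, i_next))
        i = i_next
    return rows

def abstract_model(i, s, p):
    # Boundary-respecting abstraction: the inside depends only on itself and the senses.
    return 0.5 * i + 0.3 * s

def augmented_model(i, s, p):
    # Same state-space factorization, but a dependency crossing the boundary is permitted.
    return 0.5 * i + 0.3 * s + p

def mse(rows, predict):
    return sum((predict(i, s, p) - i_next) ** 2 for i, s, p, i_next in rows) / len(rows)

def excess_error(rows):
    # Prediction error of the abstraction relative to the augmented abstraction.
    return mse(rows, abstract_model) - mse(rows, augmented_model)

def pierces(rows_with_action, rows_without_action, tolerance=0.05):
    # Counterfactual test: does the action increase the excess error?
    return excess_error(rows_with_action) - excess_error(rows_without_action) > tolerance

baseline = simulate(2000, direct_push=0.0)
louder_speech = simulate(2000, direct_push=0.0, speech_gain=2.0)
pushed = simulate(2000, direct_push=1.0)

print(pierces(louder_speech, baseline))  # uses only the sensory channel
print(pierces(pushed, baseline))         # uses the unrepresented channel
print(pierces(pushed, pushed))           # no additional counterfactual error
```

In this toy, stronger input through the represented sensory channel leaves the excess error unchanged, a direct push raises it, and repeating a push that was already present adds nothing counterfactually, which loosely mirrors the "already effectively imprisoned" case discussed in this thread.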
So, to your particular cases:
1. Probably not. There is no fundamental difference between sound and contact. Rather, the fundamental difference is between the usual flow of information through the senses and other flows of information that are possible in the concrete physical world-model but not represented in the abstraction. An interaction that pierces the membrane is one which breaks the abstraction barrier of perception. Ordinary speech acts do not. Only sounds which cause damage (internal state changes that are not well-modelled as mental states) or which otherwise exceed the “operating conditions” in the state space of the «boundary» layer (e.g. certain kinds of superstimuli) would pierce the «boundary».
2. Almost surely not. This is why, as an agenda for AI safety, it will be necessary to specify a handful of constructive goals, such as provision of clean water and sustenance and the maintenance of hospitable atmospheric conditions, in addition to the «boundary»-based safety prohibitions.
3. Definitely not. Omission of beneficial actions is not a counterfactual impact.
4. Probably. This causes prediction error because the abstraction of typical human spatial positions is that they have substantial ability to affect their position between nearby streets by simple locomotory action sequences. But if a human is already effectively imprisoned, then adding more concrete would not create additional/counterfactual prediction error.
5. Probably not. Provision of resources (that are within “operating conditions”, i.e. not “out-of-distribution”) is not a «boundary» violation as long as the human has the typical amount of control of whether to accept them.
6. Definitely not. Exploiting behavioural tendencies which are not counterfactually corrupted is not a «boundary» violation.
7. Maybe. If the ad’s effect on decision-making tendencies is well modelled by the abstraction of typical in-distribution human interactions, then using that channel does not violate the «boundary». Unprecedented superstimuli would, but the precedented patterns in advertising are already pretty bad. This is a weak point of the «boundaries» concept, in my view. We need additional criteria for avoiding psychological harm, including superpersuasion. One is simply to forbid autonomous superhuman systems from communicating with humans at all: any proposed actions which can be meaningfully interpreted by sandboxed human-level supervisory AIs as messages with nontrivial semantics could be rejected. Another approach is Mariven’s criterion for deception, but applying this criterion requires modelling human mental states as beliefs about the world (which is certainly not 100% scientifically accurate). I would like to see more work here, and a wider range of proposed approaches.
You’re sure this is the case even if the disease is about to violate the «boundary» and the cure will prevent that?
Unfortunately, forbidding autonomous superhuman systems from communicating with humans is probably not on the table, as they are currently being used as weapons in economic warfare between the USA, China, and everyone else. TikTok is primarily educational inside China. Advertisers have a direct incentive to violate such boundaries. We need a way to use «membranes» that will, on the margin, help protect against anyone violating them, rather than only ensuring that a given system avoids doing so itself.
He says a bit in this direction; see my other comment.
Here’s a tricky example I’ve been thinking about:
Is a cell getting infected by a virus a boundary violation?
What I think makes this tricky is that viruses generally don’t physically penetrate cell membranes. Instead, cells just “let in” some viruses (albeit against their better judgement).
Then once you answer the above, please also consider:
Is a cell taking in nutrients from its environment a boundary violation?
I don’t know what makes this different from the virus example (at least as long as we’re not allowed to refer to preferences).
I want to give a big +1 on preventing membrane piercing not just by having AIs respect membranes, but also by using technology to empower membranes to be stronger and better at self-defense.
Thanks for writing this! I largely agree (and I need to think more about the rest).
Edit: just see Davidad’s comment
Hmmm. It’s becoming apparent to me that I don’t want to regard membrane piercing as a necessarily objective phenomenon. Membrane piercing certainly isn’t always visible from every perspective.
That said, I think it’s still possible to prevent “membrane piercing”, even if whether it occurred can be somewhat subjective.
Responding to some of your examples:
“Is it piercing a membrane if I speak and it distracts you, but I don’t touch you otherwise?”
Again: I don’t actually care so much about whether this is or isn’t a membrane piercing, and I don’t want to make a decision on that in this case. Instead, I want to talk about what actions taken by which agents make the most sense for preventing the outcome if we do consider it to be a membrane piercing.
In most everyday cases, I think the best answer is “if someone’s actions are supposedly distracting you, you shouldn’t blame anyone else for distracting you, you should just get stronger and become less distractible”. I believe this because it can be really hard to know other agents’ boundaries, and if you just let other agents tell you your boundaries you can get mugged too easily.
However, in some cases, self-defense is in fact insufficient, and usually in these cases as a society we collectively agree that e.g. “no one should blow an airhorn in your ear; in this case we’re going to blame the person that did that”.
“What about if I destroy all your food sources but don’t touch your body?”
It depends on how far out we can find the membranes. For example, if the membranes go so far out as to include property rights then this could be addressed.
“What if I enclose your house completely with concrete while you’re in it?”
Again, this depends on how far out we go with the membranes; in this case, probably on how much of the law is included.
I sort of agree, but my food sources are not my property; they’re a farmer’s property.
I edited numbers into my questions; could you edit your response to number it and address each one?