Protecting agent boundaries
If the preservation of an agent’s boundary is necessary for that agent’s safety, how can that boundary/membrane be protected?
How agent boundaries get violated
In order to protect boundaries, we must first understand how they get violated.
Let’s say there’s a cat, and it gets stabbed by a sword. That’s a boundary violation (a.k.a. membrane piercing). In order for that to have happened, three conditions must have been met:
There was a sword.
The cat and the sword collided.
The cat wasn’t strong enough to resist penetration from the sword.
More generally, in order for any existing membrane to be pierced, three conditions must have all been met:
There was a potential threat. (E.g., a sword, or a person with a sword.)
The moral patient and the threat collided.
The victim failed to adequately defend itself. (Because if the cat had been better at self-defense, with thicker skin or the ability to dodge, it would not have been successfully stabbed.)
Protecting agent boundaries
Each of these three conditions then implies a way of preventing boundary violations (a.k.a. membrane piercing); a toy sketch of how the three factors combine follows this list:
1. There was a potential threat.
→ Minimize potential threats
2. There was a collision.
→ Minimize dangerous collisions
→ Predict and prevent collisions before they occur.
→ Prevent collisions by putting distance between threats and moral patients.
→ Prevent premeditated collisions by pre-committing to retribution.
3. The victim failed to defend itself.
→ Empower the membranes of humans and other moral patients to be better at self-defense.
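To make the structure of this argument explicit, here is a toy sketch (my own illustration, not a formal model from this post): treat a piercing as requiring all three conditions, so the overall risk is roughly the product of three factors, and each class of intervention works by driving one factor toward zero. The function name and the example numbers are assumptions made up for illustration.

```python
# Toy model (illustrative assumptions only): a membrane is pierced only if
# all three conditions hold, so the risk factors multiply, and each class
# of intervention works by shrinking one factor.

def piercing_risk(p_threat: float,
                  p_collision_given_threat: float,
                  p_defense_fails_given_collision: float) -> float:
    """Rough chance that a given membrane gets pierced."""
    return p_threat * p_collision_given_threat * p_defense_fails_given_collision

baseline         = piercing_risk(0.10, 0.50, 0.80)  # no interventions
fewer_threats    = piercing_risk(0.01, 0.50, 0.80)  # condition 1: minimize threats
fewer_collisions = piercing_risk(0.10, 0.05, 0.80)  # condition 2: minimize collisions
better_defense   = piercing_risk(0.10, 0.50, 0.20)  # condition 3: empower self-defense

print(baseline, fewer_threats, fewer_collisions, better_defense)
```

Nothing here depends on the numbers; the point is only that any one factor going to zero makes the whole risk go to zero, which is why the three intervention families below are complementary rather than competing.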
How human societies already try to solve this problem
As a helpful analogy, here’s some examples of how modern human societies try to solve this problem:
Minimize potential threats
Restrict access to weapons (e.g., nukes, bioweapons, etc.)
Minimize potential perpetrators (e.g., some fictional societies predict and eliminate potential psychopaths).
Minimize dangerous collisions
Protect high-risk individuals, e.g. put them in witness protection.
Prevent collisions before they occur, e.g. predictive policing, traffic lights.
Police crimes after they occur.
Empower membranes to be better at self-defense
Infosec defense: Use good security practices and strong encryption (a minimal sketch follows this list).
Biological defense: Develop and use beneficial vaccines.
Manipulation defense: Reduce unhelpful cognitive biases and emotional insecurities.
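As a concrete, if minimal, illustration of the infosec point above: one standard practice is to authenticate messages with a shared secret, so an agent can tell whether data crossing its membrane was tampered with. This is my own toy example using Python's standard library, not something specified in this post.

```python
import hashlib
import hmac
import secrets

# Toy illustration of "good security practices": authenticate messages with
# a shared secret so tampering is detectable at the membrane.
key = secrets.token_bytes(32)  # shared secret (illustrative)

def sign(message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag for the message."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time to resist timing attacks."""
    return hmac.compare_digest(sign(message), tag)

msg = b"meet at noon"
tag = sign(msg)
assert verify(msg, tag)
assert not verify(b"meet at midnight", tag)  # tampered message is rejected
```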
How this applies to AI safety:
Minimize potential AI threats
(this is obvious/boring so I’m omitting it)
Minimize dangerous AI collisions
(this is obvious/boring so I’m omitting it)
Empower membranes to be better at self-defense
Empower the membranes of humans and other moral patients to be more resilient to collisions with threats. Examples:
Manipulation defense: You have an AI assistant that filters potentially-adversarial information for you (a toy sketch appears at the end of this section).
Crime defense: Police have AI assistants that help them predict, deduce, investigate, and prevent crime.
Physical threat defense: (If nanotech works out) You have an AI assistant that shields you from physical threats.
Biological defense: Faster, better vaccines, personal antibody printers, etc.
Cybersecurity defense: Good security practices and strong encryption. Software encryption can be arbitrarily strong.
Cf. writing about this from the Foresight Institute: (1), (2), (3)…
Legal defense: Personal AI assistants for, e.g., interfacing with contracts and the legal system.
Bargaining: Personal AI assistants for negotiation.
Human intelligence enhancement
Cyborgism
Mark Miller and Allison Duettmann (Foresight Institute) outline more ideas in the form of “Active Shields” here: 7. DEFEND AGAINST PHYSICAL THREATS | Multipolar Active Shields. Cf. Engines of Creation by Eric Drexler.
Related: We have to Upgrade – Jed McCaleb
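To make the “manipulation defense” example above slightly more concrete, here is a minimal hypothetical sketch of an assistant that screens incoming messages before they reach you. The `looks_manipulative` heuristic and the threshold are placeholders I made up for illustration; a real filter would presumably use a learned model rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical sketch of a "manipulation defense" assistant: score each
# incoming message and hold back anything that looks adversarial.
# The scoring heuristic and threshold below are made-up placeholders.

@dataclass
class Verdict:
    allowed: bool
    score: float
    reason: str

def looks_manipulative(text: str) -> float:
    """Placeholder scorer; a real assistant would use a trained classifier."""
    red_flags = ["act now", "don't tell anyone", "you must", "last chance"]
    hits = sum(flag in text.lower() for flag in red_flags)
    return min(1.0, hits / len(red_flags))

def filter_message(text: str, threshold: float = 0.25) -> Verdict:
    score = looks_manipulative(text)
    if score >= threshold:
        return Verdict(False, score, "held back as potentially manipulative")
    return Verdict(True, score, "passed through")

print(filter_message("Lunch on Tuesday?"))
print(filter_message("Act now, last chance, don't tell anyone!"))
```

The design point, in membrane terms: the filter sits at the boundary and strengthens condition 3 (self-defense) without needing to minimize threats or collisions elsewhere in the world.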