Most of my boundaries work so far has been focused on protecting boundaries “from the outside”. For example, maybe davidad’s OAA could produce some kind of boundary-defending global police AI.
But imagine parenting a child and protecting them by keeping them inside all day. That seems kind of lame. Something else you could do instead is not restrict the child at all, and help them become stronger and better at defending themselves.
So: you can defend boundaries “from the outside”, or you can empower those boundaries to be better at protecting themselves “from the inside”. (After all, if everyone could defend themselves perfectly, then we wouldn’t need AI safety, lol.)
Defending boundaries “from the inside” has the advantage of encouraging individual agents/moral patients to be more autonomous and sovereign.
I put some examples of what this might look like in Protecting agent boundaries:
Empower membranes to be better at self-defense
Empower the membranes of humans and other moral patients to be more resilient to collisions with threats. Examples:
Manipulation defense: You have an AI assistant that filters potentially-adversarial information for you.
Crime defense: Police have AI assistants that help them predict, deduce, investigate, and prevent crime.
Physical threat defense: (If nanotech works out) You have an AI assistant that shields you from physical threats.
Biological defense: Faster better vaccines, personal antibody printers, etc.
Cybersecurity defense: Good security practices and strong encryption. Software encryption can be arbitrarily strong.
Legal defense: personal AI assistants for e.g. interfacing with contracts and the legal system.
Bargaining: personal AI assistants for negotiation.
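On the cybersecurity item above: the claim that software encryption can be arbitrarily strong has a classic concrete instance, the one-time pad, which is information-theoretically unbreakable when the key is truly random, at least as long as the message, and never reused. A toy Python sketch (my own illustration, not anything from the linked posts):

```python
import secrets

def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the corresponding key byte.

    With a truly random, single-use key as long as the message,
    the ciphertext reveals nothing about the plaintext.
    """
    assert len(key) == len(data), "one-time pad key must match message length"
    return bytes(d ^ k for d, k in zip(data, key))

message = b"meet at noon"
key = secrets.token_bytes(len(message))  # fresh random key, used once

ciphertext = otp_xor(message, key)
recovered = otp_xor(ciphertext, key)  # XOR is its own inverse
assert recovered == message
```

Of course, the practical weak points are key distribution and everything around the encryption (endpoints, side channels), which is why “good security practices” are listed alongside it.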
Mark Miller and Allison Duettmann (Foresight Institute) outline more ideas in the form of “Active Shields” here: 7. DEFEND AGAINST PHYSICAL THREATS | Multipolar Active Shields. Cf. Engines of Creation by Eric Drexler.
Related: We have to Upgrade – Jed McCaleb
I’m looking to talk to people about the plausibility of empowering boundaries to defend themselves, and about cyborgism more broadly. If that’s you, let me know, or leave a comment if you know anyone who’s thinking about this.
Some thoughts:
First, it sounds like you might be interested in the idea of d/acc from this Vitalik Buterin post, which advocates for building a “defense-favoring” world. There are a lot of great examples of things we can do now to make the world more defense-favoring, but when it comes to strongly superhuman AI I get the sense that things get a lot harder.
Second, there doesn’t seem like a clear “boundaries good” or “boundaries bad” story to me. Keeping a boundary secure tends to impose some serious costs on the bandwidth of what can be shared across it. Pre-industrial Japan maintained a very strict boundary with the outside world to prevent foreign influence, and the cost was falling behind the rest of the world technologically.
My left and right hemispheres are able to work so well together because they don’t have to spend resources protecting themselves from each other. Good cooperative thinking among people likewise relies on trust making it possible to loosen boundaries of thought. Weakening borders between countries can massively increase trade, and that too relies on trust between the participating countries. The problem with AI is that we can’t give it that level of trust, so we need to build boundaries, but the ultimate cost seems to be that we eventually get left behind. Creating the perfect boundary that only lets in the good and never the bad, and doesn’t incur a massive cost, seems like a really massive challenge, and I’m not sure what that would look like.
Finally, when I think of Cyborgism, I’m usually thinking of it in terms of taking control over the “cyborg period” of certain skills, i.e. the period of time where human+AI teams still outperform either humans or AIs on their own. In this frame, if we reach a point where AIs broadly outperform human+AI teams, then barring some kind of coordination, humans won’t have the power to protect themselves from all the non-human agency out there (and it’s up to us to make good use of the cyborg period before then!).
In that frame, I could see “protecting boundaries” intersecting with cyborgism, for example in that AI could help humans perform better oversight and guard against disempowerment around the end of some critical cyborg period. Developing a cyborgism that scales to strongly superhuman AI has both practical challenges (like the kind Neuralink seeks to overcome), as well as requiring you to solve its own particular version of the alignment problem (e.g. how can you trust that the AI you are merging with won’t just eat your mind).
Hence “membranes”, a way to pass things through in a controlled way rather than either allowing or disallowing everything. In this sense absence of a membrane is a degenerate special case of a membrane, so there is no tradeoff between presence and absence of boundaries/membranes, only between different possible membranes. If the other side of a membrane is sufficiently cooperative, the membrane can be more permissive. If a strong/precise membrane is too costly to maintain, it should be weaker/sloppier.
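The point that absence of a membrane is just a degenerate special case can be made concrete with a toy sketch: model a membrane as a predicate that decides what crosses the boundary. Everything here (the `Membrane` type, the `trusted` field) is my own illustrative construction, not anything from the discussion above:

```python
from typing import Callable

# A membrane is just a rule for what gets to cross the boundary.
Membrane = Callable[[dict], bool]

def cross(membrane: Membrane, items: list[dict]) -> list[dict]:
    """Pass each item through the membrane, keeping only what it admits."""
    return [x for x in items if membrane(x)]

# Absence of a membrane is the degenerate case: it admits everything.
no_membrane: Membrane = lambda item: True

# A stricter membrane only admits items marked as trusted.
strict: Membrane = lambda item: item.get("trusted", False)

messages = [
    {"body": "hi", "trusted": True},
    {"body": "spam", "trusted": False},
]

everything = cross(no_membrane, messages)  # both messages pass
filtered = cross(strict, messages)         # only the trusted one passes
```

The tradeoff described above then lives in the choice of predicate: a more permissive membrane is cheap but lets more bad things through, while a more precise one costs more to specify and evaluate.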
yea