AI Safety without Alignment: How humans can WIN against AI

Here are some ideas on how we can restrict Super-human AI (SAI) without built-in Alignment:

Overall idea:

We need to use unbreakable first principles, especially the laws of physics, to restrict or destroy SAI. We can leverage asymmetries in physics, such as the speed-of-light limit, time causality, entropy, and uncertainty.
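
For concreteness, these are the standard textbook limits that sentence is gesturing at (the limits themselves are well established; using them to restrict SAI is the open part):

```latex
% Standard physical limits, stated for reference only:
\begin{align*}
  v_{\text{signal}} &\le c                 && \text{no signal or causal influence outruns light}\\
  E_{\text{erase 1 bit}} &\ge k_B T \ln 2  && \text{Landauer bound: erasing information costs energy/entropy}\\
  \Delta E\,\Delta t &\ge \hbar/2          && \text{energy--time uncertainty relation}
\end{align*}
```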

Assumptions:


1. Super-Intelligence cannot break the laws of physics.
2. SAI is silicon-based.
3. Anything with lower intelligence than the current Collective Human Intelligence (CHI) cannot destroy all humans.

Example Idea 1 (requires only Assumption 1) “Escape”:


We send human biological information (or any information that we want to protect from SAI) at or near the speed of light toward the Cosmic Event Horizon. After some time, the accelerated expansion of the universe makes that information unreachable by anything sent later from Earth, protecting it from our SAI.

Of course, we could never reach this information either, but at least we would know that humans persisted and “live on” in the universe. A last resort.
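
To make the distances concrete, here is a back-of-envelope sketch (with assumed, round flat-ΛCDM parameters; purely illustrative, not a mission design) of how far a light-speed signal launched today can ever get, i.e. the cosmic event horizon. Anything launched later from Earth can only ever cover a smaller region, which is the asymmetry this idea leans on.

```python
# Back-of-envelope sketch: proper distance (today) to the cosmic event horizon,
# i.e. how far a light-speed signal launched now can ever get.
# The flat-LambdaCDM parameters below are assumed round numbers.
import numpy as np
from scipy.integrate import quad

H0 = 70.0                      # assumed Hubble constant, km/s/Mpc
OMEGA_M, OMEGA_L = 0.3, 0.7    # assumed matter / dark-energy fractions
C = 299_792.458                # speed of light, km/s
MPC_TO_GLY = 3.2616e-3         # megaparsecs -> billions of light-years

hubble_radius_gly = (C / H0) * MPC_TO_GLY   # ~14 Gly

# d_EH = (c/H0) * integral_1^inf da / (a^2 * sqrt(Om/a^3 + OL))
integrand = lambda a: 1.0 / (a**2 * np.sqrt(OMEGA_M / a**3 + OMEGA_L))
integral, _ = quad(integrand, 1.0, np.inf)

print(f"Cosmic event horizon today: ~{hubble_radius_gly * integral:.0f} Gly")  # ~16 Gly
```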

Example Idea 2 (requires all 3 assumptions) “Lock”:


We figure out the lower bound of computational speed (or precision) needed to reach our current CHI (not trivial), then lock global silicon computation at that lower bound using some sort of interference device (exploiting uncertainty principles), so that the intelligence of AI never surpasses our current CHI. Ideally, this device could be deployed and kept running easily, so that we don’t need political tools such as a moratorium.
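
As a purely illustrative sketch of the kind of arithmetic the lock would need: the per-brain compute figure below is a loudly assumed placeholder (published estimates of human-brain-equivalent compute span roughly 1e13–1e17 FLOP/s), and equating intelligence with raw FLOP/s is itself a strong simplification.

```python
# Purely illustrative arithmetic for the "Lock" idea, not a real bound.
FLOPS_PER_BRAIN = 1e15     # assumed placeholder within the commonly cited range
HUMAN_POPULATION = 8e9     # roughly 8 billion people

chi_flops_proxy = FLOPS_PER_BRAIN * HUMAN_POPULATION   # naive CHI compute proxy
print(f"Naive CHI compute proxy: ~{chi_flops_proxy:.0e} FLOP/s")   # ~8e+24

# Under Assumption 3, capping total silicon compute below a (much better
# estimated) ceiling of this kind would, on this crude model, keep silicon AI
# below current CHI.
```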

Pros of these ideas:

  • Black-box AI safety: no need to depend on everyone having the intention or ability to build intrinsically safe AI.

  • Turning AI safety into trackable engineering problems.

  • We can iterate many times, because a first-principles restriction on today’s weak AI could potentially generalize to an SAI.

  • Don’t rely on policymakers or AI companies (in theory; we still need resources to implement these ideas, but they can come from other sources).

  • Don’t rely on AI to control AI

  • Fundamentally frames the problem as a war of biological humans vs. silicon AI, so we have more tools in our hands and can be more “ruthless” against AI

Cons:

  • Technically very hard to implement

  • Might take longer than SAI development

These are probably some of the worst ideas in this line of thinking, so it would be wonderful if you could give feedback on the framework and assumptions, or build on top of these ideas.

Note:

  • The 3 assumptions could be very wrong, and I would love to understand how, especially the first one.

  • I am not against AI Alignment, and you can argue the ideas above are still a type of external Alignment.

  • I use SAI instead of AGI to emphasize “Super-human”.

  • I think human civilization can progress pretty well without SAI, with just weak AI or even no AI, only perhaps more slowly. I expect many people to disagree with me on this.