Challenge 4: AI advances, including AGI, are not likely to be disruptively helpful for improving formal verification-based models until it’s too late.
Yes, this is our biggest challenge, I think. Right now very few people have experience with formal systems and historically the approach has been met with continuous misunderstanding and outright hostility.
In 1949, Alan Turing laid out the foundations for provably correct software (see "An Early Program Proof by Alan Turing"). What should have happened (in my book) is that computer science would have been recognized as a mathematical science capable of proving the correctness of its designs. And through the years, there have indeed been many who were inspired by that vision and made great and important contributions.
But, unfortunately, the field resisted correctness, and we got wave after wave of "sloppy programming" which continues to haunt us to this day. For example, the July 19, 2024 CrowdStrike incident has been called the largest IT outage in history, causing over $10 billion in financial damages. It was caused by a sloppy error in an update to security software running on Microsoft Windows. This is an outrage, and it has been an ongoing outrage almost since the time of Turing.
In another post, I mentioned a similar outrage in scientific computing. It has been known for many decades how to perform scientific computations with mathematical rigor, with guaranteed bounds on numerical error. And yet the standard practice remains "slap the code up against the wall and see if the outputs look reasonable". It is unknown how many scientific results, medical analyses, or engineering designs are flawed because people couldn't be bothered to perform their computations correctly.
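To make "mathematical rigor" concrete: interval arithmetic, known since at least the 1960s, runs a computation on intervals guaranteed to contain the true values, so the output carries a sound error bound. Here is a minimal sketch; the `Interval` class is a toy of my own, not any particular library (a production implementation, such as mpmath's `iv` context, must also direct the rounding of every floating-point operation outward):

```python
# A toy interval type: each value is an interval [lo, hi] guaranteed to
# contain the true real number, so error bounds propagate automatically.
class Interval:
    def __init__(self, lo: float, hi: float):
        assert lo <= hi
        self.lo, self.hi = lo, hi

    def __add__(self, other: "Interval") -> "Interval":
        # The sum of two enclosures encloses the sum.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other: "Interval") -> "Interval":
        # The product is bounded by the extreme corner products.
        corners = [self.lo * other.lo, self.lo * other.hi,
                   self.hi * other.lo, self.hi * other.hi]
        return Interval(min(corners), max(corners))

    def __repr__(self) -> str:
        return f"[{self.lo}, {self.hi}]"

# pi is irrational, so a program can only ever hold an enclosure of it.
pi = Interval(3.141592653589793, 3.141592653589794)
r = Interval(1.9, 2.1)  # a measured radius, known only to within 0.1
print(pi * r * r)       # an interval certain to contain the true area
```

Contrast that with ordinary floating point, which silently returns a single number with no indication of how far off it might be.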
Cryptography has similar issues. The mainstream is built on computational cryptography, whose vulnerability to powerful AI attack is unknown. Meanwhile, provably correct "Information-Theoretic Cryptography" languishes in a few academic conferences with very little use in practice.
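For a concrete instance of the information-theoretic approach, here is a sketch of the one-time pad, whose perfect secrecy Shannon proved in 1949: if the key is uniformly random, as long as the message, and never reused, the ciphertext is statistically independent of the plaintext, so no amount of computation, AI-powered or otherwise, can recover the message:

```python
import secrets

def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    # XOR each message byte with a key byte. With a uniformly random,
    # never-reused key of equal length, every plaintext is equally likely
    # given the ciphertext (Shannon's perfect secrecy theorem).
    assert len(key) == len(plaintext), "key must be as long as the message"
    return bytes(p ^ k for p, k in zip(plaintext, key))

otp_decrypt = otp_encrypt  # XOR is its own inverse

message = b"attack at dawn"
key = secrets.token_bytes(len(message))  # fresh random key, used once
ciphertext = otp_encrypt(message, key)
assert otp_decrypt(ciphertext, key) == message
```

The catch, of course, is key distribution, which is why the computational mainstream won out in practice; but the security guarantee itself is a theorem, not a conjecture about attacker hardware.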
So humanity is way behind where we should be on the formalization front. Our ancestors' sloppiness may now be our downfall. One hope, as you mention, is the rapid advancement of AI theorem proving, AI autoformalization, AI verified program synthesis, and other related areas.
It looks to me like we are now in a race between AI for safety and AI for unsafe capabilities. The more people who are aware of the issues, and of the potential solutions using the new formal technologies, the greater the chance we have to survive.
I think the safest world would be one where humanity does not create AI. Next would be one with a pause in AI development long enough to carefully create safe infrastructure. Next would be restricting powerful AI to a few highly regulated government labs. Next would be a world with tight controls on GPUs and datacenters, with no computational overhang. Unfortunately, it appears that all those ships have already sailed. We now have a huge computational overhang and powerful open-source AI models; the Llama 3.1 8B model can run on a $90 Raspberry Pi 5. So we have to rebuild the world with safe infrastructure in the face of rapidly improving and uncontrolled AI capabilities.
It appears that the massive needed change will only happen *after* large-scale AI-powered attacks and destruction begin. I think the greatest contribution to humanity's survival right now is to create detailed plans for building provably safe infrastructure, so that when the enabling technologies appear and the world begins demanding safe technology, there is a plan for moving forward.
There are enough places where provably-safe-against-physical-access hardware would be an enormous value-add that you don't need to wait for the world to demand safe technology for existential reasons before starting to work on it. Look at the demand for secure enclaves, which are not provably secure, but are "probably good enough because you are unlikely to have a truly determined adversary".
The easiest way to convince people that they, personally, should care more about provable correctness than about immediately-obvious practical usefulness is to demonstrate that provable correctness is possible, not too costly, and has clear benefits to them, personally.
I totally agree! I think this technology is likely to be the foundation of many future capabilities as well as of safety. What I meant was that society is unlikely to replace today's insecure and unreliable power grid controllers, train network controllers, satellite networks, phone systems, voting machines, etc. until some big event forces that. And if the community produces comprehensive provable-safety design principles, those are more likely to get implemented at that point.
My point was more that I expect there to be more value in producing provable-safety design demos and tutorials than in provable-safety design principles, because I think the issue is more "people don't know how, procedurally, to implement provable safety in the systems they build or maintain" than "people don't know how to think about provable safety, but if their philosophical confusion were resolved they wouldn't have many further implementation difficulties".
So having any examples at all would be super useful, and if you’re trying to encourage “any examples at all” one way of encouraging that is to go “look, you can make billions of dollars if you can build this specific example”.
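To make that concrete, here is a toy instance of the kind of demo I mean, in Lean 4; `clampedAdd` and its cap property are hypothetical examples of my own, but they show the shape of the thing: the safety property is checked by the proof assistant at compile time, not by testing:

```lean
-- A function that adds two numbers but never exceeds a safety cap.
def clampedAdd (cap a b : Nat) : Nat :=
  if a + b ≤ cap then a + b else cap

-- The safety property, machine-checked: the output never exceeds the cap.
theorem clampedAdd_le_cap (cap a b : Nat) : clampedAdd cap a b ≤ cap := by
  unfold clampedAdd
  split
  · assumption             -- case a + b ≤ cap: the result is a + b
  · exact Nat.le_refl cap  -- otherwise: the result is cap itself
```

Scaling guarantees like this from toy functions up to grid controllers and voting machines is exactly the hard procedural part, which is why worked demos and tutorials would carry more weight than principles alone.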