In 2016, DARPA held the final of its Cyber Grand Challenge. (What would Earth do without DARPA? Remember, DARPA also funded Moderna.) Systems had to demonstrate:
Automatic vulnerability finding on previously-unknown binaries
Automatic patching of binaries without sacrificing performance
Automatic exploit generation
And all three were demonstrated. Mayhem, by ForAllSecure, won the challenge. ForAllSecure is still in business, and Mayhem is available for sale.
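These capabilities rest on automated input generation and triage. As a rough illustration of the idea behind engines like Mayhem, here is a minimal coverage-guided fuzzing sketch; the toy target program, its branch "instrumentation," and the planted bug are all invented for this example:

```python
import random

def target(data: bytes) -> set:
    """Toy program under test: returns the set of branch IDs it covered.
    A planted bug is reachable only via a specific 3-byte prefix."""
    covered = {0}
    if len(data) > 0 and data[0] == ord("B"):
        covered.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            covered.add(2)
            if len(data) > 2 and data[2] == ord("G"):
                raise RuntimeError("crash: planted vulnerability reached")
    return covered

def mutate(seed: bytes) -> bytes:
    """Apply one random byte-level mutation: flip, insert, or delete."""
    data = bytearray(seed)
    op = random.choice(("flip", "insert", "delete"))
    if op == "insert":
        data.insert(random.randrange(len(data) + 1), random.randrange(256))
    elif op == "flip" and data:
        data[random.randrange(len(data))] = random.randrange(256)
    elif op == "delete" and len(data) > 1:
        del data[random.randrange(len(data))]
    return bytes(data)

def fuzz(max_iters: int = 200_000):
    """Coverage-guided loop: mutants that reach new branches become seeds."""
    corpus = [b"A"]
    seen = set()
    for _ in range(max_iters):
        candidate = mutate(random.choice(corpus))
        try:
            coverage = target(candidate)
        except RuntimeError:
            return candidate              # crashing input found
        if not coverage <= seen:          # new branch reached: keep as seed
            seen |= coverage
            corpus.append(candidate)
    return None

random.seed(0)
crasher = fuzz()
print(crasher)
```

The point of the coverage feedback is that inputs reaching a new branch are kept as seeds, so the search climbs toward the bug one branch at a time instead of having to guess the whole magic prefix blindly. Real engines add symbolic execution and much smarter mutation on binaries, but the loop is the same shape.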
Yep, I checked out Mayhem a while ago; it seems to be an automated test-case generation engine.
From what I read, they are probably not using Reverse Code Engineering (RCE) tools for their analysis; instead, they use an app's source code. Still, this is pretty good. Their software found some vulnerabilities, but honestly, I am a bit disappointed they didn't find more (likely because they don't go lower level). However, they deserve a lot of credit for being a pioneer in cyber defense. The Challenge was six years ago. So, is Mayhem all we have? No further progress, no breakthroughs?
My post was inspired by DeepMind's "Reward is Enough" and my (limited) experience doing RCE. (DeepMind's approach is applicable here, IMHO.) So I assume more defense innovations will be developed (offensive applications are probably not commercial), and that most of this happens behind closed doors. (Actually, I would believe it if I heard or read that the US has impressive cyberwar capabilities; Mayhem alone is not impressive enough.)
Unfortunately, Hacker-AI is more likely to be used as an attack tool; for defense, its results are too slow to deploy.
In my opinion, Hacker-AI forces us to make cyber defense proactive, not reactive. We must also think about preventing damage (or mitigating it quickly) and about building redundancy across different security methods.
Where did you read this? Mayhem absolutely uses reverse code engineering. Mayhem would be pretty uninteresting, and it wouldn't have won the DARPA Cyber Grand Challenge, if it required source code. From their FAQ:
Q. Does Mayhem for Code require source code?
A. No, Mayhem for Code does not require source code to test an application. Mayhem for Code works with both source code and binaries. Mayhem for Code finds vulnerabilities before and after software release.
Indeed, I didn't see that or forgot about it; I was answering from memory. So you might be right that Mayhem is doing RCE.
But what I remember distinctly from the DARPA challenge (again from memory): their "hacking environment" was a simplified sandbox, not a full CISC machine running a complex OS.
In the capture-the-flag (CTF) hacking competition against humans at DEF CON 2016, Mayhem finished last. That was six years ago; we don't know how good it is now (probably MUCH better, but it is still a commercial tool used in defense).
I am more worried about the attack tools developed behind closed doors. I recently read Bruce Schneier ("When AI Becomes the Hacker," May 14, 2021):
The key is harnessing AIs for defense, like finding and fixing all vulnerabilities in a software program before it gets released. "We'd then live in a world where software vulnerabilities were a thing of the past," he says.
I don't share that optimism. Why? Complexity creates a combinatorial explosion (an extremely large search space). What if vulnerabilities are like (bad) chess moves: could an AI tell us, in advance, all the mistakes we could make? I don't want to overstretch this analogy, but the question is: is hacking a finite game (with a chance that we can remove all vulnerabilities), or is it an infinite game?
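The combinatorial point can be made concrete with a back-of-the-envelope calculation; the rate of one billion tests per second is an invented assumption, but the conclusion is insensitive to it:

```python
# Exhaustively testing every possible input is hopeless even for tiny inputs.
# Assumption: a tester that checks one billion inputs per second.
TESTS_PER_SECOND = 10**9
SECONDS_PER_YEAR = 3600 * 24 * 365

for n_bytes in (4, 8, 16):
    space = 256 ** n_bytes  # number of distinct n-byte inputs
    years = space / (TESTS_PER_SECOND * SECONDS_PER_YEAR)
    print(f"{n_bytes:2d}-byte input: {space:.3e} cases, ~{years:.3e} years")
```

A 4-byte input can still be brute-forced in seconds, an 8-byte input already needs centuries, and a 16-byte input needs on the order of 10^22 years. That is why both defenders and attackers must prune the search space intelligently rather than enumerate it, and why "finding all vulnerabilities" is not a simple matter of more compute.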