I agree with about everything you said as well as several more criticisms along those lines you didn’t say. I am probably more familiar with these issues than anyone else on this website with the possible exception of Jason Gross.
Now, suppose we can magic all that away. How much then will this reduce AI risk?
As others have written, I think you have to get very close to perfection before you get much of a win against the kind of AGI everybody on here is worried about, because you have to assume that it can find very subtle bugs. Also, if you assume it has access to the Internet or any other large selection of targets, it will attack the thing that has not been hardened… so you have to get everything hardened before this very smart adversary pops up.
But it sure can’t hurt. And it would help other stuff, too.
Hey, can I ask an almost unrelated question that you’re free to ignore or answer as a private message OR answer here? How good is formal verification for time and space these days?
I must disagree with the first claim. Defense-in-depth is very much a thing in cybersecurity. The whole “attack surface” idea assumes that, if you compromise any application, you can take over an entire machine or network of machines. That is still sometimes true, but continually less so. Think it’s game over if you get root on a machine? Not if it’s running SELinux.
Hey, can I ask an almost unrelated question that you’re free to ignore or answer as a private message OR answer here? How good is formal verification for time and space these days?
I can speak only in broad strokes here, as I have not published in verification. My publications are predominantly in programming tools of some form, mostly in program transformation and synthesis.
There are two main subfields that fight over the term “verification”: model checking and mechanized/interactive theorem proving. This is not counting people like Dawson Engler, who write very unsound static analysis tools but call it “verification” anyway. I give an ultra-brief overview of verification in https://www.pathsensitive.com/2021/03/why-programmers-shouldnt-learn-theory.html
I am more knowledgable about mechanized theorem proving, since my department has multiple labs who work in this area and I’ve taken a few of their seminars. But asking about time/space of verification really just makes sense for the automated part. I attended CAV in 2015 and went to a few model checking talks at ICSE 2016, and more recently talked to a friend on AWS’s verification team about what some people there are doing with CBMC. Okay, and I guess I talked to someone who used to do model checking on train systems in France just two days ago. Outside of that exposure, I am super not-up-to-date with what’s going on. But I’d still expect massive breakthroughs to make the news rounds over to my corner of academia, so I’ll give my sense of the status quo.
Explicit state enumeration can crush programs with millions or billions of states, while symbolic model checking routinely handles $10^100$ states.
Those are both very small numbers. To go bigger, you need induction or abstraction, something fully automated methods are still bad at.
Yes, we can handle exponentially large things, but the exponential still wins. There’s a saying of SAT solvers “either it runs instantly or it takes forever.” I believe this is less true of model checking, though still true. (Also, many model checkers use SAT.)
If you want to model check something, either you check a very small program like a device driver, or you develop some kind of abstract model and check that instead.
Hmm. It looks like my reply notifications are getting batched now. I didn’t realize I’d set that up.
I’ve reordered some of this, because the latter parts get into the weeds a lot and may not be worth reading. I advise that anybody who gets bored stop reading there, because it’s probably not going to get more interesting.
For background, I haven’t been doing security hands-on for the last few years, but I did it full time for about 25 years before that, and I still watch the space. I started out long enough ago that “cyber” sets my teeth on edge...
State of practice in defense
[...] assumes that, if you compromise any application, you can take over an entire machine or network of machines. That is still sometimes true, but continually less so.
Well yes but… not that much less. A lot of what’s done is, shall we say, “aspirational”, and a lot of the rest works much better to reduce the rate of damage from human adversaries than it would to resist a total takeover from AGI that had decided a given system was on the critical path for its success.
Today, if you’re a real-world organization with a significant IT infrastructure, and you hire a skilled human penetration tester (or team), and you give them a reasonable amount of time, and you don’t set artificially limiting “rules of engagement”, they will almost always reach whatever objectives you set, by default full administrative access to all or almost all of your systems.
All the changes over the last decade or two have not, in the end, appreciably reduced the chance of being “owned” by (sufficiently motivated) humans. The cost will be higher, but the probability of success is still close to one if somebody is willing to pay that cost.
And the cost isn’t really prohibitive. It’s more the kind of cost increase that redirects an attacker to another victim than the kind that convinces them to go out of the penetrating business altogether. There are just too many possible ways to get in. The limiting factors are attacker time, expertise, and motivation.
In the AGI scenario we’re talking about, all three of those limits presumably get a lot less restrictive.
… and that’s on the “big machines” with professional admins. Out on some random IOT device, you’re lucky if you have process memory isolation (or the hardware to support it), let alone SELinux. And you probably don’t get regular updates either, especially after a couple of years.
The state of practice in the legacy systems is still really, really bad. I don’t think there’s been any qualitative change, nor any reason to expect qualitative change. And administrative configuration will probably be outside the domain of formal verification, at least at first.
So if legacy systems or bad configurations can be used as a path around whatever is formally proved, they will be.
Attack surface
The whole “attack surface” idea assumes that, if you compromise any application, you can take over an entire machine or network of machines.
There’s no “squishy center” assumption in the idea of attack surface. You have a certain attack surface for each compartment. Taking one compartment exposes attack surfaces of other compartments. You find a route through a series of compartments to some objective or objectives. Often a shorter route than one might think.
You do, however, get to try to route around anything you can’t break through. At any given time you have quite a lot of potentially productive attack surface available to you,. You can choose whatever next step you think has the best chance of getting you what you need most easily. And you usually get to try again even if you’re detected.
Kinds of defense in depth
Defense-in-depth is very much a thing in cybersecurity.
Defense in depth means different things to different people.
The original military concept, from trench warfare, is actually a bad metaphor for computing security, where your enemy never “takes losses”, doesn’t have to cross “exposed terrain” in any meaningfully similar way, has almost no logistical needs, and isn’t generally delayed twice by the same terrain.
The closest analog to the military version is what “defense in depth” usually means to the operational security people in, say, an IT department. You put layers of barriers in place so that after breaching one layer, the attacker has to do new reconnaissance and planning before being able to move to the next. Almost all of the value of that is in delay, and it relies on assuming that the attacker needs human-scale amounts of time at each stage.
On the software development side, “defense in depth” in computing can also mean having more than one mechanism meant to outright stop your adversary, so that if they find a bug in one, they’re still totally stymied by another.
To do the first kind properly, you need to have measures that slow the adversary down “enough” at critical stages. To do the second kind properly, you have to have at least one measure that completely stops the adversary by itself; it’s just that you don’t know which measure it will be.
Defense in depth versus AGI
Both kinds of defense in depth suffer badly when your presumed adversary is a “canonical” AGI. It has superhuman speed, so even if you delay it it will still be moving faster than you can react. And it has a superhuman ability to find bugs, which means that it’s likely to find ways through all of your theoretically redundant protections.
I’m not modeling the AGI as magic here, just weakly superhuman. Even normal humans find ways into things all the time. If the AGI even manages to be significantly faster, without an absolute improvement in bug-finding, that’s probably enough.
Automated response won’t fix the speed problem, by the way. People have been selling that for ages, but they haven’t made it work. In fact, it can often be exploited for an attacker’s own ends. If it were built, automated response against an AGI “as smart as a human, but faster” would of course itself probably have to be “as smart as a human, but faster”… which isn’t available until after you need it.
More on the attacker’s task
Think it’s game over if you get root on a machine?
As of today, for the vast majority of machines, YES.
If you’re not checkmated, you are at least at the point where a graceful player would resign.
My “game over” objectives in attacking a machine may not even require me to get root at all, let alone defeat SELinux or whatever. If I’m attacking your database server because I want the data, and I compromise the database itself, I don’t need root. Running an application in a VM may keep me from getting total control of the hardware it’s running on, but if the application itself is my target, then I don’t care. People forget that too easily.
Even if you assume that I do need root, if I have access to a large supply of zero-days, then it’s game over if I can even get a shell. Based on experience with them being found, kernel zero-days always exist. Those give you root and evade SELinux at the same time. The adversary we’re worried about has to be presumed able to find them very frequently indeed.
You can have all of the intra-machine security controls you want, and it won’t matter if they’re all enforced using the same few mechanisms, which it turns out I can evade. Once I know how to penetrate the kernels you actually use and escape the hypervisors you actually use, it doesn’t matter what you put on top of them. Not that I’m saying those are the only things you can attack; they definitely are not. But owning them obviates almost all your other layers of defense.
Between machines, you can have all the network isolation rules you want, and it won’t matter if I can control the devices enforcing them. Dedicated firewalls actually have really bad security track records if you look at them as targets in themselves. You can limit the available network services, but that doesn’t matter if the adversary can penetrate the ones you do use.
Add another couple of things for me to break, and I still won’t have to break everything. There’s not necessarily so much depth there as it might seem.
… and I don’t even need any of that if the system is misconfigured, which it often is. On most real systems, even in pretty well run corporate or government environments, I can expect to find a credential lying around in an unprotected file, or a sudo configuration that lets me run a command that’s too trusting, or an application that “has to have” access to something that lets me move toward my ultimate goal, or whatever.
I would expect to personally be able to get total control of almost any actual production machine where I could get a shell, if I were willing to concentrate on it for weeks or months, and to be able to hop from there to other parts of the network if it belonged to a large organization. I would not expect SELinux to stop me. I could fail, but that would not be the most probable case. I’m far from an up-to-date professional-grade penetration specialist, and even further from a superhuman AGI. Either of those would be a lot faster than I would be and have a higher chance of success.
I expect it to stay that way for decades, for much the same reasons that I don’t expect to be able to deploy formal methods quickly everywhere.
We’re always talking about attacking the weakest targets (that will help our objectives), never the strongest. Including, of course, the targets with the worst human management.
SELinux
Not if it’s running SELinux.
So on the example of SELinux (and similar stuff like AppArmor): Many, probably still most, sysadmins still disable SELinux, even when it’s on by default in their distributions, because they find it inconvenient and/or don’t understand it. SELinux policies are themselves complicated, ad-hoc lists of constraints on the behavior of incompletely understood programs, and aren’t always expressive enough to distinguish things that must be allowed from things that give away dangerous amounts of control. And SELinux is, of course, implemented inside an enormous ultra-complicated monolithic kernel, with which SELinux itself has an almost fractal interface. And, in the end, the attacker may be able to reach the objective within the intended constraints set by SELinux anyway.
Really appreciate this informative and well-written answer. Nice to hear from someone on the ground about SELinux instead of the NSA’s own presentations.
I phrased my question about time and space badly. I was interested in proving the time and space behavior of the software “under scrutiny”, not in the resource consumption of the verification systems themsvelves.
It would be nice to be able to prove things like “this program will never allocate more than X memory”, or “this service will always respond to any given request within Y time”.
I phrased my question about time and space badly. I was interested in proving the time and space behavior of the software “under scrutiny”, not in the resource consumption of the verification systems themsvelves.
LOL!
I know a few people who have worked in this area. Jan Hoffman and Peng Gong have worked on automatically inferring complexity. Tristan Knoth has gone the other way, including resource bounds in specs for program synthesis. There’s a guy who did an MIT Ph. D. on building an operating system in Go, and as part of it needed an analyzer that can upper-bound the memory consumption of a system call. I met someone at CU Boulder working under Bor-Yuh Evan Chang who was also doing static analysis of memory usage, but I forget whom.
So, those are some things that were going on. About all of these are 5+ years old, and I have no more recent updates. I’ve gone to one of Peng’s talks and read none of these papers.
I agree with about everything you said as well as several more criticisms along those lines you didn’t say. I am probably more familiar with these issues than anyone else on this website with the possible exception of Jason Gross.
Now, suppose we can magic all that away. How much then will this reduce AI risk?
As others have written, I think you have to get very close to perfection before you get much of a win against the kind of AGI everybody on here is worried about, because you have to assume that it can find very subtle bugs. Also, if you assume it has access to the Internet or any other large selection of targets, it will attack the thing that has not been hardened… so you have to get everything hardened before this very smart adversary pops up.
But it sure can’t hurt. And it would help other stuff, too.
Hey, can I ask an almost unrelated question that you’re free to ignore or answer as a private message OR answer here? How good is formal verification for time and space these days?
I must disagree with the first claim. Defense-in-depth is very much a thing in cybersecurity. The whole “attack surface” idea assumes that, if you compromise any application, you can take over an entire machine or network of machines. That is still sometimes true, but continually less so. Think it’s game over if you get root on a machine? Not if it’s running SELinux.
I can speak only in broad strokes here, as I have not published in verification. My publications are predominantly in programming tools of some form, mostly in program transformation and synthesis.
There are two main subfields that fight over the term “verification”: model checking and mechanized/interactive theorem proving. This is not counting people like Dawson Engler, who write very unsound static analysis tools but call it “verification” anyway. I give an ultra-brief overview of verification in https://www.pathsensitive.com/2021/03/why-programmers-shouldnt-learn-theory.html
I am more knowledgable about mechanized theorem proving, since my department has multiple labs who work in this area and I’ve taken a few of their seminars. But asking about time/space of verification really just makes sense for the automated part. I attended CAV in 2015 and went to a few model checking talks at ICSE 2016, and more recently talked to a friend on AWS’s verification team about what some people there are doing with CBMC. Okay, and I guess I talked to someone who used to do model checking on train systems in France just two days ago. Outside of that exposure, I am super not-up-to-date with what’s going on. But I’d still expect massive breakthroughs to make the news rounds over to my corner of academia, so I’ll give my sense of the status quo.
Explicit state enumeration can crush programs with millions or billions of states, while symbolic model checking routinely handles $10^100$ states.
Those are both very small numbers. To go bigger, you need induction or abstraction, something fully automated methods are still bad at.
Yes, we can handle exponentially large things, but the exponential still wins. There’s a saying of SAT solvers “either it runs instantly or it takes forever.” I believe this is less true of model checking, though still true. (Also, many model checkers use SAT.)
If you want to model check something, either you check a very small program like a device driver, or you develop some kind of abstract model and check that instead.
Hmm. It looks like my reply notifications are getting batched now. I didn’t realize I’d set that up.
I’ve reordered some of this, because the latter parts get into the weeds a lot and may not be worth reading. I advise that anybody who gets bored stop reading there, because it’s probably not going to get more interesting.
For background, I haven’t been doing security hands-on for the last few years, but I did it full time for about 25 years before that, and I still watch the space. I started out long enough ago that “cyber” sets my teeth on edge...
State of practice in defense
Well yes but… not that much less. A lot of what’s done is, shall we say, “aspirational”, and a lot of the rest works much better to reduce the rate of damage from human adversaries than it would to resist a total takeover from AGI that had decided a given system was on the critical path for its success.
Today, if you’re a real-world organization with a significant IT infrastructure, and you hire a skilled human penetration tester (or team), and you give them a reasonable amount of time, and you don’t set artificially limiting “rules of engagement”, they will almost always reach whatever objectives you set, by default full administrative access to all or almost all of your systems.
All the changes over the last decade or two have not, in the end, appreciably reduced the chance of being “owned” by (sufficiently motivated) humans. The cost will be higher, but the probability of success is still close to one if somebody is willing to pay that cost.
And the cost isn’t really prohibitive. It’s more the kind of cost increase that redirects an attacker to another victim than the kind that convinces them to go out of the penetrating business altogether. There are just too many possible ways to get in. The limiting factors are attacker time, expertise, and motivation.
In the AGI scenario we’re talking about, all three of those limits presumably get a lot less restrictive.
… and that’s on the “big machines” with professional admins. Out on some random IOT device, you’re lucky if you have process memory isolation (or the hardware to support it), let alone SELinux. And you probably don’t get regular updates either, especially after a couple of years.
The state of practice in the legacy systems is still really, really bad. I don’t think there’s been any qualitative change, nor any reason to expect qualitative change. And administrative configuration will probably be outside the domain of formal verification, at least at first.
So if legacy systems or bad configurations can be used as a path around whatever is formally proved, they will be.
Attack surface
There’s no “squishy center” assumption in the idea of attack surface. You have a certain attack surface for each compartment. Taking one compartment exposes attack surfaces of other compartments. You find a route through a series of compartments to some objective or objectives. Often a shorter route than one might think.
You do, however, get to try to route around anything you can’t break through. At any given time you have quite a lot of potentially productive attack surface available to you,. You can choose whatever next step you think has the best chance of getting you what you need most easily. And you usually get to try again even if you’re detected.
Kinds of defense in depth
Defense in depth means different things to different people.
The original military concept, from trench warfare, is actually a bad metaphor for computing security, where your enemy never “takes losses”, doesn’t have to cross “exposed terrain” in any meaningfully similar way, has almost no logistical needs, and isn’t generally delayed twice by the same terrain.
The closest analog to the military version is what “defense in depth” usually means to the operational security people in, say, an IT department. You put layers of barriers in place so that after breaching one layer, the attacker has to do new reconnaissance and planning before being able to move to the next. Almost all of the value of that is in delay, and it relies on assuming that the attacker needs human-scale amounts of time at each stage.
On the software development side, “defense in depth” in computing can also mean having more than one mechanism meant to outright stop your adversary, so that if they find a bug in one, they’re still totally stymied by another.
To do the first kind properly, you need to have measures that slow the adversary down “enough” at critical stages. To do the second kind properly, you have to have at least one measure that completely stops the adversary by itself; it’s just that you don’t know which measure it will be.
Defense in depth versus AGI
Both kinds of defense in depth suffer badly when your presumed adversary is a “canonical” AGI. It has superhuman speed, so even if you delay it it will still be moving faster than you can react. And it has a superhuman ability to find bugs, which means that it’s likely to find ways through all of your theoretically redundant protections.
I’m not modeling the AGI as magic here, just weakly superhuman. Even normal humans find ways into things all the time. If the AGI even manages to be significantly faster, without an absolute improvement in bug-finding, that’s probably enough.
Automated response won’t fix the speed problem, by the way. People have been selling that for ages, but they haven’t made it work. In fact, it can often be exploited for an attacker’s own ends. If it were built, automated response against an AGI “as smart as a human, but faster” would of course itself probably have to be “as smart as a human, but faster”… which isn’t available until after you need it.
More on the attacker’s task
As of today, for the vast majority of machines, YES.
If you’re not checkmated, you are at least at the point where a graceful player would resign.
My “game over” objectives in attacking a machine may not even require me to get root at all, let alone defeat SELinux or whatever. If I’m attacking your database server because I want the data, and I compromise the database itself, I don’t need root. Running an application in a VM may keep me from getting total control of the hardware it’s running on, but if the application itself is my target, then I don’t care. People forget that too easily.
Even if you assume that I do need root, if I have access to a large supply of zero-days, then it’s game over if I can even get a shell. Based on experience with them being found, kernel zero-days always exist. Those give you root and evade SELinux at the same time. The adversary we’re worried about has to be presumed able to find them very frequently indeed.
You can have all of the intra-machine security controls you want, and it won’t matter if they’re all enforced using the same few mechanisms, which it turns out I can evade. Once I know how to penetrate the kernels you actually use and escape the hypervisors you actually use, it doesn’t matter what you put on top of them. Not that I’m saying those are the only things you can attack; they definitely are not. But owning them obviates almost all your other layers of defense.
Between machines, you can have all the network isolation rules you want, and it won’t matter if I can control the devices enforcing them. Dedicated firewalls actually have really bad security track records if you look at them as targets in themselves. You can limit the available network services, but that doesn’t matter if the adversary can penetrate the ones you do use.
Add another couple of things for me to break, and I still won’t have to break everything. There’s not necessarily so much depth there as it might seem.
… and I don’t even need any of that if the system is misconfigured, which it often is. On most real systems, even in pretty well run corporate or government environments, I can expect to find a credential lying around in an unprotected file, or a sudo configuration that lets me run a command that’s too trusting, or an application that “has to have” access to something that lets me move toward my ultimate goal, or whatever.
I would expect to personally be able to get total control of almost any actual production machine where I could get a shell, if I were willing to concentrate on it for weeks or months, and to be able to hop from there to other parts of the network if it belonged to a large organization. I would not expect SELinux to stop me. I could fail, but that would not be the most probable case. I’m far from an up-to-date professional-grade penetration specialist, and even further from a superhuman AGI. Either of those would be a lot faster than I would be and have a higher chance of success.
Today, I saw a Twitter thread that gives you an idea what things are like on the ground out there: https://twitter.com/DebugPrivilege/status/1538567999501770754
I expect it to stay that way for decades, for much the same reasons that I don’t expect to be able to deploy formal methods quickly everywhere.
We’re always talking about attacking the weakest targets (that will help our objectives), never the strongest. Including, of course, the targets with the worst human management.
SELinux
So on the example of SELinux (and similar stuff like AppArmor): Many, probably still most, sysadmins still disable SELinux, even when it’s on by default in their distributions, because they find it inconvenient and/or don’t understand it. SELinux policies are themselves complicated, ad-hoc lists of constraints on the behavior of incompletely understood programs, and aren’t always expressive enough to distinguish things that must be allowed from things that give away dangerous amounts of control. And SELinux is, of course, implemented inside an enormous ultra-complicated monolithic kernel, with which SELinux itself has an almost fractal interface. And, in the end, the attacker may be able to reach the objective within the intended constraints set by SELinux anyway.
Really appreciate this informative and well-written answer. Nice to hear from someone on the ground about SELinux instead of the NSA’s own presentations.
I phrased my question about time and space badly. I was interested in proving the time and space behavior of the software “under scrutiny”, not in the resource consumption of the verification systems themsvelves.
It would be nice to be able to prove things like “this program will never allocate more than X memory”, or “this service will always respond to any given request within Y time”.
LOL!
I know a few people who have worked in this area. Jan Hoffman and Peng Gong have worked on automatically inferring complexity. Tristan Knoth has gone the other way, including resource bounds in specs for program synthesis. There’s a guy who did an MIT Ph. D. on building an operating system in Go, and as part of it needed an analyzer that can upper-bound the memory consumption of a system call. I met someone at CU Boulder working under Bor-Yuh Evan Chang who was also doing static analysis of memory usage, but I forget whom.
So, those are some things that were going on. About all of these are 5+ years old, and I have no more recent updates. I’ve gone to one of Peng’s talks and read none of these papers.