I don’t think the plan is to hack CEV together in lisp and see what happens. Writing provably correct software is possible today, it’s just extremely time-consuming. Contrast this with our incomplete knowledge of physics, and lack of criteria for what constitutes “good enough” physical security.
A bad hardware call seems far more likely to me than a fatal programming slip-up in a heavily verified software system. For software, we have axiomatic systems. There is no comparable method for assessing physical security, nothing but our best guess at what’s possible in a universe we don’t understand. So I’d much rather stake the future of humanity on the consistency of ZFC or some such than on the efficacy of some “First Law of Not Crossing the Beams” scheme, even if the latter is just so darn clever that no human has yet thought up a counter-example.
Writing provably correct software is possible today, it’s just extremely time-consuming.
Can you name a documented nontrivial program with >1000 lines of code that ran correctly on the first try? What data exists on long computer programs that were not tested early on in their development?
A bad hardware call seems far more likely to me than a fatal programming slip-up in a heavily verified software system. For software, we have axiomatic systems. There is no comparable method for assessing physical security, nothing but our best guess at what’s possible in a universe we don’t understand. So I’d much rather stake the future of humanity on the consistency of ZFC or some such than on the efficacy of some “First Law of Not Crossing the Beams” scheme, even if the latter is just so darn clever that no human has yet thought up a counter-example.
A couple of points: provably correct software and external hardware measures are not mutually exclusive. And if there really is a race going on between those who are concerned with AI safety and those who aren’t, it likely makes sense for those who are concerned with safety to do a certain amount of corner-cutting.
How would one prove friendliness anyway? Does anyone have any idea of how to do this? Intuitively it seems to me that proving honesty would be considerably easier. If you can prove that an AI is honest and you’ve got a strong box for it, then you can modify its source code until it says the kind of thing you want in your conversations with it, and get somewhere close to friendliness. Maybe from there proving friendliness will be doable. it just seems to me that the greatest engineering accomplishment in history isn’t going to come about without a certain amount of experimentation and prototyping.
Can you name a documented nontrivial program with >1000 lines of code that ran correctly on the first try? What data exists on long computer programs that were not tested early on in their development?
Well, the software that runs the critical systems on the Space Shuttle is a common example. Yes, the components that make up the software probably underwent unit testing prior to launch, but testing (and debugging) made up only a very small fraction of the cost of assuring the software would not fail.
Correctness proofs and other formal techniques are used to verify all modern mass-market high-performance microprocessor designs, a la Intel and AMD. The techniques used to verify the designs above the level of abstraction of the logic gate are largely tranferable to software.
A lot of military systems use formal techniques to verify the software. Warplanes for example have used “fly-by-wire” for decades: in a “fly-by-wire” plane, loss of function of the fly-by-wire system causes the plane to drop out of the sky or immediately to go into a spin or something dire like that. I believe the fly-by-wire system of modern warplanes entails significant amounts of software (in addition to digital electronics), but I am not completely sure.
Just because most software-development efforts do not need and do not want to pay for formal methods does not mean that formal methods do not exist or that they have not been successfully used in ambitious efforts.
it seems to me that proving honesty would be considerably easier. If you can prove that an AI is honest and you’ve got a strong box for it, then you can modify its source code until it says the kind of thing you want in your conversations with it
These notions of honesty and conversation seem kind of anthropormophic. I agree that we want a transparent AI, something that we know what it’s doing and why, but the means by which such transparency could be achieved would probably not be very similar to what a human does when she’s trying to tell you the truth. I would actually guess that AI transparency would be much simpler than human honesty in some relevant sense. When you ask a human, “Well, why did you do this-and-such?” she’ll use introspection to magically come up with a loose tangle of thoughts which sort-of kind-of resemble a high-level explanation which is probably wrong, and then try to tell you about the experience using mere words which you probably won’t understand. Whereas one might imagine asking a cleanly-designed AI the same question, and getting back a detailed printout of what actually happened.
Traceback (most recent call last):
File "si-ai.py", line 5945434, in <module>
if Neuron_0X$a8f3==True: a8prop(0X$a8f3,234):
File "si-ai.py", line 5945989, in a8prop ...
And if there really is a race going on between those who are concerned with AI safety and those who aren’t, it likely makes sense for those who are concerned with safety to do a certain amount of corner-cutting.
This is a good argument against wasting a bunch of time on elaborate physical safeguards.
It might be cheaper in terms of resources to build a safeguard that is 50% reliable than to look over your code once more to have it go from being 99.9% reliable to 99.95% reliable.
If AI research is anything like research in certain other fields, there’s a significant speed benefit to be gained from being able to experiment and prototype.
Re: “Provability is not about setting a standard that is too high, it is about knowing what you are doing—like, at all.”
Normally, in the context of writing computer programs, that concept refers to the idea of “provable correctness”—which means something rather different. It refers to the mathematical concept of proof—and not to what you are talking about.
FWIW, the idea that we’ll be able to “prove” very much of interest about the long-term consequences of the behaviour of superintelligent machines on humanity strikes me as wrong-headed.
That’s planet-scale social science you are talking about there—and look at how rubbish we are at proving anything about complex computer programs.
Since trying to prove things about your programs probably slows down development dramatically, those with good intentions would be well-advised NOT to employ such strategies—or else they will simply be sidelined by those who are less paranoid.
FWIW, the idea that we’ll be able to “prove” very much of interest about the long-term consequences of the behaviour of superintelligent machines on humanity strikes me as wrong-headed.
The idea is to prove properties of actions, not consequences. From Knowability of FAI:
You can try to prove a theorem along the lines of: “Providing that the transistors in this computer chip behave the way they’re supposed to, the AI that runs on this chip will always try to be Friendly.” You’re going to prove a statement about the search the AI carries out to find its actions. Metaphorically speaking, you’re going to prove that the AI will always, to the best of its knowledge, seek to move little old ladies to the other side of the street and avoid the deaths of nuns. To prove this formally, you would have to precisely define “try to be Friendly”: the complete criterion that the AI uses to choose among its actions—including how the AI learns a model of reality from experience, and how the AI identifies the goal-valent aspects of the reality it learns to model.
Once you’ve formulated this precise definition, you still can’t prove an absolute certainty that the AI will be Friendly in the real world, because a series of cosmic rays could still hit all of the transistors at exactly the wrong time to overwrite the entire program with an evil AI. Or Descartes’s infinitely powerful deceiving demon could have fooled you into thinking that there was a computer in front of you, when in fact it’s a hydrogen bomb. Or the Dark Lords of the Matrix could reach into the computer simulation that is our world, and replace the AI with Cthulhu. What you can prove with mathematical certitude is that if all the transistors in the chip work correctly, the AI “will always try to be Friendly”—after you’ve given “try to be Friendly” a precise definition in terms of how the AI learns a model of the world, identifies the important things in it, and chooses between actions, these all being events that happen inside the computer chip.
It doesn’t make the problem any less ridiculous or intractable. I don’t see any good reason for thinking such an approach is called for. No machine intelligence project I have heard of has ever done anything remotely like this—and yet such projects produce results that are typically helpful and benign.
Proofs can get complicated—and can be wrong. Extensive unit testing is a more practical and realistic approach for most programming problems. The virtues of such testing have become widely appreciated in recent years.
I don’t follow your logic. Provable correctness is an esoteric field, applicable only to the most security sensitive applications. Machine intelligence is not usually such a field.
Since provability is of little practical significance for programmers, I don’t see how it follows that an understanding of all this means that it should be allocated more funding. The most obvious reaction is to forget about provable correctness—and to make sure that other people don’t waste too much of their precious time on something so irrelevant and unhelpful.
Provable correctness is esoteric and not very useful right now, but how do we know it can’t become much more useful and be relevant to many new things including FAI implementations? I’m aware of the fundamental results such as Rice’s theorem regarding what can and can’t be proven correct, but does it follow that there’s nothing for formal verification to contribute?
I agree that most likely, non-100%-certainty code checking is going to remain the only practical approach. We’re probably going to need some narrow AI to help us there, too.
Well, we don’t know for sure—since predicting the future it difficult. However, looking at the current situation, artificial intelligence programs are some of the most difficult things to apply formal proof to—since they are so complicated.
It is not my position that “there’s nothing for formal verification to contribute”. We have some code verification today—in Java bytecode programs—for example. It is
just a considerable distance below the top-level behaviour of the system.
Also, one of the routes most likely to get to the goal first involves reinforcement learning—which results in the internal representation and goal structure being written by machines—a notorious way of producing incomprehensible code.
What we can program has always been far ahead of what we can prove about our programs. Proofs may advance, but programming skills seem likely to advance further. We may not see exactly the same lag forever, but there will probably be quite a lag for quite a while—and intelligent machines are probably not too far off now.
Fortunately, provable correctness is not only not very useful, it is also not in much demand. Most computer programs don’t need to be perfect—good enough is quite acceptable. When competing with the bug-ridden human brain, that is especially likely to be true.
You only need to use “provable correctness” for highly security-critical applications. The only people considering it are the most paranoid researchers.
We have some code verification today—in Java bytecode programs—for example.
The properties that can be proved by the Java bytecode verifier are far weaker than the properties that can be proved by the Java source-to-bytecode compiler. If everyone were willing to distribute source code, there would be no need for a bytecode verifier in Java. So this isn’t a shining example of the usefulness of machine code verification, unfortunately.
What we can program has always been far ahead of what we can prove about our programs. Proofs may advance, but programming skills seem likely to advance further.
The power of our verification tools varies greatly with the programming style chosen. (Compare proving things about Haskell programs, to proving similar things about C code.) A research direction I’m personally interested in is this: how can we construct programming styles (languages, libraries, etc) that allow much greater verification power while at the same time not being too constraining for human programmers?
Re: “If everyone were willing to distribute source code”
Right—but that premise is incorrect. The point of having bytecode is to have a condensed format that is suitable for distribution and direct execution.
The example also illustrates the connection between provability and security.
Bytecode is not significantly more condensed than appropriately transformed sourcecode (strip comments and whitespace and compress). And the cost of compiling Java source is low; it’s not a problem except for low-end embedded platforms.
From a technical POV, source is better suited than bytecode for distribution, and indeed for everything. Not only can you prove stronger things about it than you can about bytecode, source-level compatibility is more resilient to library changes than bytecode-level ABI compatibility.
The only reason people don’t always distribute Java source is because they don’t want end-users to easily read and modify it. (Which is mostly irrational, given the quality of bytecode-to-source decomplation tools.) And this concern isn’t very relevant to issues of AI development and verification.
People actually ship bytecode—rather than compressed source code—because the runtimes expect bytecode, and compilation before execution is a time-consuming additional step.
The “proving stronger things” claim seems highly dubious. You can decomplie compiled bytecode into source code, and what comes out is of reasonable quality—unless the bytecode was compressed after compilation. Excepting comments and whitespace, these two formats contain very similar information.
People actually ship bytecode—rather than compressed source code—because the runtimes expect bytecode, and compilation before execution is a time-consuming additional step.
This is a circular argument. The compiler is small compared to the runtime and standard library, and could easily be included in the requirements for a JRE. ETA: one reason Java was designed to run bytecode is that business-types are afraid of distributing source code to end users, and were much more afraid before the relatively recent rise to fame of from-source languages like Python.
(Yes, the original Java was designed to run on very constrained platforms, but that has been changed long since. By Java 1.2, requiring JREs to compile code before running it and treating bytecode as a cache would not have seemed very onerous.)
Excepting comments and whitespace, these two formats contain very similar information.
Decompilation works as well as it does only because compilers store extra information in the bytecode. This information (generics being an infamous example) isn’t a part of the bytecode spec and can’t be used by the runtime—it’s only there to help decompilation, debugging etc. So while it’s strictly true that the bytecode format can contain all the information you want, in practice this is little different from shipping stripped-down bytecode that doesn’t contain that extra info and also the sourcecode.
You appear to have mis-read my argument before labeling it as circular. I didn’t say a compiler was too large for a JRE. I said compilation was a time-consuming additional step.
Your bytecode history seems revisionist. Java actually has bytecode for many reasons—including size issues and embedded applications:
“Java bytecodes are designed to be easy to interpret on any machine, or to dynamically translate into native machine code if required by performance demands.”
I agree that bytecode itself is useful for performance. The idea of verifying it instead of verifying the source code is what I’m arguing against. What I was suggesting is to always distribute the code, verify that on installation, and store trusted bytecode in a cache for performance.
I expect that—if you get a reply—you will hear claims that compilation to bytecode on the client would damage start-up performance, that compressed source code is less compact than compressed bytecode, hindering applet distribution, and claims of additional burdens on embedded environments.
You would still need a bytecode verifier for backwards compatibility—so that would be two verification systems to maintain.
The main reason the JRE won’t be changed today is that a lot of the code running on it isn’t written in Java. And that’s a good reason.
I was merely saying that for a single project needing the best possible code verification assistance, it was more fruitful to look at language-level tools than at bytecode-level.
ETA: one reason Java was designed to run bytecode is that business-types are afraid of distributing source code to end users, and were much more afraid before the relatively recent rise to fame of from-source languages like Python.
If AI research is anything like research in certain other fields, there’s a significant speed benefit to be gained from being able to experiment and prototype.
Yes, you can learn much more from witnessing one failure than from dozens of thought experiments. But things change when a failure has a significant chance of exterminating the human race.
I don’t think the plan is to hack CEV together in lisp and see what happens. Writing provably correct software is possible today, it’s just extremely time-consuming. Contrast this with our incomplete knowledge of physics, and lack of criteria for what constitutes “good enough” physical security.
A bad hardware call seems far more likely to me than a fatal programming slip-up in a heavily verified software system. For software, we have axiomatic systems. There is no comparable method for assessing physical security, nothing but our best guess at what’s possible in a universe we don’t understand. So I’d much rather stake the future of humanity on the consistency of ZFC or some such than on the efficacy of some “First Law of Not Crossing the Beams” scheme, even if the latter is just so darn clever that no human has yet thought up a counter-example.
Can you name a documented nontrivial program with >1000 lines of code that ran correctly on the first try? What data exists on long computer programs that were not tested early on in their development?
A couple of points: provably correct software and external hardware measures are not mutually exclusive. And if there really is a race going on between those who are concerned with AI safety and those who aren’t, it likely makes sense for those who are concerned with safety to do a certain amount of corner-cutting.
How would one prove friendliness anyway? Does anyone have any idea of how to do this? Intuitively it seems to me that proving honesty would be considerably easier. If you can prove that an AI is honest and you’ve got a strong box for it, then you can modify its source code until it says the kind of thing you want in your conversations with it, and get somewhere close to friendliness. Maybe from there proving friendliness will be doable. it just seems to me that the greatest engineering accomplishment in history isn’t going to come about without a certain amount of experimentation and prototyping.
Well, the software that runs the critical systems on the Space Shuttle is a common example. Yes, the components that make up the software probably underwent unit testing prior to launch, but testing (and debugging) made up only a very small fraction of the cost of assuring the software would not fail.
Correctness proofs and other formal techniques are used to verify all modern mass-market high-performance microprocessor designs, a la Intel and AMD. The techniques used to verify the designs above the level of abstraction of the logic gate are largely tranferable to software.
A lot of military systems use formal techniques to verify the software. Warplanes for example have used “fly-by-wire” for decades: in a “fly-by-wire” plane, loss of function of the fly-by-wire system causes the plane to drop out of the sky or immediately to go into a spin or something dire like that. I believe the fly-by-wire system of modern warplanes entails significant amounts of software (in addition to digital electronics), but I am not completely sure.
Just because most software-development efforts do not need and do not want to pay for formal methods does not mean that formal methods do not exist or that they have not been successfully used in ambitious efforts.
Thanks.
These notions of honesty and conversation seem kind of anthropormophic. I agree that we want a transparent AI, something that we know what it’s doing and why, but the means by which such transparency could be achieved would probably not be very similar to what a human does when she’s trying to tell you the truth. I would actually guess that AI transparency would be much simpler than human honesty in some relevant sense. When you ask a human, “Well, why did you do this-and-such?” she’ll use introspection to magically come up with a loose tangle of thoughts which sort-of kind-of resemble a high-level explanation which is probably wrong, and then try to tell you about the experience using mere words which you probably won’t understand. Whereas one might imagine asking a cleanly-designed AI the same question, and getting back a detailed printout of what actually happened.
This is a good argument against wasting a bunch of time on elaborate physical safeguards.
It might be cheaper in terms of resources to build a safeguard that is 50% reliable than to look over your code once more to have it go from being 99.9% reliable to 99.95% reliable.
If AI research is anything like research in certain other fields, there’s a significant speed benefit to be gained from being able to experiment and prototype.
You are missing the point of “proving Friendliness”. 50% would be awesome. See my old reply to Shane Legg for more.
Re: “Provability is not about setting a standard that is too high, it is about knowing what you are doing—like, at all.”
Normally, in the context of writing computer programs, that concept refers to the idea of “provable correctness”—which means something rather different. It refers to the mathematical concept of proof—and not to what you are talking about.
Great comment, especially:
I thought I was reading my own writing for a second there.
No, there’s no certainty. But if you prove an AI is Friendly, it might work. If you don’t prove it Friendly, it has no chance at all of working.
FWIW, the idea that we’ll be able to “prove” very much of interest about the long-term consequences of the behaviour of superintelligent machines on humanity strikes me as wrong-headed.
That’s planet-scale social science you are talking about there—and look at how rubbish we are at proving anything about complex computer programs.
Since trying to prove things about your programs probably slows down development dramatically, those with good intentions would be well-advised NOT to employ such strategies—or else they will simply be sidelined by those who are less paranoid.
The idea is to prove properties of actions, not consequences. From Knowability of FAI:
It doesn’t make the problem any less ridiculous or intractable. I don’t see any good reason for thinking such an approach is called for. No machine intelligence project I have heard of has ever done anything remotely like this—and yet such projects produce results that are typically helpful and benign.
Proofs can get complicated—and can be wrong. Extensive unit testing is a more practical and realistic approach for most programming problems. The virtues of such testing have become widely appreciated in recent years.
Therefore we should divert some resources from FAI theory to better tools for proving code correct.
I don’t follow your logic. Provable correctness is an esoteric field, applicable only to the most security sensitive applications. Machine intelligence is not usually such a field.
Since provability is of little practical significance for programmers, I don’t see how it follows that an understanding of all this means that it should be allocated more funding. The most obvious reaction is to forget about provable correctness—and to make sure that other people don’t waste too much of their precious time on something so irrelevant and unhelpful.
Provable correctness is esoteric and not very useful right now, but how do we know it can’t become much more useful and be relevant to many new things including FAI implementations? I’m aware of the fundamental results such as Rice’s theorem regarding what can and can’t be proven correct, but does it follow that there’s nothing for formal verification to contribute?
I agree that most likely, non-100%-certainty code checking is going to remain the only practical approach. We’re probably going to need some narrow AI to help us there, too.
AI is only marginally code. You need to prove that the theory is good first.
While I’m thinking of it: would an analogy to cryptology be useful at this juncture?
Well, we don’t know for sure—since predicting the future it difficult. However, looking at the current situation, artificial intelligence programs are some of the most difficult things to apply formal proof to—since they are so complicated.
It is not my position that “there’s nothing for formal verification to contribute”. We have some code verification today—in Java bytecode programs—for example. It is just a considerable distance below the top-level behaviour of the system.
Also, one of the routes most likely to get to the goal first involves reinforcement learning—which results in the internal representation and goal structure being written by machines—a notorious way of producing incomprehensible code.
What we can program has always been far ahead of what we can prove about our programs. Proofs may advance, but programming skills seem likely to advance further. We may not see exactly the same lag forever, but there will probably be quite a lag for quite a while—and intelligent machines are probably not too far off now.
Fortunately, provable correctness is not only not very useful, it is also not in much demand. Most computer programs don’t need to be perfect—good enough is quite acceptable. When competing with the bug-ridden human brain, that is especially likely to be true.
You only need to use “provable correctness” for highly security-critical applications. The only people considering it are the most paranoid researchers.
The properties that can be proved by the Java bytecode verifier are far weaker than the properties that can be proved by the Java source-to-bytecode compiler. If everyone were willing to distribute source code, there would be no need for a bytecode verifier in Java. So this isn’t a shining example of the usefulness of machine code verification, unfortunately.
The power of our verification tools varies greatly with the programming style chosen. (Compare proving things about Haskell programs, to proving similar things about C code.) A research direction I’m personally interested in is this: how can we construct programming styles (languages, libraries, etc) that allow much greater verification power while at the same time not being too constraining for human programmers?
Re: “If everyone were willing to distribute source code”
Right—but that premise is incorrect. The point of having bytecode is to have a condensed format that is suitable for distribution and direct execution.
The example also illustrates the connection between provability and security.
Bytecode is not significantly more condensed than appropriately transformed sourcecode (strip comments and whitespace and compress). And the cost of compiling Java source is low; it’s not a problem except for low-end embedded platforms.
From a technical POV, source is better suited than bytecode for distribution, and indeed for everything. Not only can you prove stronger things about it than you can about bytecode, source-level compatibility is more resilient to library changes than bytecode-level ABI compatibility.
The only reason people don’t always distribute Java source is because they don’t want end-users to easily read and modify it. (Which is mostly irrational, given the quality of bytecode-to-source decomplation tools.) And this concern isn’t very relevant to issues of AI development and verification.
People actually ship bytecode—rather than compressed source code—because the runtimes expect bytecode, and compilation before execution is a time-consuming additional step.
The “proving stronger things” claim seems highly dubious. You can decomplie compiled bytecode into source code, and what comes out is of reasonable quality—unless the bytecode was compressed after compilation. Excepting comments and whitespace, these two formats contain very similar information.
This is a circular argument. The compiler is small compared to the runtime and standard library, and could easily be included in the requirements for a JRE. ETA: one reason Java was designed to run bytecode is that business-types are afraid of distributing source code to end users, and were much more afraid before the relatively recent rise to fame of from-source languages like Python.
(Yes, the original Java was designed to run on very constrained platforms, but that has been changed long since. By Java 1.2, requiring JREs to compile code before running it and treating bytecode as a cache would not have seemed very onerous.)
Decompilation works as well as it does only because compilers store extra information in the bytecode. This information (generics being an infamous example) isn’t a part of the bytecode spec and can’t be used by the runtime—it’s only there to help decompilation, debugging etc. So while it’s strictly true that the bytecode format can contain all the information you want, in practice this is little different from shipping stripped-down bytecode that doesn’t contain that extra info and also the sourcecode.
You appear to have mis-read my argument before labeling it as circular. I didn’t say a compiler was too large for a JRE. I said compilation was a time-consuming additional step.
Your bytecode history seems revisionist. Java actually has bytecode for many reasons—including size issues and embedded applications:
“Java bytecodes are designed to be easy to interpret on any machine, or to dynamically translate into native machine code if required by performance demands.”
http://java.sun.com/docs/white/langenv/Neutral.doc1.html
You seem to agree—since you have now revised your comment.
I agree that bytecode itself is useful for performance. The idea of verifying it instead of verifying the source code is what I’m arguing against. What I was suggesting is to always distribute the code, verify that on installation, and store trusted bytecode in a cache for performance.
Right—well, speak to Sun/Oracle.
I expect that—if you get a reply—you will hear claims that compilation to bytecode on the client would damage start-up performance, that compressed source code is less compact than compressed bytecode, hindering applet distribution, and claims of additional burdens on embedded environments.
You would still need a bytecode verifier for backwards compatibility—so that would be two verification systems to maintain.
The main reason the JRE won’t be changed today is that a lot of the code running on it isn’t written in Java. And that’s a good reason.
I was merely saying that for a single project needing the best possible code verification assistance, it was more fruitful to look at language-level tools than at bytecode-level.
Yes, you can learn much more from witnessing one failure than from dozens of thought experiments. But things change when a failure has a significant chance of exterminating the human race.
So it depends on how good a box you can make for how little money.