JB: So when you imagine “seed AIs” that keep on improving themselves and eventually become smarter than us, how can you reasonably hope that they’ll avoid making truly spectacular mistakes? How can they learn really new stuff without a lot of risk?
EY: The best answer I can offer is that they can be conservative externally and deterministic internally.
In Part 2:
Eliezer never justifies why he wants determinism. It strikes me as a fairly bizarre requirement to impose. Or perhaps he means something different by determinism than does everyone else familiar with computers. Does he simply mean that he wants the hardware to be reliable?
What do you (and ‘everyone else familiar with computers’) mean by determinism?
A deterministic algorithm, if run twice with the same inputs, follows the same steps and produces the same outputs each time. A non-deterministic algorithm will not necessarily follow the same steps, and may not even generate the same result.
It has been part of the folklore since Dijkstra’s “A Discipline of Programming” that well-written non-deterministic programs may be even easier to understand and prove correct than their deterministic counterparts.
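To make the distinction above concrete, here is a minimal Python sketch (an editorial illustration, not something from the thread; the function names are invented): the first routine retraces the same steps on the same input every time, while the randomized variant may take different steps and return a different answer on each run.

```python
import random

def sum_squares_deterministic(xs):
    """Same input always follows the same steps and yields the same output."""
    total = 0
    for x in xs:  # fixed iteration order
        total += x * x
    return total

def estimate_sum_squares_randomized(xs, samples=3):
    """A randomized variant: the steps taken depend on random draws,
    and repeated runs may return different results for the same input."""
    picks = [random.choice(xs) for _ in range(samples)]
    return len(xs) * sum(p * p for p in picks) / samples

data = [1, 2, 3, 4]
print(sum_squares_deterministic(data))       # 30, on every run
print(estimate_sum_squares_randomized(data)) # varies from run to run
```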
Again, with the power of randomness.
What indeterminism actually does is make test-cases into an unreliable form of evidence.
I don’t think you said what you mean.
But that is pretty much beside the point, because you have provided no reasons why I should think that what you meant is true.
We are not talking about increasing confidence in a system by testing it. Certainly Eliezer is not talking about ‘test-cases’. We are talking about proofs of correctness.
indeterminism [...] theory that some actions are undetermined: the philosophical theory that human beings have free will and their actions are not always and completely determined by previous events.
Uh, no way! Indeterminism is best not defined in terms of humans.
I’m not sure what your point is here. Are you saying that ‘indeterminism’, in the sense of your substituted definition, is what you really meant when you wrote “What indeterminism actually does is make test-cases into an unreliable form of evidence”?
If so, what do you propose we do about it?
The intended idea was pretty simple: in a deterministic system, if you test it and it works, you know it will work under the same circumstances in the future; whereas in a system that lacks determinism, if you test it and it works, that doesn’t mean it will work in the future.
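As a toy illustration of that point (mine, assuming that “lacks determinism” here just means the output can vary across runs on identical inputs): a passing test of the deterministic function is evidence it will keep passing, while the same test of the randomized variant can pass now and fail later.

```python
import random

def add(a, b):
    # Deterministic: if the test passes once, it passes every time.
    return a + b

def add_flaky(a, b):
    # Non-deterministic: the same inputs occasionally give a different answer,
    # so a passing test is weak evidence about the next run.
    return a + b + (1 if random.random() < 0.01 else 0)

def pass_rate(fn, trials=1000):
    return sum(fn(2, 2) == 4 for _ in range(trials)) / trials

print(pass_rate(add))        # 1.0 on every run
print(pass_rate(add_flaky))  # about 0.99, and which trials fail changes per run
```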
The hardware and the software. Think of a provably correct compiler.
The main relevant paragraph in this interview is the one in part 2 whose first sentence is “The catastrophic sort of error, the sort you can’t recover from, is an error in modifying your own source code.”
Interesting fact: The recent paper “Finding and Understanding Bugs in C Compilers” found miscompilation bugs in all compilers tested except for one, CompCert, which was unique in that its optimizer was built on a machine-checked proof framework.
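For context, that paper’s method was randomized differential testing: generate a random C program, compile it with several compilers, and flag any disagreement in the resulting behaviour as a likely miscompilation. Below is a rough Python sketch of the idea only; the compiler names, flags, and source file are assumptions for illustration, not details taken from the paper, and it assumes a Unix-like system with those compilers installed.

```python
import os
import subprocess
import tempfile

def run_with(compiler, flags, source):
    """Compile `source` with one compiler and return the program's output."""
    exe = tempfile.mktemp()
    subprocess.run([compiler, *flags, source, "-o", exe], check=True)
    out = subprocess.run([exe], capture_output=True, text=True).stdout
    os.remove(exe)
    return out

def differential_test(source, compilers=(("gcc", ["-O2"]), ("clang", ["-O2"]))):
    """Flag any disagreement between compilers as a possible miscompilation."""
    outputs = {name: run_with(name, flags, source) for name, flags in compilers}
    if len(set(outputs.values())) > 1:
        print("Disagreement (possible miscompilation):", outputs)
    else:
        print("All compilers agree.")

# differential_test("random_program.c")  # hypothetical generated test case
```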
Yes, but I don’t see what relevance that paragraph has to his desire for ‘determinism’, unless he has somehow formed the impression that ‘non-deterministic’ means ‘error-prone’, or that it is impossible to formally prove the correctness of non-deterministic algorithms. In fact, hardware designs are routinely proven correct (ironically, using modal logic) even though the hardware being vetted is massively non-deterministic internally.
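The point that non-determinism does not block formal verification can be shown crudely: instead of testing one execution, check the property over every branch the system could take. Here is a toy Python sketch of that idea (my illustration; real hardware verification uses temporal-logic model checkers rather than this brute force).

```python
def successors(state):
    """A non-deterministic transition: from counter value n the system may
    either increment or reset, and we don't know which branch it will take."""
    return {state + 1, 0}

def always_holds(initial, invariant, depth):
    """Check that `invariant` holds on every state reachable within `depth`
    steps, over all possible non-deterministic branches."""
    frontier = {initial}
    for _ in range(depth):
        if not all(invariant(s) for s in frontier):
            return False
        frontier = {nxt for s in frontier for nxt in successors(s)}
    return all(invariant(s) for s in frontier)

# Property: the counter stays non-negative no matter which branches are taken.
print(always_holds(0, lambda s: s >= 0, depth=10))  # True
```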
From the context, I think what EY means is that the AI must be structured so that all changes to its source code can be proved safe with respect to the goal system before being implemented.
On the other hand, I’m not sure why EY calls that “deterministic” rather than using another adjective.
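Schematically, the structure being described might look like the following sketch (an editorial illustration only; the function names and the stand-in checker are invented and do not come from any actual proposal): a proposed change to the source is adopted only if a verifier first certifies it.

```python
def apply_if_proved_safe(current_code, proposed_code, proof, check_proof):
    """Adopt the proposed code only when `check_proof` certifies, from the
    supplied proof object, that the change preserves the goal-relevant
    properties; otherwise keep running the current code."""
    if check_proof(current_code, proposed_code, proof):
        return proposed_code
    return current_code

def toy_checker(old, new, proof):
    # Deliberately trivial stand-in: accepts any non-empty proof object.
    return proof is not None

print(apply_if_proved_safe("v1", "v2", proof=None, check_proof=toy_checker))   # v1: rejected
print(apply_if_proved_safe("v1", "v2", proof="...", check_proof=toy_checker))  # v2: accepted
```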
Does the “Worse Than Random” essay help to explain?
Not at all. That essay simply says that non-deterministic algorithms don’t perform better than deterministic ones (for some meanings of ‘non-deterministic algorithms’). But the claim that needs to be explained is how determinism helps to prevent “making truly spectacular mistakes”.
Right. No doubt he is thinking that he doesn’t want a cosmic ray hitting his friendly algorithm and turning it into an unfriendly one. That means robustness; in other words, error detection and correction. Determinism seems to be a reasonable approach to this, and it makes proving things about the results about as easy as possible.
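And for what the simplest version of error detection and correction looks like, a toy Python sketch (an editorial aside, not a proposal from the thread): keep redundant copies of a critical value and take a majority vote, which both masks a single flipped copy and reveals that corruption happened.

```python
from collections import Counter

def majority_vote(copies):
    """Return the value held by most copies; any disagreement reveals corruption."""
    value, count = Counter(copies).most_common(1)[0]
    corrupted = count != len(copies)
    return value, corrupted

copies = [0b1010, 0b1010, 0b1011]  # one copy hit by a bit-flip
value, corrupted = majority_vote(copies)
print(bin(value), "corruption detected:", corrupted)  # 0b1010 corruption detected: True
```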