Thanks for the response. My thoughts at this point are that:
We seem to have differing views of how to best do what you call “reference class tennis” and how useful it can be. I’ll probably be writing about my views more in the future.
I find it plausible that AGI will have to follow a substantially different approach from “normal” software. But I’m not clear on the specifics of what SI believes those differences will be and why they point to the “proving safety/usefulness before running” approach over the “tool” approach.
We seem to have differing views of how frequently today’s software can be made comprehensible via interfaces. For example, my intuition is that the people who worked on the Netflix Prize algorithm had good interfaces for understanding “why” it recommends what it does, and used these to refine it. I may further investigate this matter (casually, not as a high priority); on SI’s end, it might be helpful (from my perspective) to provide detailed examples of existing algorithms for which the “tool” approach to development didn’t work and something closer to “proving safety/usefulness up front” was necessary.
Canonical examples of software development that emphasizes “proving safety/usefulness before running” over the “tool” approach are cryptographic libraries and NASA space shuttle navigation software.
At the time of writing this comment, there was a recent furor over software called CryptoCat, which did not warn users clearly enough that it had not been properly vetted by cryptographers and should therefore have been assumed insecure. Conventional wisdom and repeated warnings from the security community hold that cryptography is extremely difficult to do properly, and that attempting to create your own can have catastrophic consequences. A similar mindset and development process goes into space shuttle code.
It seems that the FAI approach to “proving safety/usefulness” is closer to the way cryptographic algorithms are developed than to the (seemingly) much faster “tool” approach, which is more akin to web development, where the stakes aren’t quite as high.
EDIT: I believe the “prove” approach still allows one to run snippets of code in isolation, but tends to shy away from running everything end-to-end until significant effort has gone into individual component testing.
The analogy with cryptography is an interesting one, because...
In cryptography, even after you’ve proven that a given encryption scheme is secure, and that proof has been centuply (100 times) checked by different researchers at different institutions, it might still end up being insecure, for many reasons.
Examples of reasons include:
The proof assumed mathematical integers/reals, of which computer integers and floating-point numbers are only an approximation (see the floating-point sketch below).
The proof assumed that the hardware the algorithm would be running on was reliable (e.g. that it provided a trustworthy source of randomness); see the randomness sketch below.
The proof assumed operations were mathematical abstractions that exist outside of time, and thus neglected side-channel attacks, which measure how long a physical, real-world CPU takes to execute the algorithm in order to infer what the algorithm did (and thus recover the private keys); see the timing sketch below.
The proof assumed the machine executing the algorithm was idealized in various ways, when in fact a CPU emits heat and other electromagnetic radiation, which can be detected and from which inferences can be drawn, etc.
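To make the first of these concrete, here is a toy Python sketch (my own illustration, not taken from any actual proof): a proof over mathematical reals may freely rely on addition being associative, a property that IEEE 754 floating-point arithmetic does not preserve.

```python
# Associativity, (a + b) + c == a + (b + c), holds for real numbers but not
# for IEEE 754 doubles, because each intermediate result is rounded.
a, b, c = 1e16, -1e16, 1.0

print((a + b) + c)                  # 1.0
print(a + (b + c))                  # 0.0 (the 1.0 is lost when rounded against 1e16)
print((a + b) + c == a + (b + c))   # False
```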
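For the randomness assumption, a hypothetical sketch of how an implementation can silently violate it by drawing key material from a non-cryptographic generator rather than the entropy source the proof presupposes:

```python
import random
import secrets

# The proof assumes keys are drawn uniformly from a true entropy source.
# random.getrandbits uses the Mersenne Twister PRNG: observing enough of its
# output lets an attacker reconstruct its state and predict future "keys".
weak_key = random.getrandbits(128)

# secrets draws from the operating system's CSPRNG, which is closer to the
# proof's assumption, though still only as good as the underlying hardware
# and OS entropy actually are.
strong_key = secrets.randbits(128)
```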
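And for the timing point, a minimal sketch (hypothetical function names, not taken from any real library) of how execution time can leak information the proof treats as perfectly hidden:

```python
import hmac

def leaky_equals(secret: bytes, guess: bytes) -> bool:
    """Byte-by-byte comparison that returns at the first mismatch.
    Its running time grows with the number of correct leading bytes,
    so an attacker who can time it can recover the secret incrementally."""
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False  # early exit: duration depends on the secret
    return True

def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    """Comparison whose duration does not depend on where the bytes differ."""
    return hmac.compare_digest(secret, guess)
```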