Maybe I’m missing the point, but it seems that good old-fashioned formal verification would largely mitigate the issue of bugs in the classical sense, which seems to be what MIRI already intends to use. The greatest problem lies outside of the issue of verifying that the program functions in the obvious sense; it lies in ensuring that the specification is referring to that which we want it to refer: the full complement of human values. This is a philosophical and scientific problem that is likely beyond the purview of formal verification or best programming practices.
I think you may indeed be missing the point, which is not
that formal bug-minimizing software development techniques will suffice to produce safe AI
but
that producing safe AI requires (what is currently) extraordinary success at “getting things right first time”, so that
it would be beneficial to get better at doing that, and to foster a culture of trying to do it,
and practice at making bug-free software might be an effective way to do so
not least because value-specification and programming are sufficiently parallel that some of the same techniques or patterns of thought might be useful in both.
However, what I take to be your main point—that value-specification and software development are in fact not terribly similar, so that practice at one may not help much with the other—is as applicable either way.
I agree that there are some important disanalogies between the two problems. I thought software development was an unusually good domain to start trying to learn the general skill, mostly because it offers easy-to-generate complex challenges where it’s simple to assess success.
that producing safe AI requires (what is currently) extraordinary success at “getting things right first time”, so that
See my reply above—this line of thought is fundamentally mistaken. Simulation testing is far more of an effective general solution than formal verification.
Just to clarify: I was stating the thesis of the OP, not asserting it. Neither am I now denying it. (I don’t find myself altogether convinced either by the OP or by your arguments for why the OP is “probably completely mistaken”.)
Maybe I’m missing the point, but it seems that good old-fashioned formal verification would largely mitigate the issue of bugs in the classical sense, which seems to be what MIRI already intends to use. The greatest problem lies outside of the issue of verifying that the program functions in the obvious sense; it lies in ensuring that the specification is referring to that which we want it to refer: the full complement of human values. This is a philosophical and scientific problem that is likely beyond the purview of formal verification or best programming practices.
I think you may indeed be missing the point, which is not
that formal bug-minimizing software development techniques will suffice to produce safe AI
but
that producing safe AI requires (what is currently) extraordinary success at “getting things right first time”, so that
it would be beneficial to get better at doing that, and to foster a culture of trying to do it,
and practice at making bug-free software might be an effective way to do so
not least because value-specification and programming are sufficiently parallel that some of the same techniques or patterns of thought might be useful in both.
However, what I take to be your main point—that value-specification and software development are in fact not terribly similar, so that practice at one may not help much with the other—is as applicable either way.
Yes, gjm’s summary is right.
I agree that there are some important disanalogies between the two problems. I thought software development was an unusually good domain to start trying to learn the general skill, mostly because it offers easy-to-generate complex challenges where it’s simple to assess success.
See my reply above—this line of thought is fundamentally mistaken. Simulation testing is far more of an effective general solution than formal verification.
Just to clarify: I was stating the thesis of the OP, not asserting it. Neither am I now denying it. (I don’t find myself altogether convinced either by the OP or by your arguments for why the OP is “probably completely mistaken”.)