I agree that early electronics were buggy until we learned to build them reliably, and perhaps we can solve this for gradient-descent-based learning, though many are skeptical of that, since many of the problems have been shown to be pretty fundamental. I also agree that any system is inscrutable until you understand it. But unlike early electronics, no one understands these massive lists of numbers that produce text, and human brains can’t build them; they just program a process to grow them.
I can say from experience that no one “understands” complex software and hardware systems. Any little bug can take weeks to isolate and patch, and you end up in calls with multiple domain specialists. For rare nondeterministic bugs, you may never find the cause in the lifetime of the project.
This means you need to use black-box methods for testing and reliability analysis. You empirically measure how often the system fails; you can’t know how often it will fail just by looking at the design.
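A minimal sketch of what empirical measurement can look like, assuming the system under test can be wrapped as a function that returns True on success and False on failure. The names `measure_failure_rate`, `wilson_interval`, and `flaky_system` are hypothetical, and the Wilson score interval is just one common way to put error bars on an observed failure rate, not the only one.

```python
import math
import random


def measure_failure_rate(system, inputs):
    """Run the black-box `system` on every input and count failures.

    `system` is any callable that returns True on success, False on failure.
    Returns (failures, trials).
    """
    trials = 0
    failures = 0
    for x in inputs:
        trials += 1
        if not system(x):
            failures += 1
    return failures, trials


def wilson_interval(failures, trials, z=1.96):
    """Wilson score interval for the true failure probability.

    z=1.96 corresponds to roughly 95% confidence.
    """
    if trials <= 0:
        raise ValueError("need at least one trial")
    p_hat = failures / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p_hat + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials
                         + z2 / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the demo is repeatable

    def flaky_system(_x):
        # Hypothetical stand-in for a real system: fails about 2% of the time.
        return rng.random() >= 0.02

    failures, trials = measure_failure_rate(flaky_system, range(10_000))
    low, high = wilson_interval(failures, trials)
    print(f"{failures}/{trials} failures, ~95% interval: [{low:.4f}, {high:.4f}]")
```

Note that even zero observed failures only gives an upper bound on the failure rate, and that bound only tightens with more trials.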
These same methods apply mostly unchanged to AI. The only major difference I see with AI is the possibility of coordinated failure, where multiple AI systems conspire to all fail at the same time. This implies that AI alignment may ultimately be based around the simple idea of reducing complex AI systems that can fail in a coordinated way to simpler AI systems that fail independently. (Note this doesn’t mean it will be simple to do. See what humans have to do to control fire: just 3 ingredients you have to keep apart, yet you need all this infrastructure.)
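To make the independent-versus-coordinated distinction concrete, here is a toy calculation. It assumes n redundant systems where the overall outcome is bad only if all of them fail at once, and it models coordination with a single hypothetical parameter `c`: the probability of a shared cause that takes every system down together. Both the model and the function names are illustrative assumptions, not something from the comment above.

```python
def p_all_fail_independent(p, n):
    """Probability that all n systems fail when failures are independent.

    p is each system's individual failure probability.
    """
    return p ** n


def p_all_fail_common_cause(p, n, c):
    """Probability that all n systems fail under a simple common-cause model.

    With probability c a shared cause makes every system fail together.
    Otherwise each system fails independently with probability q, where q
    is chosen so that each system's overall failure probability is still p.
    """
    if not (0.0 <= c <= p < 1.0):
        raise ValueError("require 0 <= c <= p < 1")
    q = (p - c) / (1.0 - c)
    return c + (1.0 - c) * q ** n


if __name__ == "__main__":
    p, n = 0.01, 3
    print("independent:       ", p_all_fail_independent(p, n))
    print("partly coordinated:", p_all_fail_common_cause(p, n, c=0.001))
    print("fully coordinated: ", p_all_fail_common_cause(p, n, c=p))
```

Under these assumptions, redundancy helps enormously when failures are independent, while any shared cause puts a floor of roughly `c` under the joint failure probability no matter how many copies you add.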
This is the “pinnacle” of software engineering at present: rip your software apart, from one complex, coupled thing into a bunch of separate, simpler things that are each testable. A modern flavor of this is to make every element in a complex GUI use a separate copy of the JavaScript libraries.