I’d be shocked if there were anyone to whom it was mechanistically transparent how a laptop loads a website, down to the gates in the laptop.
So, I don’t think I’m saying that you have to mechanistically understand how every single gate works—rather, that you should be able to understand intermediate-level sub-systems and how they combine to produce the functionality of the laptop. The understanding of the intermediate-level sub-systems has to be pretty complete, but probably need not be totally complete—in the laptop case, you can just model a uniform random error rate and you’ll be basically right, and I imagine there should be something analogous for neural networks. Of course, you need somebody to be in charge of understanding the neurons in order to build up to your understanding of the intermediate-level sub-systems, but it doesn’t seem to me that there needs to be any single person who understands all the neurons entirely—or indeed any single person who understands all the intermediate-level sub-systems entirely.
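(To make the “uniform random error rate” point concrete, here is a minimal sketch in Python. The circuit, the names like `full_adder`, and the error rate `EPS` are all illustrative assumptions, not anything from this discussion: each NAND gate occasionally flips its output, yet a higher-level analysis can still treat the sub-system as an ideal adder plus a small, roughly uniform failure probability, without tracking individual gates.)

```python
import random

EPS = 1e-3  # assumed per-gate flip probability (illustrative, not a measured figure)

def nand(a, b, eps=EPS):
    """Ideal NAND gate, except the output bit flips with probability eps."""
    out = 1 - (a & b)
    if random.random() < eps:
        out ^= 1
    return out

def full_adder(a, b, cin, eps=EPS):
    """A 1-bit full adder built from 9 noisy NAND gates."""
    t1 = nand(a, b, eps)
    t2 = nand(a, t1, eps)
    t3 = nand(b, t1, eps)
    s1 = nand(t2, t3, eps)    # s1 = a XOR b
    t4 = nand(s1, cin, eps)
    t5 = nand(s1, t4, eps)
    t6 = nand(cin, t4, eps)
    s = nand(t5, t6, eps)     # sum bit = s1 XOR cin
    cout = nand(t1, t4, eps)  # carry-out = (a AND b) OR (s1 AND cin)
    return s, cout

def ideal_adder(a, b, cin):
    """The intermediate-level abstraction: what the sub-system is 'for'."""
    total = a + b + cin
    return total & 1, total >> 1

# A higher-level analysis can model this sub-system as "ideal_adder, failing
# with some small, roughly uniform probability" instead of tracking each gate.
random.seed(0)
trials = 100_000
failures = 0
for _ in range(trials):
    a, b, cin = (random.randint(0, 1) for _ in range(3))
    if full_adder(a, b, cin) != ideal_adder(a, b, cin):
        failures += 1
print(f"observed failure rate: {failures / trials:.5f} "
      f"(naive upper bound: 9 * EPS = {9 * EPS:.5f})")
```

The analogous move for neural networks would be validating that an interpreted intermediate-level description matches the network’s actual behavior up to a small, well-characterized error rate.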
I think I should not have used the laptop example; it’s not communicating what I meant it to. I was trying to convey “mechanistic transparency is hard”, not “mechanistic transparency requires a single person to understand everything”.
I guess I still don’t understand why you believe mechanistic transparency is hard. The way I want to use the term, laptops are, as far as I can tell, acceptably mechanistically transparent to the companies that create them. Do you think I’m wrong?
No, which is why I want to stop using the example.
(The counterfactual I was thinking of was more like “imagine we handed a laptop to 19th-century scientists: could they mechanistically understand it?” But even that isn’t a good analogy; it overstates the difficulty.)