DanielFilan comments on An Analytic Perspective on AI Alignment

DanielFilan 5 Mar 2020 20:34 UTC

More pessimistically, one could imagine that the reason why no one has thought very hard about it is because in practice, it doesn’t really help you that much to have a mechanistic understanding of a neural network in order to do useful work.

I think I just think the ‘market’ here is ‘inefficient’? Like, I think this just isn’t a thing that people have really thought of, and those who have thought of it have gained semi-useful insight into neural networks by doing similar things (e.g. figuring out that adding a picture of a baseball to a whale fin will cause a network to misclassify the image as a great white shark). It also seems to me that recognition tasks (as opposed to planning/reasoning tasks) are going to be the hardest to get this kind of mechanistic transparency for, and also the kinds of tasks where transparency is easiest and ML systems are best.

Therefore, understanding real-world intelligent machines requires mostly understanding the tricks they do to be compute-efficient, rather than understanding the mathematical underpinnings.

I think I understand what you mean here, but also think that there can be tricks that reduce computational cost that have some sort of mathematical backbone—it seems to me that this is common in the study of algorithms. Note also that we don’t have to understand all possible real-world intelligent machines, just the ones that we build, making the requirement less stringent.