Early rationalist writing on the threats of unaligned AGI emerged from thinking about GOFAI systems, which were supposed to operate on rationalist or logical thought processes: everything explicitly coded and transparent. Within this framework, if an AI system operates on pure logic, you had better specify a goal that leaves no loopholes. In other words, the AI would follow its laws exactly as written, not as intended by fuzzy human minds or the spirit animating them. Since the early alignment researchers could not figure out how to logically specify human values and goals in a form a symbolic AI could parse without leaving loopholes large enough to threaten humanity, they grew pessimistic about the whole prospect of alignment. That pessimism infected the field and remains with us today, even after the rise of deep learning and deep neural networks.
I will leave you with a quote from Eliezer Yudkowsky that I believe encapsulates this old view of how AI, and alignment, were supposed to work.
Most of the time, the associational, similarity-based architecture of biological neural structures is a terrible inconvenience. Human evolution always works with neural structures—no other type of computational substrate is available—but some computational tasks are so ill-suited to the architecture that one must turn incredible hoops to encode them neurally. (This is why I tend to be instinctively suspicious of someone who says, ‘Let’s solve this problem with a neural net!’ When the human mind comes up with a solution, it tends to phrase it as code, not a neural network. ‘If you really understood the problem,’ I think to myself, ‘you wouldn’t be using neural nets.’) - Contextualizing seed-AI proposals