Early rationalist writing on the threats of unaligned AGI emerged from thinking about GOFAI systems, which were supposed to operate on rationalist or logical thought processes: everything explicitly coded and transparent. Within this framework, if an AI system operates on pure logic, you had better specify a goal that leaves no loopholes. In other words, the AI would follow its laws exactly as written, not as intended by fuzzy human minds or the spirit animating them. Since the early alignment researchers could not figure out how to logically specify human values and goals in a form a symbolic AI could parse without leaving loopholes large enough to threaten humanity, they grew pessimistic about the whole prospect of alignment. That pessimism infected the field and remains with us today, even after the rise of deep learning and deep neural networks.
I will leave you with a quote from Eliezer Yudkowsky that I believe encapsulates this old view of how AI, and alignment, were supposed to work.
Most of the time, the associational, similarity-based architecture of biological neural structures is a terrible inconvenience. Human evolution always works with neural structures—no other type of computational substrate is available—but some computational tasks are so ill-suited to the architecture that one must turn incredible hoops to encode them neurally. (This is why I tend to be instinctively suspicious of someone who says, ‘Let’s solve this problem with a neural net!’ When the human mind comes up with a solution, it tends to phrase it as code, not a neural network. ‘If you really understood the problem,’ I think to myself, ‘you wouldn’t be using neural nets.’) - Contextualizing seed-AI proposals