Any type of self-improvement in an unaligned AGI = death. And if it’s already better than human level, it might not even need to do any self-improvement; it could just escape our control, and we’re dead. So I think the “suicide” framing is quite a bit of hyperbole, or at least stated poorly relative to the rest of the conceptual landscape here.
If the AGI is aligned when it self-improves through algorithmic refinement, reflective stability should probably cause it to stay aligned afterward, and we just have a faster benevolent superintelligence.
So this concern is one more route to self-improvement, and there’s a big question of how good a route it is.
My points were:
Learning is at least as important as runtime speed. Refining networks to algorithms helps with speed but destroys learning.
Writing poems, and most cognitive activity, will very likely not resolve to a more efficient algorithm like arithmetic does. Arithmetic is a special case; perception and planning in varied environments require broad semantic connections. Networks excel at those. Algorithms do not.
So I take this to be a minor, not a major, concern for alignment, relative to others.
“So I take this to be a minor, not a major, concern for alignment, relative to others.”
Oh sure, this was more a “look at this cool thing intelligent machines could do that should shut up people from saying things like ‘foom is impossible because training runs are expensive’”.
“Learning is at least as important as runtime speed. Refining networks to algorithms helps with speed but destroys learning.”
“Writing poems, and most cognitive activity, will very likely not resolve to a more efficient algorithm like arithmetic does. Arithmetic is a special case; perception and planning in varied environments require broad semantic connections. Networks excel at those. Algorithms do not.”
Please don’t read this as me being hostile, but… why? How sure can we be of this? How sure are you that things-better-than-neural-networks are not out there?
Do we have any (non-trivial) equivalent algorithm that works better inside an NN than in code?
Btw, I am no neuroscientist, so I could be missing a lot of the intuitions you have.
At the end of the day, you seem to think that it is possible to fully interpret and reverse engineer neural networks, but you just don’t believe that Good Old Fashioned AGI can exist and/or be better than training NN weights?
I haven’t justified either of those statements; I hope to make the complete arguments in upcoming posts. For now I’ll just say that human cognition is solving tough problems, and there’s no good reason to think that algorithms would be lots more efficient than networks in solving those problems.
I’ll also reference Moravec’s paradox as an intuition pump: things that are hard for humans, like chess and arithmetic, are easy for computers (algorithms); things that are easy for humans, like vision and walking, are hard for algorithms.
I definitely do not think it’s pragmatically possible to fully interpret or reverse engineer neural networks. I think it’s possible to do it adequately to create aligned AGI, but that’s a much weaker criterion.
Sorry for taking long to get back to you.
Please fix (or remove) the link.
Done, thanks!