Reading this with my theoretical computer scientist hat on, the whole thing feels rather fuzzy. It reads like the informal, throwing-ideas-around level of discussion with numbers put on the paragraphs to make it seem deep and formal. The hat-influenced persona is mostly upset that the concept of AGI is such a magical anthropomorphic creature here, when Numbers Being Used sets up the expectation for a Crisp Reductive Formalization to make an appearance.
There should probably be a chapter at the beginning, stating, “We define an AGI to be a system with such-and-such properties”, to ground and set up the ensuing discussion.
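To illustrate the sort of grounding I have in mind (this is purely my own sketch, in the spirit of Legg and Hutter’s universal intelligence measure, not anything taken from the paper), even a schematic definition would help: treat an agent as a policy mapping interaction histories to actions, and score it by its expected performance over a weighted class of environments,

\pi : (\mathcal{A} \times \mathcal{O})^* \to \mathcal{A}, \qquad \Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu},

where V^{\pi}_{\mu} is the expected total reward of \pi in environment \mu and K(\mu) is the environment’s Kolmogorov complexity. Declaring “an AGI is a system whose \Upsilon exceeds some threshold over a broad environment class E” would at least pin down what the later chapters are quantifying over.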
The other bit of troublesome magical hand-waving is the part where the problems of translating human concepts, desires and intentions into machine formalism come up. The theoretical computer scientist hat is quite familiar with problems of formal specification, but being a piece of headgear with sinister psychic powers, it has no first-hand familiarity with human concepts, desires and intentions, and would like to see a formal presentation of them.
The problem here, of course, is that we don’t have a good formal presentation for what goes on in human brain states, and things start to get rather tricky. This is basically the Alan Perlis epigram, “One can’t proceed from the informal to the formal by formal means”, except that now it looks like we need to. A direct attack won’t work, but the approach taken here, where the massive mess of getting from the informal human level in general to the formal AGI implementation level is mostly danced around, while rather specific problems of turning certain informal human concepts into particularly unfriendliness-engendering AGI designs are discussed in detail, feels like it leaves the discussion of those particulars rather groundless.
I don’t have a good answer to these problems. The basic questions of defining the AGI and of translating generalized human intentions into an AGI design are a big chunk of the whole friendly AI endeavor, and can’t be handled in the paper. But I still have the sense that some extra crispness is needed to convince a computer-science-literate reader that this is something worth paying attention to, rather than make-work prognostication around an ill-defined concept.
The thing with most engineering-style literature, computer science included, is that while the work is both concerned with mechanical exactitude with respect to its subject matter and inexorably tied to the human creativity and judgment that produced that subject matter in the first place, it is generally quite silent on applying the standards of mechanical exactitude to the creativity-and-judgment half of the process. Some kind of precursor article that focuses on this, and on how an AGI project will need to break the Perlis epigram, which for most people is just an invisible background assumption not even worth stating, might make people less predisposed to seeing the AGI material itself as strange nonsense.
Agreed, I get basically the same feeling. On top of that, it seems to me that formalizing a fuzzily defined goal system, be it FAI or a paperclip maximizer, may well be impossible in practice (nobody can do it even in a toy model, given infinite computing power!), leaving us either with neat AIs that implement something like ‘maximize own future opportunities’ (the AI would have to be able to identify separate courses of action to begin with), or with altogether messy AIs (neural networks, cortical column networks, et cetera) to which none of the argument applies. If I put my speculative hat on, I can just as well make up an argument that the AI will be a Greenpeace activist, by considering what the simplest self-protective goal systems might be (and discarding the bias that the AI is self-aware in a man-like way).