If you’re talking about an AI general enough to answer interesting questions, something that doesn’t just recite knowledge from a database, but something that can actually solve problems by using and synthesizing information in novel ways (which I assume you are, if you’re talking about preventing it from turning the Earth into a supercollider by putting limits on its resource usage), then you would need to solve the additional problem of constraining what questions it’s allowed to answer — you don’t want someone asking it for the source code for some other type of AI, for example. I suspect that this part of the problem is FAI-complete.
If you could build an AI that did nothing but parse published articles to answer the question, “Has anyone said X?”, that would be very useful, and very safe. I worked on such a program (SemRep) at NIH. It works pretty well within the domain of medical journal articles.
If it could take one step more, and ask, “Can you find a set of one to four statements that, taken together, imply X?”, that would be a huge advance in capability, with little if any additional risk.
I added that capability to SemRep, but no one has ever used it, and it isn’t accessible through the web interface. (I introduced a switch that makes it dump its output as structured Prolog statements instead of as a flat file; you can then load them into a Prolog interpreter and ask queries, and it will perform Prolog inference.) In fact, I don’t think anyone else is aware that the capability exists; my former boss thought it was a waste of time and was angry with me for having spent a day implementing it, and has probably forgotten about it. It needs some refinement to work properly, because a search of, say, 100,000 article abstracts will find many conflicting statements. It needs to pick one of “A / not A” for every A found directly in an article, based on the number and quality of assertions found in favor of each.
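To make that concrete, here is a toy sketch of the kind of thing this enables. It is not SemRep’s actual code or output format — the relation names and data are invented, and it chains only two statements rather than up to four — but it shows the two pieces: vote “A / not A” for each extracted assertion, then look for combinations of surviving assertions that together imply something no single abstract states.

    from collections import defaultdict
    from itertools import product

    # Hypothetical extracted assertions: (subject, relation, object, polarity).
    # polarity True means the abstract asserts the triple; False, its negation.
    extracted = [
        ("drug_x", "inhibits", "protein_p", True),
        ("drug_x", "inhibits", "protein_p", True),
        ("drug_x", "inhibits", "protein_p", False),   # one dissenting abstract
        ("protein_p", "promotes", "disease_d", True),
    ]

    def resolve_conflicts(assertions):
        """Keep A or not-A for each triple, by majority of supporting assertions."""
        votes = defaultdict(int)
        for subj, rel, obj, polarity in assertions:
            votes[(subj, rel, obj)] += 1 if polarity else -1
        return {triple for triple, count in votes.items() if count > 0}

    def implied_by_pairs(facts):
        """Toy inference rule: 'x inhibits p' + 'p promotes d' => 'x may_treat d'."""
        derived = set()
        for (s1, r1, o1), (s2, r2, o2) in product(facts, facts):
            if r1 == "inhibits" and r2 == "promotes" and o1 == s2:
                derived.add((s1, "may_treat", o2))
        return derived

    facts = resolve_conflicts(extracted)
    print(("drug_x", "may_treat", "disease_d") in implied_by_pairs(facts))  # True

The real version does the inference step by loading the dumped Prolog facts into a Prolog interpreter and querying it, rather than with hand-written rules like this, but the shape of the problem is the same.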
How close do you have to get to natural language to do the search?
I’ve wondered whether a similar system could check legal systems for contradictions—probably a harder problem, but not as hard as full natural language.
Most of the knowledge used is in its ontology. It doesn’t try to parse sentences with categories like {noun, verb, adverb}; it uses categories like {drug, disease, chemical, gene, surgery, physical therapy}. It doesn’t categorize verbs as {transitive, intransitive, etc.}; it categorizes verbs as, e.g., {increases, decreases, is-a-symptom-of}. When you build a grammar (by hand) out of word categories that are this specific, most NLP problems disappear.
ADDED: It isn’t really a grammar, either—it grabs onto the most-distinctive simple pattern first, which might be the phrase “is present in”, and then says, “Somewhere to the left I’ll probably find a symptom, and somewhere to the right I’ll probably find a disease”, and then goes looking for those things, mostly ignoring the words in-between.
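Roughly like this, as a minimal sketch — the cue phrase is that kind of pattern, but the lexicons here are toy stand-ins for the ontology’s semantic categories:

    import re

    # Toy stand-ins for the ontology's semantic categories.
    SYMPTOMS = {"fatigue", "anemia", "fever"}
    DISEASES = {"lupus", "malaria", "leukemia"}

    def extract(sentence, cue="is present in"):
        """Anchor on the cue phrase, then look left for a symptom and right
        for a disease, mostly ignoring the words in between."""
        match = re.search(cue, sentence)
        if not match:
            return None
        left = re.findall(r"\w+", sentence[:match.start()].lower())
        right = re.findall(r"\w+", sentence[match.end():].lower())
        symptom = next((w for w in reversed(left) if w in SYMPTOMS), None)
        disease = next((w for w in right if w in DISEASES), None)
        if symptom and disease:
            return ("is-a-symptom-of", symptom, disease)
        return None

    print(extract("Severe fatigue is present in most patients with lupus."))
    # ('is-a-symptom-of', 'fatigue', 'lupus')

Because every word that matters is already typed as a symptom, disease, drug, and so on, there is very little left for a conventional parser to get wrong.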
I don’t know what you mean by ‘ontology’. I thought it meant the study of reality.
I can believe that the language in scientific research (especially if you limit the fields) is simplified enough for the sort of thing you describe to work.
See: http://en.wikipedia.org/wiki/Ontology_(information_science)
“In computer science and information science, an ontology is a formal representation of knowledge as a set of concepts within a domain, and the relationships between those concepts. It is used to reason about the entities within that domain, and may be used to describe the domain.”
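For instance, a miniature ontology in this sense might look like the following (the concepts and relations are invented for illustration; the medical ontology a system like SemRep relies on is vastly larger):

    # A miniature domain ontology: concepts with types, plus typed relations
    # between them.
    CONCEPTS = {
        "aspirin":  "drug",
        "headache": "symptom",
        "migraine": "disease",
    }

    RELATIONS = [
        ("aspirin", "treats", "headache"),
        ("headache", "is-a-symptom-of", "migraine"),
    ]

    # The point of the formal representation is that a program can reason with
    # it, e.g. list every drug that treats a symptom of a given disease.
    def drugs_for(disease):
        symptoms = {s for s, r, d in RELATIONS
                    if r == "is-a-symptom-of" and d == disease}
        return {s for s, r, o in RELATIONS
                if r == "treats" and o in symptoms and CONCEPTS[s] == "drug"}

    print(drugs_for("migraine"))  # {'aspirin'}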
If you’re talking about an AI general enough to answer interesting questions, something that doesn’t just recite knowledge from a database, but something that can actually solve problems by using and synthesizing information in novel ways (which I assume you are, if you’re talking about preventing it from turning the Earth into a supercollider by putting limits on its resource usage), then you would need to solve the additional problem of constraining what questions it’s allowed to answer
Nitpick: to some extent we already have weak AI that can, within very narrow knowledge bases, answer interesting novel questions. For example, the Robbins conjecture was proven using the assistance of an automated theorem prover. And Simon Colton made AIs that were able to make new interesting mathematical definitions and make conjectures about them (see this paper). There’s been similar work in biochemistry. So even very weak AIs can not only answer interesting questions but come up with new questions themselves.
you don’t want someone asking it for the source code for some other type of AI, for example.
Or control access (which you probably want to do for the source for any sort of AI, anyway).
(Are you sure you’re not searching for additional restrictions to impose until the problem becomes FAI-complete?)
But that is as easy as not being reckless with it. One can still deliberately crash a car, but cars are pretty safe.
The relative sizes of the spaces of safe and dangerous questions compare favorably to the relative sizes of the spaces of FAI and UFAI designs.
I suspect that this part of the problem is FAI-complete.
If so, that’s not necessarily an argument against an oracle AI. It still may be ‘as hard as creating FAI’, but only because FAI can be made through an oracle AI (and all other paths being much harder).