Let’s buy out Cyc, for use in AGI interpretability systems?
I hear that we’re supposed to brainstorm ways to spend massive amounts of money to advance AGI safety. Well, here’s a brainstorm (sorry if it’s stupid; I’m not an interpretability researcher):
We could pay Cycorp to open-source Cyc, so that researchers can incorporate it into future AGI interpretability systems.
What is Cyc? I’m not an expert. Most of what I know about Cyc comes from this podcast interview, and Wikipedia, and the LW wiki, and Eliezer Yudkowsky dismissing Cyc as not on the path to AGI. I agree, by the way: I don’t think that Cyc is on the path to AGI. (I’m not even sure if it’s trying to be.) But it doesn’t matter, that’s not why I’m talking about it.
Anyway, Cyc is a language for expressing “knowledge” (e.g. water will fall out of an upside-down cup), and a super-giant database of such “knowledge”, hand-coded over the course of >1000 person-years and >35 wall-clock years by a team of tireless employees whom I do not envy.
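For a flavor of what that language looks like: Cyc’s representation language, CycL, writes assertions in a Lisp-like syntax over disambiguated constants. Here’s my paraphrase of the style, held as Python strings; the constant names are invented for illustration, not verbatim entries from the knowledge base:

```python
# CycL-flavored assertions, held as Python strings. These paraphrase the
# style (Lisp-like syntax, #$-prefixed constants); the constant names are
# invented for illustration, not verbatim entries from the Cyc KB.
upside_down_cup = [
    "(#$isa #$DrinkingCup #$Container)",
    "(#$implies (#$orientedUpsideDown ?CUP) (#$flowsOutOf #$Water-Liquid ?CUP))",
]
```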
How expensive would this be? Beats me. I didn’t ask. (Wild guess: a one-time cost in the eight figures, based on Cycorp’s annual revenue of ~$5M according to the first hit on Google, which I didn’t double-check.)
Why might open-sourcing Cyc be helpful for AGI interpretability efforts? Well maybe it won’t. But FWIW, here’s what I was thinking…
When I imagine what a future interpretability system will look like, in general, I imagine an interface:

- The human-legible side of the interface consists of, maybe, human-understandable natural-language phrases, or pictures, or equations, or whatever.
- The trained-model side of the interface consists of stuff that’s happening in a big, complicated, unlabeled model built by ML.
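To make that picture a bit more concrete, here’s a minimal sketch (in Python) of what one “strand” of such an interface might look like. Everything in it is hypothetical, including the idea that a concept corresponds to a single direction in activation space; it’s one possible wiring, not a proposal:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class InterfaceLink:
    """One strand of the interface: a human-legible Cyc concept
    tentatively matched to something inside the trained model."""
    cyc_constant: str      # human-legible side, e.g. "#$ScreenDoor" (invented name)
    layer: int             # which layer of the model we think it lives in
    direction: np.ndarray  # trained-model side: a direction in activation space

    def activation(self, layer_activations: np.ndarray) -> float:
        # How strongly this concept fires on the current input, under the
        # big assumption that concepts correspond to linear directions.
        return float(layer_activations @ self.direction)
```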
Anyway, my surmise is that Cyc might be a good fit for the former (human-legible) side of this interface.
Specifically, the Cyc project has this massive structured knowledge system, and everything in that system is human-legible. It was, after all, built by humans!
You might say: natural language is human-legible too. Why not use that?
Well, for one thing, natural-language descriptions can be ambiguous. For example, a dictionary word may have dozens of definitions. In Cyc, such a word is disambiguated: it becomes dozens of different tokens for the dozens of different definitions and nuances. (…If I understand correctly; it came up in the interview.)
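For illustration, here’s roughly how that fan-out might look, with invented constant names that I haven’t checked against the actual Cyc KB:

```python
# Hypothetical mapping from one ambiguous English word to distinct
# Cyc-style constants, one per word sense. The constant names are
# invented for illustration, not checked against the real Cyc KB.
word_senses = {
    "bank": [
        "#$Bank-FinancialOrganization",  # the place that holds your money
        "#$RiverBank",                   # the land alongside a river
        "#$Banking-AircraftManeuver",    # what a plane does when it turns
    ],
}

# Each constant is unambiguous, so "knowledge" attached to #$RiverBank
# can never get mixed up with facts about financial institutions.
```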
For another thing, there’s a matching-across-the-interface challenge: we want to make sure that the right human-legible bits are matched to the corresponding trained-model bits (at least, insofar as such a thing is possible). Here, the massive Cyc knowledge database (i.e., millions of common-sense facts about the world, like “air can pass through a screen door”, or whatever else those poor employees had to input over the decades) would presumably come in handy. After building a link between things in the ML system and things in Cyc, we could check (somehow) that the ML system endorses all the “knowledge” in Cyc as correct. If that test passes, that’s a good sign that we’re matching the two things up properly; if it fails in some area, we’d want to go back and figure out what’s going on there. A sketch of what such a check might look like is below.
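Concretely, here’s a minimal sketch of that audit, assuming (hypothetically) that we have some `model_endorses` probe which uses the interface to score, from 0 to 1, how strongly the trained model agrees with a given Cyc assertion. Nothing here is a real API; it’s just the shape of the idea:

```python
# A sketch of the "does the model endorse Cyc's knowledge?" audit.
# `model_endorses` is a hypothetical probe that uses the interface links
# to score, from 0 to 1, how strongly the trained model agrees with a
# given assertion; it is not a real API.

def audit_interface(cyc_assertions, model_endorses, threshold=0.9):
    """Return assertions where the model and Cyc seem to disagree,
    i.e. the areas where our matching may have gone wrong."""
    mismatches = []
    for assertion in cyc_assertions:
        score = model_endorses(assertion)
        if score < threshold:
            mismatches.append((assertion, score))
    return mismatches

# Toy usage, with paraphrased CycL-flavored assertions (invented names):
assertions = [
    "(#$behaviorCapable #$Air #$PassingThroughPortal #$ScreenDoor)",
    "(#$genls #$ScreenDoor #$Door)",
]
flagged = audit_interface(assertions, model_endorses=lambda a: 0.95)
# An empty `flagged` list is (weak) evidence the matching is right; a
# cluster of failures in one area says "go back and look at that area".
```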
I don’t want to make too strong a statement though. At the end of the day, it seems to me that a natural-language interface to a future AGI interpretability system has both strengths and weaknesses compared to a Cyc interface. (And there are other possibilities too.) Hey, maybe we should have both! When it comes to interpretability, I say more dakka :-P