At some point you have to deal with the fact that understanding the world entails knowing lots and lots of stuff—things like “tires are usually black”, or “it’s gauche to wear white after Labor Day”, etc.
There seem to be only two options:
1. Humans manually type in “tires are usually black” and zillions more things like that. This is very labor-intensive, if it’s possible at all. Cyc is the famous example along these lines. Davidad’s recent proposal is that we should try to do this.
2. A learning algorithm infers zillions of regularities in the world, like the fact that tires are usually black. That’s the deep learning approach, but there are also many non-deep-learning approaches in this category. I think conventional wisdom (which I happen to share) is that this category is the only one that might actually get to powerful AGI. And I don’t see how this category can be compatible with “creating a system that is just inherently transparent to the operators”, because the AGI will do different things depending on its “knowledge”, i.e. the giant collection of regularities it has discovered, which are (presumably) unlabeled by default and probably a giant mess of things vaguely like “PATTERN 87462: IF BOTH PATTERN 24953 AND PATTERN 758463 ARE SIMULTANEOUSLY ACTIVE RIGHT NOW THEN IT’S MARGINALLY MORE LIKELY THAT PATTERN 217364 WILL BE ACTIVE SOON”, or whatever. And then the AGI does something, and humans have their work cut out for them figuring out why.
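To make the contrast between the two options a bit more concrete, here’s a minimal, purely illustrative Python sketch (the class, field names, and numeric IDs are all invented for illustration, not anyone’s actual system):

```python
from dataclasses import dataclass
from typing import Tuple

# Option 1 (Cyc-style): humans hand-enter labeled, human-readable facts.
hand_written_knowledge = [
    ("tire", "usually_has_color", "black"),
    ("wearing white after Labor Day", "is", "gauche"),
]

# Option 2 (learned): the system accumulates regularities between internal
# pattern IDs. Nothing in the data structure says what pattern 87462 "means".
@dataclass
class LearnedRegularity:
    if_active: Tuple[int, ...]  # pattern IDs that are currently active together
    then_more_likely: int       # pattern ID whose activation becomes marginally more likely
    strength: float             # small learned weight, e.g. from co-occurrence statistics

learned_knowledge = [
    LearnedRegularity(if_active=(24953, 758463), then_more_likely=217364, strength=0.03),
    # ...zillions more entries like this, none of them labeled in human terms
]
```

The point is just that the first list is legible by construction, whereas in the second case the interpretability work hasn’t even started until someone figures out what the numeric pattern IDs refer to.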
There might be a middle way between these—I think the probabilistic programming people might describe their roadmap-to-AGI that way?—but I don’t understand those kinds of plans, or if I do, then I don’t believe them.
I think the second option still allows for powerful AGI that’s more explainable than current AI, in the same way that humans can kind of explain their decisions to each other, but not very well at the level of neuroscience.
If something like natural abstractions is real, then this would get easier. I have a hard time not believing a weak version of this (e.g. human and AGI neuron structures could be totally different, but they’d both end up with some basic things like “the concept of 1”).