I’m not sure what predictions you’re making that are different from mine, other than maybe “a research program that skips NNs and just tries to build the representations that they build up directly without looking at NNs has reasonable chances of success.” Which doesn’t seem like one you’d actually want to make.
I think I would, actually, want to make this prediction. The problem is that I’d want to make it primarily in the counterfactual world where the NN approach had been abandoned and/or declared off-limits, since in any world where both approaches exist, I would also expect the connectionist approach to pay dividends faster (as has occurred in e.g. our own world). This doesn’t make my position inconsistent with the notion that a GOFAI-style approach is workable; it merely requires that I think such an approach requires more mastery and is therefore slower (which, for what it’s worth, seems true almost by definition)!
I do, however, think that “building the high-level representations”, despite being slower, would not be astronomically slower than using SGD on connectionist models (which is what you seem to be gesturing at, with claims like “for many (though not all) substantial learning tasks, it seems likely you will wait until the continents collide and the sun cools before you are able to find that algorithm”). To be fair, you did specify that you were talking about “decision-tree specific algorithms” there, which I agree are probably too crude to learn anything complex in a reasonable amount of time; but I don’t think the sentiment you express there carries over to all manner of GOFAI-style approaches (which is the strength of claim you would actually need for [what looks to me like] your overall argument to carry through).
(A decision-tree based approach would likely also take “until the continents collide and the sun cools” to build a working chess evaluation function from scratch, for example, but humans coded by hand what were, essentially, decision trees for evaluating positions, and achieved reasonable success until that approach was obsoleted by neural network-based evaluation functions. This seems like it reasonably strongly suggests that whatever the humans were doing before they started using NNs was not a completely terrible way to code high-level feature-based descriptions of chess positions, and that—with further work—those representations would have continued to be refined. But of course, that didn’t happen, because neural networks came along and replaced the old evaluation functions; hence, again, why I’d want primarily to predict GOFAI-style success in the counterfactual world where the connectionists had for some reason stopped doing that.)
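To make the chess example concrete, here is a minimal sketch of the kind of hand-coded, feature-based evaluation function I have in mind. The feature weights and helper names are illustrative assumptions on my part, not values taken from any real engine; classical engines combined dozens of such hand-picked terms.

```python
# A minimal sketch of a classical hand-coded chess evaluation function:
# a weighted sum of human-chosen features (material, mobility).
# All weights here are illustrative assumptions, not real engine values.

PIECE_VALUES = {"P": 100, "N": 320, "B": 330, "R": 500, "Q": 900}

def evaluate(white_pieces, black_pieces, white_mobility, black_mobility):
    """Score a position in centipawns from White's perspective.

    `white_pieces`/`black_pieces` are lists of piece letters
    (e.g. ["P", "P", "N", "Q"]); the mobility arguments count
    each side's legal moves.
    """
    # Feature 1: material balance, using hand-assigned piece values.
    material = (sum(PIECE_VALUES.get(p, 0) for p in white_pieces)
                - sum(PIECE_VALUES.get(p, 0) for p in black_pieces))
    # Feature 2: mobility, with a hand-picked weight of ~10cp per move.
    mobility = 10 * (white_mobility - black_mobility)
    return material + mobility
```

The point is that each term encodes a high-level, human-legible description of the position; refining such a function means adding and reweighting features by hand, which is exactly the slower-but-workable process described above.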