As an additional reason for the importance of tabooing “know”, note that I disagree with all three of your claims about what the model “knows” in this comment and its parent.
(The definition of “know” I’m using is something like “knowing X means possessing a mental model which corresponds fairly well to reality, from which X can be fairly easily extracted”.)
In the parent, is your objection that the trained AlphaZero-like model plausibly knows nothing at all?
The trained AlphaZero model knows lots of things about Go, in a comparable way to how a dog knows lots of things about running.
But the algorithm that gives rise to that model can know arbitrarily few things. (After all, the laws of physics gave rise to us, but they know nothing at all.)
Ah, understood. I think this is basically covered by talking about what the Go bot knows at various points in time, à la this comment—it seems pretty sensible to me to talk about knowledge as a property of the actual computation rather than the algorithm as a whole. But from your response there it seems that you think that this sense isn’t really well-defined.
I’m not sure what you mean by “actual computation rather than the algorithm as a whole”. I thought that I was talking about the knowledge of the trained model which actually does the “computation” of which move to play, and you were talking about the knowledge of the algorithm as a whole (i.e. the trained model plus the optimising bot).
On that definition, how does one train an AlphaZero-like algorithm without knowing the rules of the game and win condition?
The human knows the rules and the win condition. The optimisation algorithm doesn’t, for the same reason that evolution doesn’t “know” what dying is: neither are the types of entities to which you should ascribe knowledge.
Suppose you have a computer program that takes two neural networks, simulates a game of Go between them, determines the winner, and uses the outcome to modify the neural networks. It seems to me that this program has a model of the ‘Go world’, i.e. a simulator, and from that model you can fairly easily extract the rules and winning condition. Do you think that this is a model but not a mental model, or that it’s too exact to count as a model, or something else?
I’d say that this is too simple and programmatic to be usefully described as a mental model. The amount of structure encoded in the computer program you describe is very small, compared with the amount of structure encoded in the neural networks themselves. (I agree that you can have arbitrarily simple models of very simple phenomena, but those aren’t the types of models I’m interested in here. I care about models which have some level of flexibility and generality, otherwise you can come up with dumb counterexamples like rocks “knowing” the laws of physics.)
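To make concrete how little structure that outer program encodes, here’s a rough sketch of the kind of loop you’re describing (my own illustration, not code from anywhere; simulate_game and update_network are hypothetical placeholders standing in for a real Go simulator and a real learning step):

```python
import random

# Rough sketch of the outer training loop under discussion. The two helpers
# are placeholders: a real version would contain an actual Go simulator and
# an actual gradient update, but the loop itself stays this thin.

def simulate_game(net_a, net_b):
    """Placeholder for the Go simulator: +1 if net_a wins, -1 if net_b wins."""
    return random.choice([+1, -1])

def update_network(net, outcome):
    """Placeholder for the learning step: nudge a dummy parameter."""
    net["skill"] += 0.01 * outcome
    return net

def train(net_a, net_b, n_games=1000):
    # Everything the outer program encodes about Go lives inside simulate_game;
    # the rest is bookkeeping.
    for _ in range(n_games):
        outcome = simulate_game(net_a, net_b)
        net_a = update_network(net_a, outcome)
        net_b = update_network(net_b, -outcome)
    return net_a, net_b

if __name__ == "__main__":
    train({"skill": 0.0}, {"skill": 0.0})
```

Almost all of the interesting structure ends up in the networks themselves; the loop around them stays a few lines long no matter how strong the trained players become.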
As another analogy: would you say that the quicksort algorithm “knows” how to sort lists? I wouldn’t, because you can instead just say that the quicksort algorithm sorts lists, which conveys more information (because it avoids anthropomorphic implications). Similarly, the program you describe builds networks that are good at Go, and does so by making use of the rules of Go, but can’t do the sort of additional processing with respect to those rules which would make me want to talk about its knowledge of Go.
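For concreteness, the quicksort case looks like this (a standard textbook version, just to illustrate the point):

```python
# A standard quicksort. Saying "it sorts lists" describes everything it does;
# there is no further internal model of lists from which to extract knowledge.
def quicksort(xs):
    if len(xs) <= 1:
        return xs
    pivot, rest = xs[0], xs[1:]
    return (quicksort([x for x in rest if x < pivot])
            + [pivot]
            + quicksort([x for x in rest if x >= pivot]))

print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```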