At least one of us is confused. It never occurred to me that the original comment was intended as a joke (except in so far as it’s a deliberate drastic oversimplification) and I don’t think I understand what you mean about caching being subsumed by naming (especially as the alleged hard problem is not caching but cache invalidation, which seems to me to have very little to do with naming).
I’m probably missing something here; could you explain your interpretation of the original comment a bit more? (With of course the understanding that explaining jokes tends to ruin them.)
cache invalidation—which seems to me to have very little to do with naming
I don’t agree with Douglas_Knight’s claim about the intent of the quote, but a cache is a kind of (application of a) key-value data structure. Keys are names. What information is in the names affects how long the cache entries remain correct and useful for.
(Correct: the value is still the right answer for the key. Useful: the entry will not be unused in the future, i.e. is not garbage in the sense of garbage-collection.)
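To make the "what information is in the names" point concrete, here is a rough Python sketch. Everything in it is made up for illustration (`render`, the two cache classes, the file name `notes.txt`); it only shows the tradeoff, assuming the cached value depends on a file's contents.

```python
import hashlib


def render(source: str) -> str:
    """Stand-in for an expensive computation (hypothetical)."""
    return source.upper()


class PathKeyedCache:
    """Key = path only. Stays small, but an entry goes stale when the file changes."""

    def __init__(self):
        self.entries = {}

    def get(self, path: str, source: str) -> str:
        if path not in self.entries:
            self.entries[path] = render(source)
        return self.entries[path]


class ContentKeyedCache:
    """Key = (path, hash of content). Never stale, but old entries pile up."""

    def __init__(self):
        self.entries = {}

    def get(self, path: str, source: str) -> str:
        key = (path, hashlib.sha256(source.encode("utf-8")).hexdigest())
        if key not in self.entries:
            self.entries[key] = render(source)
        return self.entries[key]


by_path = PathKeyedCache()
by_content = ContentKeyedCache()
for source in ("v1", "v2"):  # the same file, edited once
    from_path_cache = by_path.get("notes.txt", source)
    from_content_cache = by_content.get("notes.txt", source)
```

After the loop, the path-keyed cache still hands back the result for the old contents (not correct), while the content-keyed cache is right but is holding an entry for the old contents that nobody will ask for again (not useful).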
I agree that a cache can be thought of as involving names, but even if—as you suggest, and it’s a good point that I hadn’t considered in this context—you sometimes have some scope to choose how much information goes into the keys and hence make different tradeoffs between cache size, how long things are valid for, etc., it seems pretty strange to think of that as being about naming.
Well, as iceman mentioned on a different subthread, a content-addressable store (key = hash of value) is fairly clearly a sort of naming scheme. But the thing about the names in a content-addressable store is that unlike meaningful names, they say nothing about why this value is worth naming; only that someone has bothered to compute it in the past. Therefore a content-addressable store either grows without bound, or has a policy for deleting entries. In that way, it is like a cache.
For example, Git (the version control system) uses a content-addressable store, and has a policy that objects are kept only if they are referenced (transitively through other objects) by the human-managed arbitrary mutable namespace of “refs” (HEAD, branches, tags, reflog).
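A toy version of that policy, as a sketch only: this is not Git's object format or its actual gc, just a content-addressable dict plus a mutable `refs` namespace, with the names `ToyStore`, `put`, and `gc` invented for the example.

```python
import hashlib


class ToyStore:
    def __init__(self):
        self.objects = {}  # content hash -> (data, child hashes)
        self.refs = {}     # human-chosen mutable name -> content hash

    def put(self, data: bytes, children=()) -> str:
        """Store an object under the hash of its data and child keys."""
        children = tuple(children)
        h = hashlib.sha256(data)
        for child in children:
            h.update(b"\0" + child.encode("ascii"))
        key = h.hexdigest()
        self.objects[key] = (data, children)
        return key

    def gc(self) -> int:
        """Delete every object not reachable from some ref; return the count."""
        live = set()
        stack = list(self.refs.values())
        while stack:
            key = stack.pop()
            if key in live:
                continue
            live.add(key)
            stack.extend(self.objects[key][1])
        dead = [key for key in self.objects if key not in live]
        for key in dead:
            del self.objects[key]
        return len(dead)


store = ToyStore()
blob = store.put(b"hello")
tree = store.put(b"tree", [blob])
orphan = store.put(b"scratch")  # computed once, never named
store.refs["main"] = tree
removed = store.gc()
```

Here `orphan` is swept because nothing in `refs` leads to it, while `blob` survives only because `tree` points at it. The hash names say nothing about which objects matter; the meaningful names do all of that work.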
Tahoe-LAFS, a distributed filesystem which is partially content-addressable but in any case uses high-entropy names, requires that clients periodically “renew the lease” on files they are interested in keeping, which they do by recursive traversal from whatever roots the user chooses.
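The lease idea can be sketched the same way. To be clear, this is not Tahoe-LAFS's API; `LeaseStore`, `renew_from`, and `expire` are hypothetical names, and time is passed in explicitly so the example is deterministic.

```python
class LeaseStore:
    def __init__(self, lease_seconds: int):
        self.lease_seconds = lease_seconds
        self.objects = {}  # key -> (child keys, lease expiry time)

    def put(self, key: str, children, now: int) -> None:
        """Store an entry with a fresh lease."""
        self.objects[key] = (tuple(children), now + self.lease_seconds)

    def renew_from(self, root: str, now: int) -> None:
        """Client side: walk from a chosen root and renew every lease found."""
        stack, seen = [root], set()
        while stack:
            key = stack.pop()
            if key in seen or key not in self.objects:
                continue
            seen.add(key)
            children, _ = self.objects[key]
            self.objects[key] = (children, now + self.lease_seconds)
            stack.extend(children)

    def expire(self, now: int) -> list:
        """Server side: drop entries whose lease has run out."""
        dead = [k for k, (_, expiry) in self.objects.items() if expiry <= now]
        for key in dead:
            del self.objects[key]
        return sorted(dead)


leases = LeaseStore(lease_seconds=10)
leases.put("dir", ["file-a"], now=0)
leases.put("file-a", [], now=0)
leases.put("file-b", [], now=0)   # nobody renews this one
leases.renew_from("dir", now=8)
expired = leases.expire(now=12)
```

The difference from the reachability version is who decides: the store never inspects any namespace, it just forgets whatever no client has vouched for recently.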