OK, but I didn’t even know that “apt-cache search” existed until you just mentioned it. I like the idea of Debian but I usually get stuck using CentOS. So I think that might count as evidence in favor of doing something in the ballpark of what I suggested, just with the right filter on what’s included and excluded? Probably much of the value of the deck would be (1) figuring out which things are useful enough that knowing about them would seriously help someone who had forgotten them or never knew them in the first place, and (2) making those decisions available for other people to memorize without having to do all the prioritization and searching.
Also, that example seems to confirm the idea that some kind of overview of “library science and reference searching and help navigation” techniques would be really useful. When I punch “library” into Anki there are only two decks so far, one for the built-in functions of Python and one connecting locations on a map to their Chinese names. Both come up because they mention a URL that has “library” in the directory structure, not because they’re aimed at what I want.
So I suspect that no user of Anki with enough skill to make and publish decks has thought about these issues for very long? Or I’m on the wrong track with my thinking? Perhaps Anki can’t really improve on the XKCD problem-solving flowchart? Hmmm...
Probably much of the value of the deck would be (1) figuring out which things are useful enough that knowing about them would seriously help someone who had forgotten them or never knew them in the first place, and (2) making those decisions available for other people to memorize without having to do all the prioritization and searching.
My personal impression is that tool usage seems to follow a sort of exponential or power-law distribution: a few tools are used a ton of times, many are used only once or not at all, and there’s a reasonable middle ground of things used from time to time.
Tools in the first category are so common that SRS offers nothing. I don’t need a flashcard to remember ls or cd. Tools in the third category would be wasteful to put into SRS, per my previous comment about search being quick and taking less than 5 minutes.
The second category might be useful in SRS, but how do you generate the list beforehand? I don’t think you can. Everyone uses a different 80%. That sort of list can only be generated with heuristics like ‘if I use a tool twice, then memorize it’ or in retrospect (‘I used tools x, y, and z with frequency n, and the total search time for each was >5 minutes’).
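A minimal sketch of that retrospective heuristic, assuming a plain bash-style history file with one command per line (the cutoffs, the default history location, and ‘first token = the tool’ are all placeholder assumptions; sudo, pipes, and aliases would need real handling). The ranked counts it prints also let you eyeball the heavy-tailed distribution I described:

```python
# Sketch: tally command frequencies from shell history and flag
# middle-category tools as SRS candidates. Assumes a plain
# ~/.bash_history with one command per line; thresholds are arbitrary.
from collections import Counter
from pathlib import Path

HISTORY = Path.home() / ".bash_history"   # assumption: bash, default location
OVERLEARNED_TOP_N = 20                    # ls, cd, etc.: no flashcard needed
MIN_USES = 2                              # 'if I use a tool twice, memorize it'

counts = Counter()
for line in HISTORY.read_text(errors="ignore").splitlines():
    tokens = line.strip().split()
    if tokens:
        counts[tokens[0]] += 1            # crude: treat the first token as the tool

ranked = counts.most_common()
overlearned = {tool for tool, _ in ranked[:OVERLEARNED_TOP_N]}
candidates = [(tool, n) for tool, n in ranked
              if tool not in overlearned and n >= MIN_USES]

print(f"{len(candidates)} middle-category tools worth considering for SRS:")
for tool, n in candidates:
    print(f"  {tool}: used {n} times")
```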
That article says that each person uses 20% of the tools, not 80%. If everyone used a different 80% that would seem to imply at least a 60% overlap in usage for two people, and at least a 40% overlap for three. Probably the overlap would shrink more or less quickly depending on where a tool sits along the usage curve you proposed. It seems like there should be some way to at least estimate the expected value here...
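(To make those floors explicit: it’s just inclusion-exclusion, and it generalizes; k people each using a fraction p of the tools must share at least k·p − (k − 1) in the worst case. A few lines of Python as a sanity check; note that at the article’s actual 20% figure the floor drops to zero, which if anything supports the ‘everyone uses a different subset’ worry:)

```python
# Worst-case shared fraction for k users who each use a fraction p of the
# tools: the complements cover at most k*(1-p) of the space, so the common
# core is at least 1 - k*(1-p) = k*p - (k-1), floored at zero.
def min_shared_fraction(p: float, k: int) -> float:
    return max(0.0, k * p - (k - 1))

assert abs(min_shared_fraction(0.8, 2) - 0.6) < 1e-9  # two people: >= 60%
assert abs(min_shared_fraction(0.8, 3) - 0.4) < 1e-9  # three people: >= 40%
print(min_shared_fraction(0.2, 2))  # at the article's 20%: 0.0, no guaranteed overlap
```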
Maybe you could check package download statistics and history data (or something?) to see which things are most installed and most used. If you had data from many people I bet you’d find clusters in usage space that could turn into different sorts of flashcard decks, to help a person “join” a cluster or become capable of code-switching between computer usage styles? Or another use might be to help someone who takes an N-year hiatus from coding and doesn’t want to get too rusty at the keyboard?
Maybe you could check package download statistics and history data (or something?) to see which things are most installed and most used. If you had data from many people I bet you’d find clusters in usage space that could turn into different sorts of flashcard decks, to help a person “join” a cluster or become capable of code-switching between computer usage styles?
AFAIK, the Popcons all heavily anonymize their data down to ‘installed or not’, and don’t include anything useful for clustering. (This is reasonable: with tens of thousands of packages and whatever power laws or distributions are involved, it’d only take a few idiosyncratic package installations to break privacy.)
So maybe clusters would be efficient enough (although, keeping in mind my 5-minute rule and the point about it being very easy to search for programs, I still think it’s unlikely), but currently I don’t know of any way to generate them.
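For what it’s worth, if per-user usage counts did exist, the clustering step itself would be routine. A sketch on entirely fabricated data (the user-by-tool matrix below is made up, since as noted nothing like it is published, and k-means is just one arbitrary choice of method):

```python
# Sketch: cluster users by tool-usage profile, on fabricated data.
# Real per-user counts don't exist publicly (Popcon only says installed-or-not),
# so everything below the imports is an illustration, not a pipeline.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_users, n_tools = 200, 50

# Fabricate two usage styles: 'sysadmin-ish' (heavy on the first 25 tools)
# and 'developer-ish' (heavy on the last 25), plus Poisson noise.
style = rng.integers(0, 2, size=n_users)
base = np.where(style[:, None] == 0,
                np.r_[np.full(25, 5.0), np.full(25, 0.5)],
                np.r_[np.full(25, 0.5), np.full(25, 5.0)])
counts = rng.poisson(base)

# Normalize to usage frequencies so heavy users don't dominate, then cluster.
freqs = counts / counts.sum(axis=1, keepdims=True).clip(min=1)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(freqs)

# Each cluster's most characteristic tools would seed one flashcard deck.
for c in range(2):
    profile = freqs[labels == c].mean(axis=0)
    top = np.argsort(profile)[::-1][:5]
    print(f"cluster {c}: top tool indices {top.tolist()}")
```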