I’ve written something like six or seven personal wikis over the past decade. It’s actually an incredibly advanced form of procrastination. At this point I’ve tried every possible design choice.
Lifecycle: I’ve built a few compiler-style wikis: plain-text files in a git repo statically compiled to HTML. I’ve built a couple using live servers with server-side rendering. The latest one is an API server with a React frontend.
Storage: I started with plain text files in a git repo, then moved to an SQLite database with a simple schema. The latest version is an avant-garde object-oriented hypermedia database with bidirectional links implemented on top of SQLite.
Markup: I used Markdown here and there. Then I built my own TeX-inspired markup language. Then I tried XML, with mixed results. The latest version uses a WYSIWYG editor made with ProseMirror.
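To make the "bidirectional links implemented on top of SQLite" idea from the Storage item concrete, here is a minimal sketch. It is not Fernando's actual schema: the table names (page, link), the column names and the helper functions are all hypothetical. The point is only that a single link table, indexed in both directions, answers "what does this page link to?" and "what links here?" equally cheaply.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not taken from the essay.
SCHEMA = """
CREATE TABLE page (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL UNIQUE
);
CREATE TABLE link (
    src INTEGER NOT NULL REFERENCES page(id),
    dst INTEGER NOT NULL REFERENCES page(id),
    PRIMARY KEY (src, dst)           -- covers lookups by source page
);
CREATE INDEX link_dst ON link(dst);  -- covers lookups by target page (backlinks)
"""


def open_wiki():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_page(conn, title):
    """Insert a page and return its row id."""
    cur = conn.execute("INSERT INTO page (title) VALUES (?)", (title,))
    return cur.lastrowid


def add_link(conn, src, dst):
    """Record a link once; repeated inserts of the same pair are ignored."""
    conn.execute("INSERT OR IGNORE INTO link (src, dst) VALUES (?, ?)", (src, dst))


def outgoing(conn, page_id):
    """Titles of the pages that page_id links to."""
    rows = conn.execute(
        "SELECT p.title FROM link l JOIN page p ON p.id = l.dst "
        "WHERE l.src = ? ORDER BY p.title",
        (page_id,),
    )
    return [r[0] for r in rows]


def backlinks(conn, page_id):
    """Titles of the pages that link to page_id."""
    rows = conn.execute(
        "SELECT p.title FROM link l JOIN page p ON p.id = l.src "
        "WHERE l.dst = ? ORDER BY p.title",
        (page_id,),
    )
    return [r[0] for r in rows]
```

A link is stored once and read from either end, so "bidirectional" here is a property of the queries rather than of the stored data.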
And yet I don’t use them. Why? Building them was fun, sure, but there must be utility to a personal database.
At first I thought the problem was friction: the higher the activation energy to using a tool, the less likely you are to use it. Even a small amount of friction can cause me to go, oh, who cares, can’t be bothered. So each version gets progressively more frictionless. The latest version uses a WYSIWYG editor built on top of ProseMirror (it took a great deal for me to actually give in to WYSIWYG). It also has a link to the daily note page, to make journalling easier. The only friction is in clicking the bookmark to localhost:5000. It is literally two clicks to get to the daily note.
And yet I still don’t use it. Why? I’m a great deal more organized now than I was a few years ago. My filesystem is beautifully structured and everything is where it should be. I could fill out the contents of a personal wiki.
I’ve come to the conclusion that there’s no point: because everything I can do with a personal wiki I can do better with a specialized app, and the few remaining use cases are useless. Let’s break it down.
I’ve tried three different times to create a personal wiki, using the last one for a solid year and a half before finally giving up and defaulting to a janky combination of Notion and Google Docs/Sheets. What seduced me each time were sites like Cosma Shalizi’s and Gwern’s long content philosophy (emphasis mine):
… I have read blogs for many years and most blog posts are the triumph of the hare over the tortoise. They are meant to be read by a few people on a weekday in 2004 and never again, and are quickly abandoned, and perhaps as Assange says, not a moment too soon. (But isn’t that sad? Isn’t it a terrible ROI for one’s time?) On the other hand, the best blogs always seem to be building something: they are rough drafts, works in progress. So I did not wish to write a blog. Then what? More than just “evergreen content”, what would constitute Long Content as opposed to the existing culture of Short Content? How does one live in a Long Now sort of way?
My answer is that one uses such a framework to work on projects that are too big to work on normally or too tedious. (Conscientiousness is often lacking online or in volunteer communities and many useful things go undone.) Knowing your site will survive for decades to come gives you the mental wherewithal to tackle long-term tasks like gathering information for years, and such persistence can be useful: if one holds onto every glimmer of genius for years, then even the dullest person may look a bit like a genius himself. (Even experienced professionals can only write at their peak for a few hours a day, usually first thing in the morning, it seems.) Half the challenge of fighting procrastination is the pain of starting; I find when I actually get into the swing of working on even dull tasks, it’s not so bad. So this suggests a solution: never start. Merely have perpetual drafts, which one tweaks from time to time. And the rest takes care of itself.
Fernando unbundles the use cases of a tool for thought in his essay; I’ll just quote the part that resonated with me:
The following use cases are very naturally separable: …
Learning: if you’re studying something, you can keep your notes in a TfT. This is one of the biggest use cases. But the problem is never note-taking, but reviewing notes. Over the years I’ve found that long-form lecture notes are all but useless, not just because you have to remember to review them on a schedule, but because spaced repetition can subsume every single lecture note. It takes practice and discipline to write good spaced repetition flashcards, but once you do, the long-form prose notes are themselves redundant.
(Tangentially, an interesting example of how much spaced repetition can subsume is Michael Nielsen’s Using spaced repetition systems to see through a piece of mathematics. In it he describes how he used “deep Ankification” to better understand the theorem that a complex normal matrix is always diagonalizable by a unitary matrix, as an illustration of a heuristic for deepening one’s understanding of a piece of mathematics in an open-ended way, inspired by Andrey Kolmogorov’s essay on, of all things, the equals sign. I wish I had read that while I was still studying physics in school.)
Fernando, emphasis mine:
So I often wonder: what do other people use their personal knowledge bases for? And I look up blog and forum posts where Obsidian and Roam power users explain their setup. And most of what I see is junk. It’s never the Zettelkasten of the next Vannevar Bush, it’s always a setup with tens of plugins, a daily note three pages long that is subdivided into fifty subpages recording all the inane minutiae of life. This is a recipe for burnout.
People have this aspirational idea of building a vast, oppressively colossal, deeply interlinked knowledge graph to the point that it almost mirrors every discrete concept and memory in their brain. And I get the appeal of maximalism. But they’re counting on the wrong side of the ledger. Every node in your knowledge graph is a debt. Every link doubly so. The more you have, the more in the red you are. Every node that has utility—an interesting excerpt from a book, a pithy quote, a poem, a fiction fragment, a few sentences that are the seed of a future essay, a list of links that are the launching-off point of a project—is drowned in an ocean of banality. Most of our thoughts appear and pass away instantly, for good reason.
Minimizing friction is surprisingly difficult. I keep plain-text notes in a hierarchical editor (cherrytree), but even that feels too complicated sometimes. This is not just about the tool… what you actually need is a combination of the tool and the right way to use it.
(Every tool can be used in different ways. For example, suppose you write a diary in MS Word. There are still options such as “one document per day” or “one very long document for all”, and things in between like “one document per month”, which all give different kinds of friction. The one megadocument takes too much time to load. It is more difficult to search in many small documents. Or maybe you should keep your current day in a small document, but once in a while merge the previous days into the megadocument? Or maybe switch to some application that starts faster than MS Word?)
Forgetting is an important part. Even if you want to remember forever, you need some form of deprioritizing. Something like “pages you haven’t used for months will get smaller, and if you search for keywords, they will be at the bottom of the result list”. But if one of them suddenly becomes relevant again, maybe the connected ones become relevant, too? Something like associations in the brain. The idea is that remembering the facts is only a part of the problem; making the relevant ones more accessible is another. Because searching in too much data is ultimately just another kind of friction.
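The deprioritizing idea can be sketched in a few lines. This is only one possible reading of it, under stated assumptions: the Page class, the 90-day half-life, the neighbour share of 0.5 and the function names are all made up for illustration. Pages that match a keyword are ranked by how recently they were used, and a page that has gone stale is lifted again when a page it links to has just been touched.

```python
from dataclasses import dataclass, field

HALF_LIFE_DAYS = 90.0   # assumed: a page's weight halves every 90 unused days
NEIGHBOUR_SHARE = 0.5   # assumed: a page inherits half of its freshest neighbour's weight


@dataclass
class Page:
    title: str
    text: str
    last_used_day: float                     # day number of the last visit
    links: set = field(default_factory=set)  # titles of connected pages


def recency(page, today):
    """Weight in (0, 1]: 1.0 if used today, halving every HALF_LIFE_DAYS."""
    age = max(0.0, today - page.last_used_day)
    return 0.5 ** (age / HALF_LIFE_DAYS)


def touch(page, today):
    """Mark a page as used, which also revives the pages linking to it."""
    page.last_used_day = today


def search(pages, keyword, today):
    """Titles of pages containing keyword, most relevant first."""
    by_title = {p.title: p for p in pages}
    scored = []
    for p in pages:
        if keyword.lower() not in p.text.lower():
            continue
        own = recency(p, today)
        neighbours = [recency(by_title[t], today) for t in p.links if t in by_title]
        boost = NEIGHBOUR_SHARE * max(neighbours, default=0.0)
        scored.append((max(own, boost), p.title))
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [title for _, title in scored]
```

Nothing is ever deleted in this sketch. Old pages sink to the bottom of the results and resurface when something connected to them becomes active again, which is roughly the "associations" behaviour described above.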
It feels like a smaller version of the internet. Years ago, the problem used to be “too little information”, now the problem is “too much information, can’t find the thing I actually want”.
Perhaps a wiki, where the pages could get flagged as “important now” and “unimportant”? Or maybe, important for a specific context? And by default, when you choose a context, you would only see the important pages, and the rest of that only if you search for a specific keyword or follow a grey link. (Which again would require some work creating and maintaining the contexts. And that work should also be as frictionless as possible.)
@dkl9 wrote a very eloquent and concise piece arguing in favor of ditching “second brain” systems in favor of SRSs (Spaced Repetition Systems, such as Anki).
Try as you might to shrink the margin with better technology, recalling knowledge from within is necessarily faster and more intuitive than accessing a tool. When spaced repetition fails (as it should, up to 10% of the time), you can gracefully degrade by searching your SRS’ deck of facts.
If you lose your second brain (your files get corrupted, a cloud service shuts down, etc), you forget its content, except for the bits you accidentally remember by seeing many times. If you lose your SRS, you still remember over 90% of your material, as guaranteed by the algorithm, and the obsolete parts gradually decay. A second brain is more robust to physical or chemical damage to your first brain. But if your first brain is damaged as such, you probably have higher priorities than any particular topic of global knowledge you explicitly studied.
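The 90% figure comes from how spaced repetition schedulers pick review dates: each card is shown again roughly when the predicted chance of recalling it has dropped to a target level. Here is a minimal sketch of that arithmetic, assuming a simple exponential forgetting curve. The model, the stability parameter and the function names are assumptions for illustration; real schedulers such as Anki's use their own formulas.

```python
import math


def retention(days_since_review, stability_days):
    """Assumed model: recall probability decays exponentially with time."""
    return math.exp(-days_since_review / stability_days)


def next_interval(stability_days, target_retention=0.9):
    """Days to wait until predicted recall falls to target_retention."""
    return -stability_days * math.log(target_retention)
```

Under this model, a card with a stability of 30 days comes due after a little over 3 days, and each successful review would be expected to raise the stability and so stretch the next interval.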
I write for only these reasons:
to help me think
to communicate and teach (as here)
to distill knowledge to put in my SRS
to record local facts for possible future reference
Linear, isolated documents suffice for all those purposes. Once you can memorise well, a second brain becomes redundant tedium.
I like to think of learning and all of these things as smaller self-contained knowledge trees. Building knowledge trees that are cached, almost like creating zip files and systems where I store a bunch of zip files, similar to what Eliezer talks about in The Sequences.
Like when you mention the thing about Nielsen on linear algebra, it opens up the entire thought tree there. I might just get the association to something like PCA and then I think huh, how to optimise this, and then it goes to QR algorithms and things like a Householder matrix and some specific symmetric properties of linear spaces...
If I have enough of these in an area then I might go back to my Anki for that specific area. Like if you think from the perspective of scheduling and storage algorithms similar to what is explored in Algorithms to Live By, you quickly understand that the magic is in information compression and working at different meta-levels. Zipped zip files with algorithms to expand them if need be. Dunno if that makes sense, agree with the exobrain creep that exists though.
Unbundling Tools for Thought is an essay by Fernando Borretti I found via Gwern’s comment which immediately resonated with me (emphasis mine):