Preface to the Sequence on Factored Cognition
Factored Cognition is primarily studied by Ought, the same organization that was partially credited for implementing the interactive prediction feature. Ought has at least five members who have worked on the problem for several years; I am a single person who just finished a master’s degree. My rationale for writing about the topic anyway is diversity of approaches: Ought is primarily doing empirical work, whereas I’ve studied the problem through the lens of math and epistemic rationality. As far as I know, there is virtually no overlap between what I’ve written and what Ought has published so far.
Was it successful? Well, all I can say for sure is that writing the sequence has significantly changed my own views.
This sequence has two ‘prologue’ posts, which make points that are relevant to, but not restricted to, Factored Cognition. I think of them as posts #-2 and #-1 (this post is then #0, and the sequence proper starts at #1). These are
A guide to Iterated Amplification and Debate, which explains what Factored Cognition is and the two schemes that use it. This post ensures that the sequence requires no prerequisite knowledge. You can skip it if you’re already familiar with both schemes.
Hiding Complexity, which is about characterizing what makes a part of a big problem a ‘subproblem’.
The remaining sequence is currently about 15,000 words long, though this could change. The structure is roughly:
Define a mathematical model and see what we can do with that (posts #1-#2)
Tackle the human component: think seriously about how thinking works and whether solving hard problems with Factored Cognition looks feasible (posts #3-#5)
Spell out what I conclude from both parts (post #6)
The current version of the sequence includes exercises. This is pretty experimental, so if they are too hard or too easy, it’s probably my fault. I’ve still left them in because I generally think it makes sense to include ‘think about this thing for a bit’ moments. They look like this:
EXERCISE (5 SECONDS): Compute 2+5.
Whenever there’s a range, the lower number is an upper bound for the exercise itself, and the remaining time is for rereading parts of this or previous posts. So ‘1-6 minutes’ means ‘you shouldn’t take more than 1 minute for the exercise itself, but you may first take about 5 minutes to reread parts of this post or of previous ones’.
The sequence also contains conjectures. Conjectures are claims that I think are true, important, and not trivial. There are only a few of them, and they should all be justified by the sequence up to that point.
I’ll aim to publish one post per week, which gives me time for final edits. This could slow down, since I’m still working on the second half. Questions and criticism are welcome.
Special thanks to TurnTrout for providing valuable feedback on much of the sequence.
Planned summary for the Alignment Newsletter:
Planned opinion:
This is an accurate summary, minus one detail:
“True or not” makes it sound symmetrical, but the choice is between ‘very confident that it’s true’ and ‘anything else’. Something like ‘80% confident’ goes into the second category.
One thing I would like added is that I come out moderately optimistic about Debate. It’s not too difficult for me to imagine the counterfactual world where I think about FC and find reasons to be pessimistic about Debate, so I take the fact that I didn’t as non-zero evidence.
Changed to “The judge decides the winner based on whether they can confidently verify the final statement or not.”
Added a line to the end of the summary.
Cool, thanks.
Re personal opinion: what is your take on the feasibility of human experiments? It seems like your model is compatible with IDA working out even though no one can ever demonstrate something like ‘solve the hardest exercise in a textbook’ using participants with limited time who haven’t read the book.
Yeah, that seems right to me; I don’t really expect to see us solving hard exercises in a textbook with a small number of humans without any additional tricks. I don’t think Ought did either; from pretty early on, they were talking about strategies for building larger trees, e.g. automated decomposition, or caching/memoization of strategies, possibly using ML.
In addition, I think Ought historically has pursued the strategy “try the thing that, if successful, would allow us to build a safety story”, rather than “try the thing that, if it fails, implies that factored cognition would not work out”, which is why they talk about particularly challenging tasks like solving the hardest exercise in a textbook.