This means that if you choose one item from the list and try to work on it, you can’t be very confident that your work will eventually contribute to a robust plan.
This point is very interesting (and in my opinion accurate). I agree that Rohin’s and Evan’s plans point in the direction of possible robust breakdowns of the alignment problem. I also have the sense that to this day nobody has definitively broken down the alignment problem into even two absolutely separable sub-problems, in a way that has stood the test of time. I am taking “separable” to mean that someone can work on one subproblem without really considering the other subproblems almost at all. In the broader economy, it seems to be exactly these kinds of breakdowns that have allowed humans to be so effective. I have the sense that something about the alignment problem is resistant to being broken down in this way.