I think that conceptual alignment research of the sort that Johannes is doing (and that I also am doing, which I call “deconfusion”) is just really difficult. It involves skills that are not explicitly taught anywhere, that you seem very unlikely to pick up through mentorship in traditional academia (including when doing theoretical CS or pure math PhDs), and that I only started wrapping my head around after mentorship from two MIRI researchers (mentorship I believe I was pretty lucky to get). Even then, I’ve spent a ridiculous amount of time on my own trying to tease out patterns and figure out a more systematic process for doing this.
Oh, and the more theoretical CS (and related math, such as mathematical logic) you know, the better you probably are at this. See how Johannes tries to create concrete models of the inchoate concepts in his head? Well, if you know the relevant theoretical CS and useful math, you don’t have to rebuild the mathematical scaffolding all by yourself.
I don’t have a good enough model of John Wentworth’s model of alignment research to pin down the differences, but I don’t think I learned all that much from John’s writings and the training sessions that were part of his MATS 4.0 training regimen, compared to the stuff I described above.
What <mathematical scaffolding/theoretical CS> do you think I am recreating? What observations did you use to make this inference? (These questions are not intended to imply any subtext.)
Well, if you know relevant theoretical CS and useful math, you don’t have to rebuild the mathematical scaffolding all by yourself.
I didn’t intend to imply that you in particular have mathematical scaffolding that you are recreating, although I expect it may well be the case (Pearlian causality, perhaps? I’ve been looking into it recently, and knowing Bayes nets clearly helps). I used “you” in the generic sense: in general, this is the case. I haven’t looked very deeply into the stuff you are doing, unfortunately; it is on my to-do list.
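To gesture at what I mean by Bayes nets being helpful here: the main thing Pearl’s framework buys you is a crisp distinction between observing a variable and intervening on it, and that distinction fits in a few lines of code. Below is a minimal sketch using the textbook sprinkler network with made-up illustrative numbers (nothing here is from your work or mine, it’s purely a toy):

```python
# A minimal sketch of the observational vs. interventional distinction from
# Pearlian causality, using the textbook sprinkler network
# (Rain -> Sprinkler, Rain -> WetGrass, Sprinkler -> WetGrass).
# All numbers are made-up illustrative probabilities.
from itertools import product

P_RAIN = {1: 0.2, 0: 0.8}                       # P(Rain)
P_SPRINKLER = {(1, 1): 0.01, (0, 1): 0.99,      # P(Sprinkler | Rain=1)
               (1, 0): 0.40, (0, 0): 0.60}      # P(Sprinkler | Rain=0)
P_WET_ON = {(1, 1): 0.99, (1, 0): 0.90,         # P(WetGrass=1 | Sprinkler, Rain)
            (0, 1): 0.80, (0, 0): 0.00}

def joint(rain, sprinkler, wet, do_sprinkler=None):
    """Joint probability of one world; do_sprinkler cuts the Rain -> Sprinkler edge."""
    if do_sprinkler is None:
        p_s = P_SPRINKLER[(sprinkler, rain)]             # observational regime
    else:
        p_s = 1.0 if sprinkler == do_sprinkler else 0.0  # surgical intervention
    p_w = P_WET_ON[(sprinkler, rain)] if wet else 1 - P_WET_ON[(sprinkler, rain)]
    return P_RAIN[rain] * p_s * p_w

def p_rain_given_sprinkler_on(do=False):
    """P(Rain=1 | Sprinkler=1), or P(Rain=1 | do(Sprinkler=1)), by enumeration."""
    num = den = 0.0
    for rain, wet in product((0, 1), repeat=2):
        p = joint(rain, 1, wet, do_sprinkler=1 if do else None)
        den += p
        num += p if rain == 1 else 0.0
    return num / den

# Seeing the sprinkler on is evidence against rain (rain suppresses the
# sprinkler), but *forcing* it on tells you nothing about the weather.
print(p_rain_given_sprinkler_on(do=False))  # ~0.006
print(p_rain_given_sprinkler_on(do=True))   # 0.2 (just the prior)
```

The point of the toy: conditioning on Sprinkler=1 flows backwards through the Rain -> Sprinkler edge and updates you about rain, while do(Sprinkler=1) severs that edge, so the interventional answer is just the prior. Having this machinery pre-built, rather than re-deriving it from scratch, is exactly the kind of scaffolding I meant.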
I wrote a bit about it in this comment.