How many hours per week should the average AI alignment researcher spend on improving their rationality? How should they spend those hours?
I want to know the answer to this question, but for the ‘peak’ alignment researcher.
My answer isn’t very sensitive to things like “how good are you at research” (I didn’t even spell out its sensitivity to “how much do you like reflecting” or “how old are you”, which I think matter more). I guess the first-order consideration is that the ‘peak’ alignment researcher is more likely to be older and closer to death, and so should invest somewhat less in getting better at things. (But the world changes and lives are long, so I’m not sure it’s a huge deal.)
I probably wouldn’t set aside hours specifically for improving rationality (and I’m not exactly sure what that would entail). It seems generally good to go out of your way to do things right, to reflect on lessons learned from the things you did, to be willing to do (and slightly overinvest in) things that are currently hard in order to get better, and so on. Maybe I’d say that something like 5-10% of time should be explicitly set aside for activities that don’t directly move you forward (like post-mortems, or reflecting on how things are going in a way that clearly isn’t going to pay off within the current project), and a further 10-20% for doing things in ways that aren’t optimal right now but are useful for getting better at them in the future (e.g. using unfamiliar tools, getting more advice from people than would make sense if the world ended next week, being more methodical about how you approach problems).
I guess the other aspect of this is separating some kind of general improvement from more domain-specific improvement (i.e. are the numbers above about improving rationality, or just about getting better at doing stuff?). I think stuff that feels vaguely like “rationality,” in the sense of being about cognitive practices, is most likely to always seem pretty tied up with the object level (even if it transfers), and the purely domain-general stuff is very likely to be about e.g. very general tools or a nicer chair or whatever. So maybe I don’t think there’s much improvement on the table that is about fully domain-general ways to think, or that is best approached by starting from general principles rather than by getting better at what you are currently doing.
Those numbers are all very made up. I’m unfortunately not an expert at being an excellent human. Over my whole career I’ve maybe averaged something like that combined 15-30%, though there have been periods of significantly higher rates and periods of significantly lower rates, and I would have preferred to keep it more even.
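Since the original question asks for hours per week while the answer is given as fractions of working time, here is a minimal sketch of the arithmetic, assuming a nominal 40-hour work week (the 40 hours is an assumption for illustration, not something stated above):

```python
# Illustrative arithmetic only: convert the suggested time fractions into
# hours per week, assuming a nominal 40-hour work week (an assumption, not
# something specified in the answer above).
HOURS_PER_WEEK = 40

reflection_share = (0.05, 0.10)  # pure reflection: post-mortems, reviewing how things are going
practice_share = (0.10, 0.20)    # doing things sub-optimally now in order to get better later

def to_hours(share_range):
    """Convert a (low, high) fraction range into hours per week."""
    return tuple(round(HOURS_PER_WEEK * s, 1) for s in share_range)

reflection_hours = to_hours(reflection_share)
practice_hours = to_hours(practice_share)
total_hours = (reflection_hours[0] + practice_hours[0],
               reflection_hours[1] + practice_hours[1])

print(f"Reflection: {reflection_hours[0]}-{reflection_hours[1]} hours/week")
print(f"Deliberate practice: {practice_hours[0]}-{practice_hours[1]} hours/week")
print(f"Combined: {total_hours[0]}-{total_hours[1]} hours/week")  # ~6-12 hours at 15-30%
```

Under that assumed 40-hour week, the combined 15-30% works out to roughly 6-12 hours per week.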