I’m curious exactly what you meant by “first order”.
Just that the trade-off is only present if you think of “individual rationality” as “let’s forget that I’m part of a community for a moment”. All things considered, there’s just rationality, and you should do what’s optimal.
First-order: Everyone thinks that maximizing insight production means doing IDA* over the idea tree. Second-order: Everyone notices that everyone will think that, so it’s no longer optimal for maximizing insights produced overall. Everyone wants to coordinate with everyone else in order to parallelize their search (assuming they care about the total sum of insights produced). You can still do something like IDA* over your sub-branches.
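For concreteness, here is a minimal, generic IDA* sketch. The `is_goal`, `children`, `cost`, and `heuristic` callables are hypothetical stand-ins for whatever "promisingness" would mean on an idea tree; nothing here is a procedure anyone actually runs.

```python
# Minimal IDA* (iterative deepening A*) over a generic tree. The callables
# passed in are hypothetical placeholders, not part of any real workflow.
def ida_star(root, is_goal, children, cost, heuristic):
    """Depth-first search bounded by f = g + h; after each failed pass,
    raise the bound to the smallest f-value that exceeded it."""
    def search(node, g, bound, path):
        f = g + heuristic(node)
        if f > bound:
            return f, None                  # overshoot: candidate for the next bound
        if is_goal(node):
            return f, path
        smallest_overshoot = float("inf")
        for child in children(node):
            t, found = search(child, g + cost(node, child), bound, path + [child])
            if found is not None:
                return t, found
            smallest_overshoot = min(smallest_overshoot, t)
        return smallest_overshoot, None

    bound = heuristic(root)
    while True:
        bound, found = search(root, 0, bound, [root])
        if found is not None or bound == float("inf"):
            return found                    # a path to a goal node, or None
```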
This may have answered some of your other questions. Assuming you care about the alignment problem being solved, maximizing your expected counterfactual thinking-contribution means you should coordinate with your research community.
And, as you note, maximizing personal credit is a separate, unaligned objective. But if we’re all motivated by credit, our coordination can break down when people defect to grab credit.
How much should you focus on reading what other people do, vs doing your own things?
This is not yet at a practical level, but: let’s say we want to approach something like a community-wide optimal trade-off between exploring and exploiting, and we can’t trivially check what everyone else is up to. If we think the optimum is something obviously silly like “75% of researchers should Explore, and the rest should Exploit,” and I predict that 50% of researchers will follow the rule I follow while all the uncoordinated researchers will Explore anyway, then it is rational for me to randomize my decision with a coinflip: the uncoordinated half contributes 50 percentage points of Explore, and a fair coin among the rule-followers contributes the remaining 25.
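As a toy illustration of that arithmetic (the function name and numbers are just the made-up example above, not a real procedure), a sketch:

```python
# Hypothetical sketch of the coordination arithmetic in the toy example above.
def coordinated_explore_prob(target_explore, coordinated_frac, uncoordinated_explore):
    """Probability with which each rule-following researcher should Explore
    so the community as a whole hits `target_explore`.

    target_explore        -- desired fraction of the community exploring (e.g. 0.75)
    coordinated_frac      -- predicted fraction following the same rule (e.g. 0.5)
    uncoordinated_explore -- fraction of the uncoordinated researchers who Explore
    """
    uncoordinated_frac = 1.0 - coordinated_frac
    # target = uncoordinated_frac * uncoordinated_explore + coordinated_frac * p
    p = (target_explore - uncoordinated_frac * uncoordinated_explore) / coordinated_frac
    return min(1.0, max(0.0, p))  # clip to a valid probability

# The example above: 75% should Explore, half the community follows the rule,
# and the uncoordinated half all Explore -> each rule-follower flips a fair coin.
print(coordinated_explore_prob(0.75, 0.5, 1.0))  # 0.5
```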
It gets Newcomblike when I can’t check, but I can still follow a mix that’s optimal given the expected number of cooperating researchers and what I predict they will predict in turn. If our predictions are similar, the optimum given those predictions is a Schelling point. Of course, in the real world, if you actually had important practical strategies for optimizing community-level research, you would just write them up and get everyone to coordinate that way.
I worry for people who are only reading other people’s work, like they have to “catch up” to everyone else before they have any original thoughts of their own.
You touch on many things I care about. Part (not the main part) of why I want people to prioritize searching neglected nodes is that Einstellung is real. Once you’ve got a tool in your brain, you’re not going to know how not to use it, and it’ll be harder to think of alternatives. You want to increase your chance of acquiring neglected tools and perspectives with which to attack long-standing open problems. After all, if the usual tools were sufficient, why are those problems still open? If you diverge from the most common learning paths early, you’re more likely to end up with a productively different perspective.
It’s too easy to misunderstand the original purpose of the question, and do work that technically satisfies it but really doesn’t do what was wanted in a broader context.
I’ve taken to calling this “bandwidth”, cf. Owen Cotton-Barratt.