TL;DR Having a good research track record is some evidence of good big-picture takes, but it’s weak evidence. Strategic thinking is hard, and requires different skills. But people often conflate these skills, leading to excessive deference to researchers in the field without evidence that the person in question is good at strategic thinking specifically. I certainly try to have good strategic takes, but it’s hard, and you shouldn’t assume I succeed!
Introduction
I often find myself giving talks or Q&As about mechanistic interpretability research. But inevitably, I’ll get questions about the big picture: “What’s the theory of change for interpretability?”, “Is this really going to help with alignment?”, “Does any of this matter if we can’t ensure all labs take alignment seriously?”. And I think people take my answers to these way too seriously.
These are great questions, and I’m happy to try answering them. But I’ve noticed a bit of a pathology: people seem to assume that because I’m (hopefully!) good at the research, I’m automatically well-qualified to answer these broader strategic questions. I think this is a mistake, a form of undue deference that is both incorrect and unhelpful. I certainly try to have good strategic takes, and I think this makes me better at my job, but this is far from sufficient. Being good at research and being good at high-level strategic thinking are just fairly different skillsets!
But isn’t someone being good at research strong evidence they’re also good at strategic thinking? I personally think it’s moderate evidence, but far from sufficient. One key factor is that a very hard part of strategic thinking is the lack of feedback. Your reasoning about confusing long-term factors needs to extrapolate from past trends and make analogies from things you do understand better, and it can be quite hard to tell whether what you’re saying is complete bullshit or not. In an empirical science like mechanistic interpretability, however, you can get a lot more feedback. I think there’s a certain kind of researcher who thrives in environments where they can get lots of feedback, but performs much worse in domains without it, where they e.g. form bad takes about the strategic picture and never correct them because there’s never enough evidence to convince them otherwise. Being good at something in the absence of good feedback is a much harder and rarer skill.
Having good strategic takes is hard, especially in a field as complex and uncertain as AGI Safety. It requires clear thinking about deeply conceptual issues, in a space where there are many confident yet contradictory takes, and a lot of superficially compelling yet simplistic models. So what does it take?
Factors of Good Strategic Takes
As discussed above, the ability to think clearly about thorny issues is crucial, and is a rare skill that is only somewhat used in empirical research. Lots of research projects I do feel more like plucking the low-hanging fruit. I do think someone doing ground-breaking research is better evidence here, like Chris Olah’s original circuits work, especially if done multiple times (once could just be luck!). Though even then, it’s evidence of the ability to correctly pursue ambitious research goals, but not necessarily to identify which ones will actually matter come AGI.
Domain knowledge of the research area is important. However, the key thing is not necessarily deep technical knowledge, but rather enough competence to tell when you’re saying something deeply confused. Or at the very least, enough ready access to experts that you can calibrate yourself. You also need some sense of what the technique is likely to eventually be capable of and what limitations it will face.
But you don’t necessarily need deep knowledge of all the recent papers so you can combine all the latest tricks. Being good at writing inference code efficiently or iterating quickly in a Colab notebook—these skills are crucial to research but just aren’t that relevant to strategic thinking, except insofar as they potentially build intuitions.
Time spent thinking about the issue definitely helps, and correlates with research experience. Having my day job be hanging out with other people who think about the AGI safety problem is super useful. Though note that people’s opinions are often substantially reflections of the people they speak to most, rather than what’s actually true.
It’s also useful to just know what people in the field believe, so I can present an aggregate view—this is something where deferring to experienced researchers makes sense.
I think there’s also diverse domain expertise that’s needed for good strategic takes but isn’t needed for good research takes, and that most researchers (including me) haven’t been selected for having, e.g.:
A good understanding of what the capabilities and psychology of future AI will look like
Economic and political situations likely to surround AI development—e.g. will there be a Manhattan project for AGI?
What kind of solutions are likely to be implemented by labs and governments – e.g. how much willingness will there be to pay an alignment tax?
The economic situation determining which labs are likely to get there first
Whether it’s sensible to reason about AGI in terms of who gets there first, or as a staggered multi-polar thing where there’s no singular “this person has reached AGI and it’s all over” moment
The comparative likelihood for x-risk to come from loss of control, misuse, accidents, structural risks, all of the above, something we’re totally missing, etc.
And many, many more
Conclusion
Having good strategic takes is important, and I think that researchers, especially those in research leadership positions, should spend a fair amount of time trying to cultivate them, and I’m trying to do this myself. But regardless of the amount of effort, there is a certain amount of skill required to be good at this, and people vary a lot in this skill.
Going forwards, if you hear someone’s take about the strategic picture, please ask yourself, “What evidence do I have that this person is actually good at the skill of strategic takes?” And don’t just equate this with them having written some impressive papers!
Practically, I recommend just trying to learn about lots of people’s views, aim for deep and nuanced understanding of them (to the point that you can argue them coherently to someone else), and trying to reach some kind of overall aggregated perspective. Trying to form your own views can also be valuable, though I think also somewhat overrated.
Thanks to Jemima Jones for poking me to take agency and write a blog post for the first time in forever.
Strong upvote (both as object-level support and for setting a valuable precedent) for doing the quite difficult thing of saying “You should see me as less expert in some important areas than you currently do.”
Nice post! As someone who spends a lot of time in AI policy on strategic thought and talking to people who I think are amongst the best strategic thinkers on AI, I appreciated this piece and think you generally describe the skills pretty well.
However, you say “research” skill by default does not lead to strategic skill, which is very true, but this varies drastically depending on the type of research! Mechanistic interpretability, in fact, appears to me to be an example of a field that is so in-the-weeds empirical, with such good feedback loops, that it is much harder for researchers in this field to learn better strategic thinking. Other research fields with slower feedback loops are different—for example, societal impacts of AI research. More broadly, I think many fields of social science train strategic skill well, and some of the best political science thinkers clearly have significant strategic skill: Fukuyama, James C. Scott, Dominic Cummings, etc.
I made an attempt to brainstorm ways to evaluate strategic skill based on the abilities of the best thinkers I know, and came up with a list of characteristics I think it is correlated with:
awareness of both the state of and fundamental principles behind politics and power. A recent example I was discussing with friends: if you read the transcript of the leaked Yemen Signal group chat and could immediately write a 2-page doc with likely guesses of the power dynamics between the participants implied by their communication and roles, you probably have some of this awareness.
scenario-based thinking: the ability to cleanly split the future into distinct scenarios, reason about each independently, effectively simulating the different actors involved (which makes me wonder how much one could accelerate this with LLMs, c.f. simulator theory), before comparing different scenarios based on a small number of key drivers.
zooming in and zooming out: an ability to not get lost in the details, and put their focus where it is useful, whilst still being aware of the details or having the skill to dig in if necessary.
generalist knowledge: their skills are usually T-shaped, rather than I-shaped. They may have read widely across law/econ/political theory/philosophy/sociology/international relations/evo psych or may have backgrounds in different worlds (finance, startup ecosystem, defense contracting/intelligence community, etc). This generalism helps in many ways, but one of the biggest is simply that because much of the thinking in any close intellectual community is highly correlated, since everyone reads the same things, you need some source of fresh signal to come up with significant new insights.
an aptitude for the skill of translation, meaning translation of information and beliefs between frames/perspectives/ways of thinking—something that heavily overlaps with an ability to shape and tell believable narratives. This is the core skill you need to explain, for example, why compute is a useful lever for governing AI to politicians who don’t know what a GPU is, but is also very useful to consider and discard lenses & framings on strategic problems until you can find the one which most cleanly carves up the situation.
an ability, similar to startup founders, to tolerate everyone thinking they are wrong (correlated with disagreeableness). Often, an exceptional strategic insight backchains to actions which look like they have a weak theory of change to everyone else, which means that good thinkers are often in the position of having to explain themselves.
context: having both the ability (through reading widely) and position (i.e. access to the thinking of significant figures in the field) to be high context on the big strategic problems. Context is that which is scarce (Cowen). I say context instead of seniority, because they are not quite the same, although strongly correlated—I occasionally meet junior people who are clearly unusually high context for their experience level and have good strategic takes, which is one of the ways I spot promising talent.
Finally, I do notice a lot of those I think have the best strategic thought often use lenses and framings inspired by systems thinking, social evolution/selection processes, memetics, biology, and other similar ways of viewing society and human behavior.
Interesting, thanks for the list. That seemed like a pretty reasonable breakdown to me. I think mechanistic interpretability does train some of them, in particular two, three, and maybe six. But I agree that skills involving thinking about society, politics, power, economics, etc. as a whole do seem clearly more relevant.
One major concern I have is that it’s hard to judge skill in domains with worse feedback loops, because there is no feedback on who is correct. I’m curious how confident you are in your assessment of who has good takes or is good in these fields, and how you determine this?
I guess that’s the main element I didn’t mention: many people on this forum would suggest judging via predictive skill/forecasting success. I think this is an ok heuristic, but of course the long time horizons involved in many strategic questions makes it hard to judge (and Tetlock has documented the problems with forecasting over long time horizons where these questions matter most).
Mostly, the people I think of as having strong strategic skill are closely linked to some political influence (which implicitly requires this skill to effect change) such as attaining a senior govt position, being influential over the Biden EO/export controls, UK govt AI efforts, etc. Alternatively, they are linked to some big major idea in governance or technical safety, often by spotting something missing years before it became relevant.
Often by interacting regularly with good thinkers you can get a sense that they have stronger mental models for trends and the levers controlling trends than others, but concrete judgement is sometimes extremely difficult until a key event has passed and we can judge in hindsight (especially about very high level trends such as Mearsheimer’s disputed take on the causes of the Ukraine invasion, Fukuyama’s infamous “end of history” prediction, or even Pinker’s “Better Angels of Our Nature” predictions about continually declining global conflict).
Political influence seems a very different skill to me? Lots of very influential politicians have been very incompetent in other real-world ways.
This is just a special case (and an unusually important one) of a good forecasting record, right?
I suppose I mean influence over politics, policy, or governance (this is very high level since these are all distinct and separable), rather than actually being political necessarily. I do think there are some common skills, but actually being a politician weighs so many other factors more heavily that the strategic skill is not selected on very strongly at all. Being a politician’s advisor, on the other hand...
Yes, it’s a special case, but importantly one that is not evaluated by Brier score or Manifold bucks.
A few points:
Knowing a research field well makes it easier to assess how much other people know about it. For example, if you know ML, you sometimes notice that someone clearly doesn’t know what they’re talking about (or conversely, you become impressed by the fact that they clearly do know what they’re talking about). This is helpful when deciding who to defer to.
If you are a prominent researcher, you get more access to confidential/sensitive information and the time of prestigious people. This is true regardless of whether your strategic takes are good, and generally improves your strategic takes.
One downside is that people try harder to convince you of stuff. I think that being a more prominent researcher is probably overall net positive despite this effect.
IMO, one way of getting a sense of whether someone’s strategic takes are good is to ask them whether they try hard to have strategic takes. A lot of people will openly tell you that they don’t focus on that, which makes it easier for you to avoid deferring to random half-baked strategic takes that they say without expecting anyone to take them too seriously.
Curated. I think this is a pretty important point. I appreciate Neel’s willingness to use himself as an example.
I do think this leaves us with the important followup questions of “okay, but, how actually DO we evaluate strategic takes?”. A lot of people who are in a position to have demonstrated some kind of strategic awareness are people who are also some kind of “player” on the gameboard with an agenda, which means you can’t necessarily take their statements at face value as an epistemic claim.
Thanks!
Yeah, I don’t have a great answer to this one. I’m mostly trying to convey the spirit of: we’re all quite confused, and the people who seem competent disagree a lot, so they can’t actually be that correct. And given that the ground truth is confusion, it is epistemically healthier to be aware of this.
Actually solving these problems is way harder! I haven’t found a much better substitute than looking at people who have a good non-trivial track record of predictions, and people who have what to me seem like coherent models of the world that make legitimate and correct-seeming predictions. Though the latter one is fuzzier and has a lot more false positives. A particularly salient form of a good track record is people who had positions in domains I know well (e.g. interpretability) that I previously thought were wrong/ridiculous, but who I later decided were right (e.g. I give Buck decent points here, and also a fair amount of points to Chris Olah).
Also, if you’re asking a panel of people, even those skilled at strategic thinking will still be useless unless they’ve thought deeply about the particular question or adjacent ones. And skilled strategic thinkers can get outdated quickly if they haven’t thought seriously about the problem in a while.
I’m not sure I agree with that one. I think that if someone has thought a bunch about the general topic of AI and has a bunch of useful takes, they can probably convert this on the fly into something somewhat useful, even if it’s not as reliable as it would be if they’d spent a long time thinking about it. For example, I think I can give useful technical mechanistic interpretability takes even if the question is about topics I’ve not spent much time thinking about before.
Yeah, there’s generalization, but I do think that e.g. (AGI technical alignment strategy, AGI lab and government strategy, AI welfare, AGI capabilities strategy) are sufficiently different that experts at one will be significantly behind experts on the others.
Thanks for writing this post. I agree with the sentiment but feel it important to highlight that it is inevitable that people assume you have good strategy takes.
In Monty Python’s “Life of Brian” there is a scene in which the titular character finds himself surrounded by a mob of people declaring him the Messiah. Brian rejects this label and flees into the desert, only to find himself standing in a shallow hole, surrounded by adherents. They declare that his reluctance to accept the title is further evidence that he really is the Messiah.
To my knowledge nobody thinks that you are the literal Messiah, but plenty of people going into AI Safety are heavily influenced by your research agenda. You work at DeepMind and have mentored a sizeable number of new researchers through MATS. 80,000 Hours lists you as an example of someone with a successful career in Technical Alignment research.
To some, the fact that you request people not to blindly trust your strategic judgement is evidence that you are humble, grounded and pragmatic, all good reasons to trust your strategic judgement.
It is inevitable that people will view your views on the Theory of Change for Interpretability as authoritative. You could literally repeat this post verbatim at the end of every single AI safety/interpretability talk you give, and some portion of junior researchers will still leave the talk deferring to your strategic judgement.
Yes, I agree. It’s very annoying for general epistemics (though obviously pragmatically useful to me in various ways if people respect my opinion)
Though, to be clear, my main goal in writing this post was not to request that people defer less to me specifically, but more to make the general point that people should defer more intelligently, using myself as an example so as to avoid calling any specific person out.
Consistently give terrible strategic takes, so people learn not to defer to you.
Hey Neel, I’ve heard you make similar remarks informally at talks or during Q&A sessions in past in-person panels and events, and it’s great that you’ve written them up so that they’re available in a nuanced format to a broader audience. I agree with the points you’ve made, but have a slightly different perspective on how it connects to the example of people asking for your strategic takes specifically, which I’ll share below (without presumption).
TL;DR: “Good strategic takes are hard to measure; but status is easy to recognize”.
I. Executive Summary
People aren’t necessarily confusing research prowess with strategic insight. Rather, they recognize you as having achieved elite social standing within the field of AI more broadly and want:
Access to perceived insider knowledge
Connection to high-status individuals
The latest thinking from those “in the room”
II. Always Use The Best Introduction
Before reading this post, I believed that the median person asking these questions was motivated by your impressive academic performance during your undergraduate studies, something that can be (over)simplified to “wow, this guy studied pure math at Cambridge and ranked top of his class, he’s one of the smartest people in the world, and smart people are correct about lots of things, he might have a correct answer to this question I have!”. I’m quite embarrassed to admit that this is pretty much what was going through my head when I attended a session you were holding during EAG last year, and I wouldn’t be surprised if others there were thinking that too.
Similarly along those lines, I recall reaching out to one of your former mentees for a 1:1 thinking, “wow, this guy studied computer science at Cambridge and ranked top of his class, he’s one of the smartest people in the world, and smart people are correct about lots of things!”. I also took the time to read his dissertation, and found it interesting, but that first impression mattered a lot more than it should have. An analogy is that when people are selecting the model to use for a task, they want to use the best model for that task. But if a model takes the top spot on the leaderboard where test scores are easy to measure, then that tends to mess with human psychology which irrationally pattern matches and assumes generalization across every possible task.
III. My Key Takeaway:
My key takeaway was that although this winner-take-all dynamic may have been one factor, your model assigns more weight to the work you’ve done after graduating, pioneering the field of mechinterp.
IV. Credentials vs. Accomplishments
To be clear, founding mechinterp is a greater accomplishment than any formal credential. But even though teams of researchers at frontier labs are working on this agenda, it’s not mainstream yet (just take a look at mechinterp.com), whereas the handle of “math/cs genius” is generic enough as a concept to be legible to the average person. The arguments in your post about research being an empirical science requiring skills not especially relevant to strategy are locally valid, but these points are the furthest thing from the mind of those waiting in line at conferences to ask what your p(doom) is.
V. The Tyranny of the Marginal Spice Jar
Often the demands placed upon us by our environment play an instrumental role in shaping our skillset, because we adapt against the pressures placed upon us. I’m thankfully not in a leadership position where the role calls for executive project management decisions which require a solid understanding of the broader field and industry. I’m also grateful that I’m not a public figure with a reputation to maintain whose every move is open to scrutiny and close examination. I also understand that blog posts aren’t meant to be epistemically bulletproof.
I think that it’s true that when the people you speak with the most (e.g work colleagues or MATS scholars) ask you about your thoughts, their respect is based on the merits of the technical research you’ve published. And in general, when anyone publishes great AI research, then that does inspire interest in that person’s AI takes.
VI. Unnecessarily Skippable Digression Into Social Bubbles and Selection Effects
Your social circle is heavily filtered by a competitive application process which strongly selects for predicted ability to do quality research. This can distort intuitions around the prevalence of certain traits which are not as well represented in the common population. For example, authoring code or research papers requires to some extent that your brain is adapted for processing text content, the implications of which I haven’t seen discussed in depth anywhere on lesswrong. If someone expresses a strong preference for reading above watching a video when both options are available, it’s almost like a secret handshake, because so many cracked engineers have told me this that it’s become a green flag. In this world, entertainment culture and information transfer happens from books, web novels, articles, etc.
There’s an entirely separate world occupied by someone with the opposite preference, i.e. wanting to watch a video above reading text when both options are available; an example secret handshake for that is when my Uber driver tells me that they’re cutting down on Instagram. I admit this is a shallow heuristic, but it’s become a red flag I watch out for, indicating a potential vulnerability to predatory social media dark patterns or television binge-watching. It’s not an issue of self-control: people in the first group need to apply cognitive effort to pick things up from videos, but might have difficulties setting aside an engaging fantasy web serial. Most treatment of this topic I’ve seen addresses the second group, which feels alienating to me, as if there’s this ongoing dimorphism between producers and users of consumer software.
I’m typically skeptical of “high IQ bubble” typed arguments since they tend to prove too much, so I’ll make a more specific point. I agree with you that within these groups, conflation between perceived research skills and strategic skill does occur. My (minor) contention is that I don’t think that this particular mistake is the one being made by the average person asking a speaker about their strategic takes at the end of a talk.
VII. Main Argument: Research Takes?! What Research Takes?!!
Like, these sorts of questions aren’t just being fielded by researchers in the field, you know. Why do people ask random celebrities and movie stars about their takes on geopolitics? Are they genuinely conflating acting skill with strategic skill? What about pro athletes? Is physical skill being conflated with strategic skill too? Do you believe that if a rich heiress with no research background were giving a talk about AI risk, no one in the audience would be interested in her big picture takes? It makes no sense. Other comments have pointed this out already, so I’m sorry about adding another rant to the pile, but there exists a simpler explanation which does a better job of tracking reality!
The missing ingredient here is clout.
Various essays go into the relationship between competence and power, but what you’re describing as “research skill” can be renamed expertise. These folks aren’t mistaking you for someone high in “strategic skill”; instead they are making the correct inference that you are an elite. They want in on the latest gossip behind the waitlist at the exclusive private social where frontier lab employees are joking around about what name they’ll use for tomorrow’s new model. They’re holding their breath waiting for invention and hyperstition and self-fulfilling prophecy. They want to know the story of how Elon Musk will save the U.S. AISI and call it xAISI.
IX. Concluding Apologetic Remarks
I’m not sure if this was an aim for the above post, but it’s an understandable impulse to want to distance oneself from scenes where it’s easier to find elites (good strategic takes) than experts (good research takes), because there can be a certain culture attached which often fails to act in a way that consistently upholds virtuous truth-seeking.
Overall, I think that taking a public stance can warp the landscape being described in ways that are hard to predict, and I appreciate your approach here compared to the influencer extreme of “my strategic takes are all great, the best, and bigly” versus the corporate extreme of “oh there are so many great takes, how could I pick one, great takes, thanks all”. The position of “yeah I’ve got takes but chill, they’re mid” is a reasonable midpoint, and it would be nice to have people defer more intelligently in general.
Excellent points on the distinct skillset needed for strategy, Neel. Tackling the strategic layer, especially concerning societal dynamics under ASI influence where feedback is poor, is indeed critical and distinct from technical research.
Applying strategic thinking beyond purely technical alignment, I focused on how societal structure itself impacts the risks and stability of long-term human-ASI coexistence. My attempt to design a societal framework aimed at mitigating those risks resulted in the model described in my post, Proposal for a Post-Labor Societal Structure to Mitigate ASI Risks: The ‘Game Culture Civilization’ (GCC) Model
Whether the strategic choices and reasoning within that model hold up to scrutiny is exactly the kind of difficult evaluation your post calls for. Feedback focused on the strategic aspects (the assumptions, the proposed mechanisms for altering incentives, the potential second-order effects, etc.), as distinct from just the technical feasibility, would be very welcome and relevant to this discussion on evaluating strategic takes.
Nassim Nicholas Taleb argues for similar ideas in his book “The Black Swan”. As a former Wall Street trader, his thesis is that people making good decisions under uncertainty must have “skin in the game”, i.e. quantitative modeling is insufficient. This suggests researchers can and should support the stakeholders who are the decision-makers.
Whilst the title is true, I don’t think that it adds much as, for most people, the authority of a researcher is probably as good as it gets. Even other researchers are probably not able to reliably tell who is or is not a good strategic thinker, so, for a layperson, there is no realistic alternative than to take the researcher seriously.
(IMHO a good proxy for strategic thinking is the ability to clearly communicate to a lay audience.)
I think the correct question is how much of an update you should make in an absolute sense rather than a relative sense. Many people in this community are overconfident, and if you decide that every person is less worth listening to than you thought, this doesn’t change who you listen to, but it should make you a lot more uncertain in your beliefs.
Strong upvote. Slightly worried by the fact that this wasn’t written, in some form, earlier (maybe I missed a similar older post?)
I think we[1] can, and should, go even further:
-Find a systematic/methodical way of identifying which people are really good at strategic thinking, and help them use their skills in relevant work; maybe try to hire from outside the usual recruitment pools.
-If deemed feasible (in a short enough amount of time): train some people mainly on strategy, so as to get a supply of better strategists.
-Encourage people to state their incompetence in some domains (except maybe in cases where it makes for bad PR) / embrace the idea of specialization and division of labour more: maybe high-level strategists don’t need as much expertise on the technical details, only the ability to see which phenomena matter (assuming domain experts are able to communicate well enough)
say, the people who care about preventing catastrophic events, in a broad sense
I completely agree on the importance of strategic thinking. Personally, I like to hear what early AI pioneers had to say about modeling AI. For example, Minsky’s Society of Mind. I believe the trend of AI must be informed by the development of epistemology, and I’ve basically bet my research on the idea that epistemological progress will shape AGI.
What do you mean by ‘must’? The word has two different meanings in this context, and it seems like bad epistemology not to distinguish them.
My use of “must” wasn’t just about technical necessity, but rather a philosophical or strategic imperative — that we ought to inform AGI not only through recent trends in deep learning (say, post-2014), but also by drawing from longer-standing academic traditions, like epistemic logic.