I agree that “buying time” isn’t a very useful category. Some thoughts on the things which seem to fall under the “buying time” category:
- Evaluations
  - I think people should mostly consider this a subcategory of technical alignment work, in particular the work of understanding models. The best evaluations will include work that’s pretty continuous with ML research more generally, like fine-tuning on novel tasks, developing new prompting techniques, and applying interpretability techniques.
- Governance work, some subcategories of which include:
  - Lab coordination: work on this should mainly be done in close consultation with people already working at big AI labs, in order to understand the relevant constraints and opportunities
  - Policy work: see standard resources on this
  - Various strands of technical work which are useful for the above
- Outreach
  - One way to contribute to outreach is doing logistics for outreach programs (like the AGI safety fundamentals course)
  - Another way is to directly engage with ML researchers
  - Both of these seem very different from “buying time”, or at least “outreach to persuade people to become alignment researchers” doesn’t seem very different from “outreach to buy time somehow”
Thomas Kuhn argues in The Structure of Scientific Revolutions that research fields which focus directly on real-world problems usually make little progress. In his view, a field like physics, where researchers in the 20th century focused on very theoretical questions, produced a lot of progress.
When physicists work to advance the field of physics, they look for experiments that can be done to increase the body of physics knowledge. A good portion of those experiments doesn’t have any apparent real-world impact, but they add to the accumulated knowledge of the field.
In contrast, a field like nutrition research tries to focus on questions that are directly relevant to how people eat. As a result, it doesn’t focus on the fundamental research that leads to accumulated knowledge.
Given that background, I think we want a good portion of alignment researchers not to be laser-focused on specific theories of change.
I think being driven by curiosity is different from being driven by a theory of change. Physicists are curious about figuring out any gaps in their models.
Feynman is also a good example. He hit a roadblock and couldn’t really think of how to work on any important problem, so he decided to just play around with the math of physical problems.
If someone is curious to solve some problem in the broad sphere of alignment, it might be bad if they have to think about whether or not that actually helps with specific theories of change.
On the funding level, a theory of change might apply: you give money to a researcher who is able to solve some unsolved problem because doing so progresses the field, even if you have no idea whether the problem is useful to solve. On the researcher level, it’s more about being driven by curiosity.
At the same time, it’s also possible for the curiosity-driven approach to lead a field astray. Virology is a good example: virologists are curious about problems they can approach with the molecular-biology toolkit but aren’t very curious about the problems you need to tackle to understand airborne transmission.
Virologists also do dangerous gain-of-function experiments that are driven by the curiosity to understand things about viruses but are separate from any theory of change.
A good stance when approaching them might be: “If you do dangerous experiments, then you need a theory of change”, while non-dangerous experiments driven by curiosity to understand viruses should get decent funding even without a theory of change. You also want some people who actually think about theories of change and say: “If we want to prevent a pandemic, understanding how virus transmission works is important, so we should probably fund experiments for that, even if they aren’t the experiments our curiosity drives us toward.”
Going back to the alignment question: you want to give decent funding to people who are curious about solving problems in the field, to the extent that solving those problems isn’t likely to result in huge capability increases. When it comes to working on problems that are about capability increases, you should require the researchers to have a clear theory of change. You also want some people to focus on a theory of change and seek out problems based on it.
Thinking too hard about whether what you are doing is actually helping can lead to getting nothing done at all, as Feynman experienced before he switched his approach.
It’s possible that at the present state of the AI safety field there are no clear roads to victory. If that’s the case, then you would want to grow the AI safety field by letting researchers expand it via curiosity-driven research.
The number of available researchers doesn’t tell you what’s actually required to make progress.