Beth, would METR be interested in tasks related to chemical engineering or materials science? For example, “build a thermodynamic or kinetic model of reactor X in process Y” or “design a new alloy for use in applications XYZ”. It was not clear to me if the bounty accommodates these kinds of tasks. Thank you!
I think these tasks would be in scope as long as they can be done fully digitally and relatively easily in text. (And as long as it’s possible to set up the task in a Docker container, etc.)
Quoting the desiderata from the post:
Plays to strengths of LLM agents: ideally, most of the tasks can be completed by writing code, using the command line, or other text-based interaction.
Yep, that’s right. We also need it to be possible to check the solutions without running physical experiments, etc.!