Love this. I’ve been thinking about related things in AI bio safety evals. Could we have an LLM walk a layperson through a complicated-but-safe wet lab protocol that approximately matches a dangerous protocol in difficulty? How good would that evidence be compared to running the actual dangerous protocol? Maybe you could at least cut eval costs by having a large subject group do the safe protocol, and only a small, carefully screened and supervised group go through the dangerous one.