Comment on your planned opinion: I mostly agree; I think what this means is that prosaic AI safety depends somewhat on an empirical premise: That joint training doesn’t bring a major competitiveness penalty. I guess I only disagree insofar as I’m a bit more skeptical of that premise. What does the current evidence on joint training say on the matter? I have no idea, but I am under the impression that you can’t just take an existing training process—such as the one that made AlphaStar—and mix in some training tasks from a completely different domain and expect it to work. This seems like evidence against the premise to me. As someone (Paul?) pointed out in the comments when I made this point, it applies to fine-tuning as well. But if so, that just means that the second and third ways of the dilemma are both uncompetitive, which means prosaic AI safety is uncompetitive in general.
prosaic AI safety depends somewhat on an empirical premise: That joint training doesn’t bring a major competitiveness penalty.
Yeah, this is why I said:
Of course, it may turn out that it takes a huge amount of resources to train the question answering system, making the system uncompetitive, but that seems hard to predict given our current knowledge.
you can’t just take an existing training process—such as the one that made AlphaStar—and mix in some training tasks from a completely different domain and expect it to work.
From a completely different domain, yeah, that probably won’t work well (though I’d still guess less than an order of magnitude slowdown). But as I understand it, the goal is to train a question-answering system that answers questions related to the domain; e.g., for StarCraft you might ask the model about the best way to counter a particular strategy, or why it deploys a particular kind of unit in a certain situation. This relies on the same underlying features / concepts as playing StarCraft well, and adding training tasks of this form can often improve performance, e.g. One Model To Learn Them All.
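To make this concrete, here is a minimal sketch of what mixing a question-answering task into an existing training loop could look like: a shared trunk feeds both the main policy head and an auxiliary QA head, and the two losses are summed. This is an illustrative assumption about how joint training might be wired up, not a description of AlphaStar’s actual pipeline; all names, dimensions, and the 0.1 auxiliary weight are made up.

```python
# Sketch of joint training: one shared trunk, two task heads.
# Hypothetical setup; nothing here is taken from a real system.
import torch
import torch.nn as nn

class JointModel(nn.Module):
    def __init__(self, obs_dim=128, hidden=256, n_actions=10, n_answers=50):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy_head = nn.Linear(hidden, n_actions)  # main task
        self.qa_head = nn.Linear(hidden, n_answers)      # auxiliary QA task

    def forward(self, obs):
        h = self.trunk(obs)  # shared features used by both heads
        return self.policy_head(h), self.qa_head(h)

model = JointModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# Dummy batch standing in for (observation, action label, QA answer) triples.
obs = torch.randn(32, 128)
action_targets = torch.randint(0, 10, (32,))
answer_targets = torch.randint(0, 50, (32,))

policy_logits, qa_logits = model(obs)
# The competitiveness premise is about how much the auxiliary term below
# hurts (or, per One Model To Learn Them All, sometimes helps) the main loss.
loss = ce(policy_logits, action_targets) + 0.1 * ce(qa_logits, answer_targets)
opt.zero_grad()
loss.backward()
opt.step()
```

Because the QA head reads the same trunk features the policy uses, the question reduces to exactly the empirical premise above: whether the auxiliary gradient degrades main-task performance for a fixed compute budget, or instead acts as a helpful auxiliary signal.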
Thanks! I endorse that summary.