But my main reason for posting is to ask this question: What is the most similar work that you know of?
It’s not tremendously similar, but for some reason I thought of the Diamond-Dybvig model of bank runs as a (distant) analogy. It has multiple equilibria: everyone might deposit & withdraw money as usual, or a bank run might kick off. The AI risk equivalent, I guess, would be a model where either every development team exercises optimal caution (whatever that would be), or every team rushes to be first. That said, I don’t know whether any realistic-ish model of AI development would have those particular equilibria.
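To make the "two equilibria" idea concrete, here is a toy sketch in Python. It is my own illustration, not the Diamond-Dybvig model itself or any published model of AI development: two teams each choose "cautious" or "rush", and the payoff numbers are made up purely so that the game has the stag-hunt structure described above.

```python
from itertools import product

ACTIONS = ("cautious", "rush")

# Hypothetical payoffs, keyed by (own action, other team's action).
# The numbers are assumptions chosen to give a coordination game;
# they are not taken from any paper.
PAYOFF = {
    ("cautious", "cautious"): 3,  # both careful: best joint outcome
    ("cautious", "rush"): 0,      # careful team loses the race
    ("rush", "cautious"): 2,      # rushing team wins, but with more risk
    ("rush", "rush"): 1,          # everyone rushes: risky for all
}


def is_nash(a, b):
    """True if neither team gains by unilaterally switching its action."""
    a_ok = all(PAYOFF[(a, b)] >= PAYOFF[(alt, b)] for alt in ACTIONS)
    b_ok = all(PAYOFF[(b, a)] >= PAYOFF[(alt, a)] for alt in ACTIONS)
    return a_ok and b_ok


def pure_nash_equilibria():
    """Enumerate all pure-strategy Nash equilibria of the 2x2 game."""
    return [(a, b) for a, b in product(ACTIONS, repeat=2) if is_nash(a, b)]


print(pure_nash_equilibria())
# [('cautious', 'cautious'), ('rush', 'rush')]
```

With these made-up payoffs, "everyone cautious" and "everyone rushes" are both self-reinforcing, which is the bank-run-style feature I had in mind. Whether realistic payoffs would look anything like this is exactly the open question.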
As for the FHI paper, I’m glad its abstract mentions the model’s prediction that more information can increase the risk. That’s a cute result.
I wonder what’d happen in a model that incorporates time passing over multiple rounds. The teams’ decisions in each round could expose information about their judgements of capabilities & risks. Might lead to an intractable model, though.