Joe Brenton comments on A Model-based Approach to AI Existential Risk

Joe Brenton 25 Aug 2023 12:14 UTC
LW: 6 AF: 3
Seems like this model has potential to drive resolution of this question on Manifold Markets to ‘yes’:
“Will Tyler Cowen agree that an ‘actual mathematical model’ for AI X-Risk has been developed by October 15, 2023?”
https://manifold.markets/JoeBrenton/will-tyler-cowen-agree-that-an-actu?r=Sm9lQnJlbnRvbg
- Sammy Martin 25 Aug 2023 13:00 UTC
  LW: 2 AF: 1
  ‘Oh, we’ve been writing up these concerns for 20 years and no one listens to us.’ My view is quite different. I put out a call and asked a lot of people I know, well-informed people, ‘Is there any actual mathematical model of this process of how the world is supposed to end?’ ... So, when it comes to AGI and existential risk, it turns out as best I can ascertain, in the 20 years or so we’ve been talking about this seriously, there isn’t a single model done.
  I think that MTAIR plausibly is a model of the ‘process of how the world is supposed to end’, in the sense that it runs through causal steps where each one is conditioned on the previous one (APS is developed; APS is misaligned; given misalignment, it causes damage on deployment; given damage, that damage is unrecoverable). For some of those inputs, your probabilities and uncertainty distributions could themselves come from a detailed causal model (e.g. you can look at the Direct Approach for the first two questions).
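  To make the "chain of conditional steps" idea concrete, here is a minimal sketch of how uncertainty over each step could be propagated to the final probability by Monte Carlo sampling. It is not MTAIR's actual implementation: the step names paraphrase the list above, and the Beta parameters, the `STEPS` table, and the function names are all hypothetical placeholders chosen only for illustration.

```python
import random
import statistics

# Hypothetical Beta(alpha, beta) parameters for each conditional step.
# These are placeholder numbers for illustration, not estimates from MTAIR.
STEPS = {
    "APS is developed": (4, 2),
    "APS is misaligned | developed": (2, 3),
    "causes damage on deployment | misaligned": (2, 4),
    "damage is unrecoverable | damage": (1, 4),
}


def sample_chain(rng):
    """Draw one probability per step and multiply them along the chain."""
    p = 1.0
    for alpha, beta in STEPS.values():
        p *= rng.betavariate(alpha, beta)
    return p


def simulate(n=10_000, seed=0):
    """Return n sorted samples of the end-to-end probability."""
    rng = random.Random(seed)
    return sorted(sample_chain(rng) for _ in range(n))


if __name__ == "__main__":
    samples = simulate()
    n = len(samples)
    print("mean:", round(statistics.fmean(samples), 4))
    print("5th percentile:", round(samples[int(0.05 * n)], 4))
    print("95th percentile:", round(samples[int(0.95 * n)], 4))
```

  Sampling each step independently is itself an assumption; a fuller model would let the later distributions depend on how the earlier steps turned out.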
  For the later questions, e.g. what’s the probability that an unaligned APS can inflict large disasters given that it is deployed, we can enumerate in detail the ways that it could happen, but to assess their probability you’d need to do a risk assessment with experts, not produce a mathematical model.
  E.g. you wouldn’t have a “mathematical model” of how likely a US-China war over Taiwan is; you’d do wargaming and ask experts or maybe superforecasters. Similarly, in the example that he gave, COVID, one part was a straightforward SEIR model and another part was more sociological, covering how the public response works (though of course a lot of the “behavioral science” then turned out to be wrong!).
  So a correct ‘mathematical model of the process’, if we’re being fair, would use explicit technical models for technical questions, and for sociological/political/wargaming questions you’d use other methods. I don’t think he’d say that there’s no ‘mathematical model’ of nuclear war just because, while we have mathematical models of how fission and fusion work, we don’t have any for how likely it is that e.g. Iran’s leadership decides to start building nuclear weapons.
  I think Tyler Cowen would accept that as sufficiently rigorous in that domain, and I believe that answers to the earlier, purely technical questions can be obtained from explicit models. One addition that could strengthen the model is to explicitly spell out different scenarios for each step (e.g. APS causes damage via autonomous weapons, economic disruption, etc.). But the core framework seems sufficient as is, and those concerns have also been explained in other places.
  What do you think?
  - Dave Orr 25 Aug 2023 16:16 UTC
    3 points
    I think that Tyler is thinking more of an economic-type model that looks at the incentives of various actors and uses them to understand what might go wrong and why. I predict that he would look at this model and say that “misaligned AI can cause catastrophes” is the hand-wavy bit that he would like to see an actual model of.
    I’m not an economist (is IANAE a known initialism yet?), but such a model would probably include things like the AI labs, the AIs, and potentially regulators or hackers/thieves, try to understand and model their incentives and behaviors, and see what comes out of that. It’s less about subjective probabilities from experts and more about trying to understand the forces acting on the players and how they respond to them.
    - Sammy Martin 26 Aug 2023 18:12 UTC
      2 points
      I guess it is down to Tyler’s personal opinion, but would he accept asking IR and defense policy experts about the chance of a war with China as an acceptable strategy, or would he insist on mathematical models of their behaviors and responses? To me, a mathematical model is clearly the wrong tool there, just as in the climate impacts literature we can’t get economic models of e.g. how governments might respond to waves of climate refugees but can consult experts on it.
      - Dave Orr 26 Aug 2023 18:17 UTC
        4 points
        I just think that to an economist, models and survey results are different things, and he’s not asking for the latter.