Good point re 2. Re 1: meh, it still seems like a meta-argument to me, because when I roll out my mental simulations of the ways the future could go, it really does seem like my 'If…' condition obtaining would cut out about half of the loss-of-control ones.
Re 3: point by point:
1. AISIs existing vs. not: less important; I feel like this changes my p(doom) by more like 10-20% rather than 50%.
2. Big names coming out: idk, this also feels like maybe 10-20% rather than 50%.
3. I think Anthropic winning the race would be a 40% thing maybe, but being a runner-up doesn't help so much. But yeah, p(anthropicwins) has gradually gone up over the last three years...
4. Trump winning seems like a smaller deal to me.
5. Ditto for Elon.
6. Not sure how to think about logical updates, but yeah, probably this should have swung my credence around more than it did.
7. ? This was on the mainline path, basically, and it happened roughly on schedule.
8. Takeoff speeds matter a ton; I've made various updates, but nothing big and confident enough to swing my credence by 50% or anywhere close. Hmm. But yeah, I agree that takeoff speeds matter more.
9. Picture here hasn't changed much in three years.
10. Ditto.
OK, so I think I directionally agree that my p(doom) should have been oscillating more than it in fact did over the last three years (if I take my own estimates seriously). However, I don't go nearly as far as you: most of the things you listed are either (a) imo less important, or (b) things I didn't actually change my mind about over the last three years, so even though they are very important, my p(doom) shouldn't have been changing much on their account.
But IMO the easiest way for safety cases to become the industry-standard thing is for AISI (or internal safety factions) to specifically demand them, and then the labs produce them, but kinda begrudgingly, and don't really take them seriously internally (or are literally not the sort of organizations capable of taking them seriously internally, e.g. due to too much bureaucracy). And that seems very much like the sort of change that's comparable to or smaller than the things above.
I agree with everything except the last sentence. My claim took this into account: I was specifically imagining something like this playing out and thinking 'yep, seems like this kills about half of the loss-of-control worlds.'
I think I would be more sympathetic to your view if the claim were "if AI labs really reoriented themselves to take these AI safety cases as seriously as they take, say, being in the lead or making profit". That would probably halve my P(doom); it's just a very, very strong criterion.
I agree that's a stronger claim than I was making. However, part of my view here is that the weaker claim I did make has a good chance of eventually making the stronger claim true: if a company was getting close to AGI, and it had published its safety case a year before, and that case was gradually being critiqued and iterated on, perhaps public pressure and pressure from the scientific community would build to make it actually good. (Or, more optimistically, perhaps the people in the company would start to take it more seriously once they got feedback from the scientific community about it, so it started to feel more real and more like a real part of their jobs.)
Anyhow, bottom line: I won't stick to my 50% claim; maybe I'll moderate it down to 25% or something.
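(To make the arithmetic here explicit: "cuts out about half of the loss-of-control worlds" only translates into "halves p(doom)" to the extent that loss-of-control worlds make up most of the doom probability mass. A minimal sketch, using purely hypothetical placeholder numbers rather than anyone's actual estimates:)

```python
# Illustrative arithmetic only; the numbers are hypothetical placeholders,
# not estimates anyone in this thread actually stated.
p_doom = 0.50                # hypothetical overall p(doom)
loss_of_control_share = 0.8  # hypothetical fraction of doom worlds that are loss-of-control

# If the condition obtains and cuts out half of the loss-of-control worlds:
p_doom_after = p_doom * (1 - 0.5 * loss_of_control_share)
relative_reduction = 1 - p_doom_after / p_doom

print(f"p(doom): {p_doom:.2f} -> {p_doom_after:.2f}")   # 0.50 -> 0.30 with these numbers
print(f"relative reduction: {relative_reduction:.0%}")  # 40%; it approaches 50% only as the share approaches 1
```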