What do you mean “hugely edited”? What other things would you like us to change? If I were starting from scratch, I would of course write the post differently, but I don’t think it would be worth my time to make major post hoc edits; I would like to focus on follow-up posts.
Specifically, I wanted the edit to be a clarification that you only have a <0.1% probability on spontaneous scheming ending the world.
Well, I have <0.1% on spontaneous scheming, period. I suspect Nora is similar and just misspoke in that comment.
If it’s spontaneous then yeah, I don’t expect it to happen ~ever really. I was mainly thinking about cases where people intentionally train models to scheme.