Nice work, I’m looking forward to a full-fledged paper.
Pardon me if the following comments are inaccurate; they come from non-expert curiosity.
I really like the endeavor of estimating tail risk, which you say you are excited about as well. Yet it seems that you quickly reduce tail risk to catastrophe, so the problem collapses into a binary classification. Given the very low probability of triggering a catastrophe, that problem may be intractable. You may be turning it into a problem of critical-threshold detection, whereas tail risk is most likely not a binary switch.
Could you instead generalize the problem and consider risks that are not necessarily catastrophic? This might also find many more applications, since the topic is relevant to a large share of AI models, whereas aspiring superintelligences capable of designing doomsday machines remain a tiny proportion of applications. Finally, it would also simplify your empirical investigations.
That is, could you develop a general framework that includes X-risk but is not limited to it?