Edit, one day later: the structure seems good, but I’m very concerned that the thresholds for High and Critical risk in each category are way too high, such that e.g. a system could very plausibly kill everyone without reaching Critical in any category. See pp. 8–11. If so, that’s a fatal flaw for a framework like this. I’m interested in counterarguments; for now, praise mostly retracted; oops. I still prefer this to no RSP-y-thing, but I was expecting something stronger from OpenAI. I really hope they lower thresholds for the finished version of this framework.
My impression was that (other than Autonomy) High means “effective & professionally skilled human levels of ability at creating this type of risk” and Critical means “superhuman levels of ability at creating this type of risk”. I assume their rationale is that we already have a world containing plenty of people with human levels of ability to create risk, and we’re not dead yet. I think their threshold for High may be a bit too high on Persuasion, since it compares to very rare, really exceptional people (by “country-wide change agents” I assume they mean people like Nelson Mandela or Barack Obama): we don’t have a lot of those, especially not ones willing and able to work for anyone at O(cents) per thousand tokens. I’d have gone with a lower bar like “as persuasive as a skilled & capable professional negotiator, politician+speechwriter team, or opinion writer”: i.e. someone with charisma and a way with words, but not once-in-a-generation levels of charisma.
Added to the post:
My impression was that (other than Autonomy) High means “effective & professionally skilled human levels of ability at creating this type of risk” and Critical means “superhuman levels of ability at creating this type of risk”. I assume their rationale is that we already have a world containing plenty of people with human levels of ability to create risk, and we’re not dead yet. I think their threshold for High may be a bit too high on Persuasion, since it compares to very rare, really exceptional people (by “country-wide change agents” I assume they mean people like Nelson Mandela or Barack Obama): we don’t have a lot of those, especially not ones willing and able to work for anyone at O(cents) per thousand tokens. I’d have gone with a lower bar like “as persuasive as a skilled & capable professional negotiator, politician+speechwriter team, or opinion writer”: i.e. someone with charisma and a way with words, but not once-in-a-generation levels of charisma.