Why not! There are many, many questions that were not discussed here because I wanted to focus on the core part of the argument. But I agree that details and scenarios are important, even if I think they shouldn't change the basic picture depicted in the OP too much.
Here are some important questions that were deliberately omitted from the Q&A, for the sake of not including things that fluctuate too much in my head:
Would we react before the point of no return?
Where should we place the red line? Should this red line apply to labs?
Is this going to be exponential? Do we care?
What would it look like if we used a counter-agent that was human-aligned?
What can we do about it now concretely? Is KYC something we should advocate for?
Don't you think an AI capable of ARA would be superintelligent and take over anyway?
What are the short-term bad consequences of early ARA? What does the transition scenario look like?
Is it even possible to coordinate worldwide if we agree that we should?
How much human involvement will be needed in bootstrapping the first ARAs?
@Épiphanie Gédéon and I plan to write more about these in the future, but first it's necessary to discuss the basic picture a bit more.
Potentially unpopular take, but if you have the skillset to do so, I’d rather you just come up with simple/clear explanations for why ARA is dangerous, what implications this has for AI policy, present these ideas to policymakers, and iterate on your explanations as you start to see why people are confused.
Note also that in the US, the NTIA has been tasked with making recommendations about open-weight models. The deadline for official submissions has passed, but I'm pretty confident that if you had something you wanted them to know, you could just email it to them and they'd take a look. My impression is that they're broadly aware of extreme risks from certain kinds of open-sourcing, but might benefit from (a) clearer explanations of ARA threat models and (b) specific suggestions for what needs to be done.
Might be good to have a dialogue format with other people who agree/disagree, to flesh out scenarios and countermeasures.