If we don’t have such a clear distinction, then there’s not much that we can do, except ban AI or ML entirely (or maybe ban AI above a certain compute or optimization threshold), which seems like a non-starter.
Idk, if humanity as a whole could have a justified 90% confidence that AI above a certain compute threshold would kill us all, I think we could ban it entirely. Like, why on earth not? It’s in everybody’s interest to do so. (Note that this is not the case with climate change, where each actor’s interest is to keep emitting while everyone else stops emitting.)
This seems probably true even if it were 90% confidence that there is some threshold, which we don’t yet know, above which AI would kill us all. In that case I imagine something more like a direct ban on most people doing it, plus some research that very carefully explores where the threshold is.
This is only any use at all if governments can easily identify tractable research programs that actually contribute to AI safety, rather than just having “AI safety” as a cool tagline. I guess you imagine that this will be the case in the future? Or maybe you think it doesn’t matter if they fund a bunch of terrible, pointless research, as long as some “real” research also gets funded?
A common way this is done is to get experts to help allocate the funding, which seems reasonable, and probably better than the current mechanisms, Open Phil excepted (current mechanism = how well you can convince random donors to give you money).
What? It seems like this is only possible if the technical problem is solved and known to be solved. At that point, the problem is already solved.
In the world where the aligned version is not competitive, a government can unilaterally pay the price of not being competitive because it has many more resources.
Also, there are other problems you might care about, like how the AI system might be used. You may not be too happy if anyone can “buy” a superintelligent AI from the company that built it: this makes arbitrary humans generally more able to impact the world, and if you have a group of not-very-aligned agents making big changes to the world and possibly fighting with each other, things will plausibly go badly at some point.
Again, if there are existing, legible standards for what’s safe and what isn’t, this seems good. But without such standards, I don’t see how this helps.
Determining what is / isn’t safe seems decidedly easier than making an arbitrary agent safe; it feels like we will be able to be conservative about this. But this is mostly an intuition.
I think a general response to your intuition is that I don’t see technical solutions as the only options; there are other ways we could be safe (1, 2).
Cruxes:
We’re going to have clear, legible things that ensure safety (which might be “never build systems of this type”).
Governments are much more competent than you currently believe (I don’t know what you believe, but I probably think they are more competent than you do).
We have so little evidence / argument so far that model uncertainty alone means we can’t conclude “it is unimportant to think about how we could use the resources of the most powerful actors in the world”.