I mostly buy the position outlined in:
AGI ruin scenarios are likely (and disjunctive)
A central AI alignment problem: capabilities generalization, and the sharp left turn
Why all the fuss about recursive self-improvement?
Warning Shots Probably Wouldn’t Change The Picture Much
my current outlook on AI risk mitigation
No intensity of regulation that seems remotely plausible helps if we don't have an alignment method to mandate that holds up through recursive self-improvement. We don't have one, and it seems pretty likely that we won't get one in time. Humanity is almost certainly incapable of coordinating not to build such a powerful technology given the current strategic, cultural, and geopolitical landscape.
I think we might get lucky in one of a few directions, but the default outcome is doom.
Extreme regulation seems plausible if policy makers start to take the problem seriously. But no regulations will apply everywhere in the world.
I said conditional on it not being regulated. If it’s regulated, I suspect there’s an extremely high probability of doom.
How does this work?
There’s a discussion of this here. If you think I should write more on the subject I might devote more time to it; this seems like an extremely important point and one that isn’t widely acknowledged.
It looks like in that thread you never replied to the people saying they couldn’t follow your explanation. Specifically, what bad things could an AI regulator do that would increase the probability of doom?
Mandate specific architectures to be used because the government is more familiar with them, even if other architectures would be safer.
Mandate specific “alignment” protocols to be used that do not, in fact, make an AI safer or more legible, and divert resources to them that would otherwise have gone to actual alignment work.
Declare certain AI “biases” unacceptable, and force the use of AIs that do not display them. If some of these “biases” are in fact real patterns about the world, this could select for AIs with unpredictable blind spots and/or deceptive AIs.
Increase compliance costs such that fewer people are willing to work on alignment, and smaller teams might be forced out of the field entirely.
Subsidize unhelpful approaches to alignment, drawing in people more interested in making money than in actually solving the problem, increasing the noise-to-signal ratio.
Create licensing requirements that force researchers out of the field.
Create their own AI project under political administrators who have no understanding of alignment and no real interest in solving it, thereby producing AIs that have an unusually high probability of causing doom and an unusually low probability of producing useful alignment research and/or performing a pivotal act to reduce or end the risk.
Push research underground, reducing the ability of researchers to collaborate.
Push research into other jurisdictions with less of a culture of safety. E.g., DeepMind cares enough about alignment to try to quantify how hard a goal can be optimized before degenerate behavior emerges (a toy sketch of this overoptimization effect is below); if DeepMind is shut down and some other organization elsewhere takes the lead, that organization may well not share this priority.
This was just off the top of my head. In real life, regulation tends to cause problems that no one saw coming. The strongest counterargument here is that regulation should at least slow capabilities research down, buying more time for alignment. But regulators have neither the technical knowledge nor the actual desire to distinguish capabilities research from alignment research, and alignment research is much more fragile.
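To make the overoptimization point above a bit more concrete, here is a minimal toy sketch. It is my own illustration under assumed conditions, not DeepMind's (or anyone else's) actual methodology, and every name and parameter in it is made up for the example. Each candidate has a "true" quality drawn from a normal distribution, the optimizer only sees a proxy score equal to true quality plus measurement error, and picking the proxy-maximizing candidate from ever-larger pools stands in for applying more optimization pressure.

```python
import numpy as np

# Toy model (an illustrative assumption, not any lab's real setup):
# each candidate has a "true" quality ~ N(0, 1); the optimizer only sees a
# proxy score = true quality + measurement error. Selecting the
# proxy-maximizing candidate from a larger pool stands in for applying more
# optimization pressure to the proxy.

rng = np.random.default_rng(0)
TRIALS = 2000  # average over many repetitions to smooth out sampling noise

def mean_true_quality_of_pick(pool_size, error_sampler):
    picks = []
    for _ in range(TRIALS):
        true_quality = rng.normal(size=pool_size)
        proxy = true_quality + error_sampler(pool_size)
        picks.append(true_quality[np.argmax(proxy)])  # optimize the proxy as hard as we can
    return float(np.mean(picks))

print(f"{'pool size':>9}  {'light-tailed error':>18}  {'heavy-tailed error':>18}")
for pool_size in [3, 10, 100, 1_000, 10_000]:
    light = mean_true_quality_of_pick(pool_size, lambda n: rng.normal(size=n))
    heavy = mean_true_quality_of_pick(pool_size, lambda n: rng.standard_cauchy(n))
    print(f"{pool_size:>9}  {light:>18.3f}  {heavy:>18.3f}")
```

Under these assumptions, with light-tailed (normal) error the true quality of the pick keeps improving as the pool grows, while with heavy-tailed (Cauchy) error harder selection mostly rewards measurement error, so the true quality of the pick stops improving and drifts back toward the population average. The research question is which regime a given training objective is in, and how much optimization pressure it takes before you cross over.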
Thanks, some of those possibilities do seem quite risky and I hadn’t thought about them before.