Eliezer gives alignment a 0% chance of succeeding. I think policy, if tried seriously, has >50%. So it’s a giant opportunity that’s gotten way too little attention.
I’m optimistic about policy for big companies in particular. They have a lot to lose from breaking the law, they’re easy to inspect (because there are so few), and there’s lots of precedent (ITAR already covers some software). Right now, serious AI capabilities research just isn’t profitable outside of the big tech companies.
Voluntary compliance is also a very real thing. Lots of AI researchers are wealthy and high-status, and they’d have a lot to lose from breaking the law. At the very least, a law would stop them from publishing their research. A field like this also lends itself to undercover enforcement.
I think an agreement with China is impossible now, because prominent folks don’t even believe the threat exists. Two factors could change the art of the possible. First, if there were a widely known argument about the dangers of AI, on which most public intellectuals agreed. Second, since the US has a technological lead, an agreement could actually be to its advantage.
Look at gain of function research for the result of a government moratorium on research. At first Baric feared that the moratorium would end his research. Then the NIH declared that his research isn’t officially gain of function and continued funding him.
Regulating gain of function research away is essentially easy mode compared to AI.
A real Butlerian jihad would be much harder.
I agree that it’s hard, but there are all sorts of possible moves (like LessWrong folks choosing to work at this future regulatory agency, or putting massive amounts of lobbying funds into making sure the rules are strict).
If the alternative (solving alignment) seems impossible given 30 years and massive amounts of money, then even a really hard policy seems easy by comparison.
How about if you solve a ban on gain-of-function research first, and then move on to much harder problems like AGI? A victory on this relatively easy case would result in a lot of valuable gained experience, or, alternatively, allow foolish optimists to have their dangerous optimism broken over shorter time horizons.
> foolish optimists to have their dangerous optimism broken
I’m pretty confused about your confidence in your assertion here. Have you spoken to people who’ve led successful government policy efforts, to ground this pessimism? Why does the IAEA exist? How did ARPA-E happen? Why is a massive subsidy for geothermal well within the Overton Window and thus in a bill Joe Manchin said he would sign?
Gain of function research is the remit of a decades-old incumbent bureaucracy (the NIH) that oversees bio policy, and doesn’t like listening to outsiders. There’s no such equivalent for AI; everyone in the government keeps asking “what should we do” and all the experts shrug or disagree with each other. What if they mostly didn’t?
Where is your imagined inertia/political opposition coming from? Is it literally skepticism that senators show up for work every day? What if I told you that most of them do and that things with low political salience and broad expert agreement happen all the time?
> Where is your imagined inertia/political opposition coming from? Is it literally skepticism that senators show up for work every day? What if I told you that most of them do and that things with low political salience and broad expert agreement happen all the time?
Where my skepticism is coming from (for AI policy) is: what’s the ban, in enough detail that it could actually be a law?
Are we going to have an Office of Program Approval, where people have to send code, the government has to read it, and only once the government signs off, it can get run? If so, the whole tech industry will try to bury you, and even if you succeed, how are you going to staff that office with people who can tell the difference between AGI code and non-AGI code?
Are we going to have laws about what not to do, plus an office of lawyers looking for people breaking the laws? (This is more the SEC model.) Then this is mostly a ban on doing things in public; the NHTSA only knew to send George Hotz a cease-and-desist because he was uploading videos of the stuff he was doing. Maybe you can get enough visibility into OpenAI and Anthropic, but do you also need to get the UK to create one to get visibility into DeepMind? If the Canadian government, proud of its AI industry and happy to support it, doesn’t make such an office, do the companies just move there?
[Like, the federal government stopped the NIH from funding stem cell research for moral reasons, and California said “fine, we’ll fund it instead.”]
If the laws are just “don’t make AI that will murder people or overthrow the government”, well, we already have laws against murdering people and overthrowing the government. The thing I’m worried about is someone running a program that they think will be fine which turns out to not be fine, and it’s hard to bridge the gap between anticipated and actual consequences with laws.
To clarify, I largely agree with the viewpoint that “just announcing a law banning AGI” is incoherent and underspecified. But the job will with high probability be much easier than regulating the entire financial sector (the SEC’s job), which can really only be done reactively.
If AGI projects cost >$1B and require specific company cultural DNA, it’s entirely possible that we’re talking about fewer than 20 firms across the Western world. These companies will be direct competitors, and incentivized to both (1) make sure the process isn’t too onerous and (2) heavily police competitors in case they try to defect, since that would lead to an unfair advantage. The problem here is preventing overall drift towards unsafe systems, and that is much easier for a central actor like a government to coordinate.
Re: Canada and the UK, I’m really not sure why you think those societies would be less prone to policy influence; as far as I can tell they’re actually much easier cases. “Bring your business here, we don’t believe the majority of the experts [assuming we can get that] that unregulated development is decently likely to spawn a terminator that might kill everyone” is actually not a great political slogan, pretty much anywhere.
> But the job will with high probability be much easier than regulating the entire financial sector (the SEC’s job), which can really only be done reactively.
I’m interested in the details here! Like, ‘easier’ in the sense of “requires fewer professionals”, “requires fewer rulings by judges”, “lower downside risk”, “less adversarial optimization pressure”, something else?
[For context, in my understanding of the analogy between financial regulation and AI, the event in finance analogous to when humans would lose control of the future to AI was probably around the point of John Law.]
> If AGI projects cost >$1B and require specific company cultural DNA, it’s entirely possible that we’re talking about fewer than 20 firms across the Western world.
[EDIT] Also I should note I’m more optimistic about this the more expensive AGI is / the fewer companies can approach it. My guess is that a compute-centric regulatory approach (one where you can’t use more than X compute without going to the government office or w/e) has an easier shot of working than one that tries to operate on conceptual boundaries. But we need it to be the case that much compute is actually required, and that alternative approaches to assembling that much compute (like Folding@Home, or secret government supercomputers, or w/e) are taken seriously.
> “Bring your business here, we don’t believe the majority of the experts [assuming we can get that] that unregulated development is decently likely to spawn a terminator that might kill everyone” is actually not a great political slogan, pretty much anywhere.
Maybe? One of the things that’s sort of hazardous about AI (and is similarly hazardous about finance) is that rainbow after rainbow leads to a pot of gold. First AI solves car accidents, then they solve having to put soldiers in dangerous situations, then they solve climate change, then they solve cancer, then—except at some point in there, you accidentally lose control of the future and probably everyone dies. And it’s pretty easy for people to dismiss this sort of concern on psychological grounds, like Steven Pinker does in Enlightenment Now.
> I’m interested in the details here! Like, ‘easier’ in the sense of “requires fewer professionals”, “requires fewer rulings by judges”, “lower downside risk”, “less adversarial optimization pressure”, something else?
By “easier”, I specifically mean “overseeing fewer firms, each taking fewer actions”. I wholeheartedly agree that any sort of regulation is predicated on getting lucky re: AGI requiring at least $100M worth of compute when it’s developed. If as many actors can create/use AGI as can run hedge funds, policy is probably not going to help much.
> My guess is that a compute-centric regulatory approach (one where you can’t use more than X compute without going to the government office or w/e) has an easier shot of working than one that tries to operate on conceptual boundaries. But we need it to be the case that much compute is actually required, and that alternative approaches to assembling that much compute (like Folding@Home, or secret government supercomputers, or w/e) are taken seriously.
IMO secret government supercomputers will never be regulatable; the only hope there is government self-regulation (by which I mean, getting governments as worried about AGI catastrophes as their leading private-sector counterparts). Folding@Home equivalents are something of an open problem; if there was one major uncertainty, I’d say they’re it, but again this is less of a problem the more compute is required.
> One of the things that’s sort of hazardous about AI (and is similarly hazardous about finance) is that rainbow after rainbow leads to a pot of gold
I think that you are absolutely correct that unless e.g. the hard problem of corrigibility gets verified by the scientific community, promulgated to adjacent elites, and popularized with the public, there is little chance that proto-AGI-designers will face pressure to curb their actions. But those actions are not “impossible” in some concrete sense; they just require talent and expertise in mass persuasion, instead of community-building.
We probably have a ban on gain-of-function research in the bag, since it seems relatively easy to persuade intellectuals of the merits of the idea.
How that then translates to real-world policy is opaque to me, but give it fifty years? Half the crackpot ideas that were popular at college have come true over my lifetime.
Our problem with AI is that we can’t convince anyone that it’s dangerous.
And we may not need the fifty years! Reaching intellectual consensus might be good enough to slow it down until the government gets round to banning it.
Weirdly the other day I ran into a most eminent historian and he asked me what I’d been doing lately. As it happened I’d been worrying about AI, and so I gave him the potted version, and straight away he said: “Shouldn’t we ban it then?”, and I was like: “I think so, but that makes me a crank amongst cranks”.
My problem is that I am not capable of convincing computer scientists and mathematicians, who are usually the people who think most like me.
They always start blithering on about consciousness or “if it’s clever enough to … then why...” etc, and although I can usually answer their immediate objections, they just come up with something else. But even my closest friends have taken a decade to realize that I might be worrying about something real instead of off on one. And I haven’t got even a significant minority of them.
And I think that’s because I don’t really understand myself. I have a terrible intuition about powerful optimization processes and that’s it.
> We probably have a ban on gain-of-function research in the bag, since it seems relatively easy to persuade intellectuals of the merits of the idea.
Is this the case? Like, we had a moratorium on federal funding (not even on doing it, just whether or not taxpayers would pay for it), and it was controversial, and then we dropped it after 3 years.
You might have thought that it would be a slam dunk after there was a pandemic for which lab leak was even a plausible origin, but the people who would have been considered most responsible quickly jumped into the public sphere and tried really hard to discredit the idea. I think this is part of a general problem, which is that special interests are very committed to an issue and the public is very uncommitted, and that balance generally favors the special interests. [It’s Peter Daszak’s life on the line for the lab leak hypothesis, and a minor issue to me.] I suspect that if it ever looks like “getting rid of algorithms” is seriously on the table, lots of people will try really hard to prevent that from becoming policy.
> Is this the case? Like, we had a moratorium on federal funding (not even on doing it, just whether or not taxpayers would pay for it), and it was controversial, and then we dropped it after 3 years.
And more crucially, it didn’t even stop the federal funding of Baric while it was in place. The equivalent would be that you outlaw AGI development but do nothing about people training tool AIs, so people simply declare their development to be tool AI development in response to the regulation.
It’s certainly fairly easy to persuade people that it’s a good idea, but you might be right that asymmetric lobbying can keep good ideas off the table indefinitely. On the other hand, ‘cigarettes cause cancer’ to ‘smoking bans’ took about fifty years despite an obvious asymmetry in favour of tobacco.
As I say, politics is all rather opaque to me, but once an idea is universally agreed amongst intellectuals it does seem to eventually result in political action.
Given the lack of available moves that are promising, attempting to influence policy is a reasonable move. It’s part of the 80,000 Hours career suggestions.
On the other hand it’s a long shot and I see no reason to expect a high likelihood of success.
> First, if there were a widely known argument about the dangers of AI, on which most public intellectuals agreed.
This is exactly what we have piloted at the Existential Risk Observatory, a Dutch nonprofit founded last year. I’d say we’re fairly successful so far. Our aim is to reduce human extinction risk (especially from AGI) by informing the public debate. Concretely, what we’ve done in the past year in the Netherlands is (I’m including the detailed description so others can copy our approach—I think they should):
We have set up a good-looking website, found a board, set up a legal entity.
Asked and obtained endorsement from academics already familiar with existential risk.
Found a freelance, well-known ex-journalist and ex-parliamentarian to work with us as a media strategist.
Wrote op-eds warning about AGI existential risk, as explicitly as possible, but heeding the media strategist’s advice. Sometimes we used academic co-authors. Four out of six of our op-eds were published in leading newspapers in print.
Organized drinks, networked with journalists, introduced them to others who are into AGI existential risk (e.g. EAs).
Our most recent result (last weekend) is that a prominent columnist who is agenda-setting on tech and privacy issues in NRC Handelsblad, the Dutch equivalent of the New York Times, wrote a piece where he talked about AGI existential risk as an actual thing. We’ve also had a meeting with the chairwoman of the Dutch parliamentary committee on digitization (the line between a published article and a policy meeting is direct), and a debate about AGI xrisk in the leading debate centre now seems fairly likely.
We’re not there yet, but we’ve only done this for less than a year, we’re tiny, we don’t have anyone with a significant profile, and we were self-funded (we recently got our first funding from SFF—thanks guys!).
I don’t see any reason why our approach wouldn’t translate to other countries, including the US. If you do this for a few years, consistently, and in a coordinated and funded way, I would be very surprised if you cannot get to a situation where mainstream opinion in places like the Times and the Post regards AI as quite possibly capable of destroying the world.
I also think this could be one of our chances.
Would love to think further about this, and we’re open for cooperation.