Agree it’s hard to prove a negative, but personally I find the following argument pretty suggestive:
“Other AGI labs have some plans—these are the plans we think are bad, and a pivotal act will have to disrupt them. But if we, ourselves, are an AGI lab with some plan, we should expect our pivotal agent to also be able to disrupt our plans. This does not directly lead to the end of the world, but it definitely includes root access to the datacenter.”
Here’s the thing I’m stuck on lately. Does it really follow from “Other AGI labs have some plans—these are the plans we think are bad” that some drastic and violent-seeming plan like burning all the world’s GPUs with nanobots is needed?
I know Eliezer tried to settle this point with point 4, “We can’t just ‘decide not to build AGI’”, but it seems like the obvious kinds of ‘pivotal acts’ needed are much more boring and less technological than he believes, e.g. having conversations with a few important people, probably the leadership at the top AI labs.
Some people seem to think this has been tried and didn’t work. And I suppose I don’t know the extent to which it has been tried, since the participants in any meetings with leadership at the AI labs probably aren’t at liberty to talk about them. But it just seems like there should be hundreds of different angles, asks, pleas, compromises, bargains, etc. with different influential people before it would make sense to conclude that the logical course of action is “nanobots”.
Definitely.
The problem is that (1) the benefits of AI are large; (2) there are lots of competing actors; (3) verification is hard; (4) no one really knows where the lines are; and (5) timelines may be short.
(2) In addition to the major companies in the US, AI research is also conducted at major companies in other countries, most notably China. The US government and the Chinese government both view AI as a competitive advantage. So there are a lot of stakeholders who would have to agree, and AGI-risk-aware Americans don’t have easy access to all of them. (And, of course, new companies can be founded all the time.) So you need an almost universal level of agreement.
(3) Let’s say everyone relevant agrees. The incentive to cheat is enormous. Usually, the way to prevent cheating is some form of verification. How do you verify that no one is conducting AI research? If there is no verification, there will likely be no agreement, and even if one is reached, its effectiveness would be limited. (Banning GPU production might be verifiable, but note that you have now significantly increased the pool of opponents of your AI research ban, and you now need agreement from all relevant governments on this point.)
(4) There may be agreement on the risk of AGI, but people may be confident that we are still at least a certain distance away from AGI, or that certain forms of research don’t pose a threat. This will tend to keep any agreement restricting AGI research limited in scope.
(5) How long do we have to get this agreement? I am very confident that we won’t have dangerous AI within the next six years. For comparison, it took 13 years to get general agreement on banning CFCs after the ozone hole was discovered. I don’t think we will have dangerous AI in 13 years, but other people do. And if an agreement between governments is required, 13 years seems optimistic.
In addition to the post’s mention of Facebook AI being rather hostile to the AI safety issue in general, convincing them and the top people at OpenAI and DeepMind might still not be enough. You would need to stop every company that talks to some venture capitalists and can convince them of how profitable AGI could be. Hell, depending on how easy the solution ends up being, you might even have to stop anyone with a 3080 and access to arXiv from putting something together in their home office.
This really is “uproot the entire AI research field” and not “tell DeepMind to cool it.”