Ok, I thought when you said “FAI project” you meant a project to build FAI. But I’ve noticed two problems with trying to work on some of the relatively safe FAI-related problems in public:
1. It’s hard to predetermine whether a problem is safe, and hard to stop or slow down once research momentum gets going. For example, I’ve become concerned that decision theory research may be dangerous, but I’m having trouble getting even myself to stop.
2. All the problems, safe and unsafe, are interrelated, and people working on the (seemingly) safe problems will naturally become interested in the (more obviously) unsafe ones as well and start thinking about them. (For example, solving CEV seems to require understanding the nature of “preference”, which leads to decision theory, and solving decision theory seems to require understanding the nature of logical uncertainty.) It seems very hard to prevent this, or to make all the researchers conscientious and security-conscious enough not to leak or deliberately publish (e.g., to gain academic reputation) unsafe research results. Even if you pick the initial researchers to be especially conscientious and security-conscious, the problem will get worse as they publish results and other people become interested in their research areas.
Yes, both Eliezer and I (and many others) agree with these points. Eliezer seems pretty set on only doing a basement-style FAI team, perhaps because he’s thought about the situation longer and harder than I have. I’m still exploring to see whether there are strategic alternatives, or strategic tweaks. I’m hoping we can discuss this in more detail when my strategic analysis series gets there.
But it seems like SIAI has already deviated from the basement-style FAI plan, since it has started supporting research associates who are allowed/encouraged to publish openly and has been encouraging public FAI-related research in other ways (such as publishing a list of open problems). And if the “slippery slope” problems I described were already known, why didn’t anyone bring them up during the discussions about whether to publish papers about UDT? (I myself only thought of them in this general, explicit form yesterday.)
If SIAI already knew about these problems but still thinks it’s a good idea to promote public FAI-related research and publish papers about decision theory, then I’m even more confused than before. I hope your series “gets there” soon so I can see where the cause of the disagreement lies.
What I’m saying is that there are costs and benefits to open FAI work. You listed some costs, but that doesn’t mean there aren’t also benefits. See, e.g. Vladimir’s comment.
The benefits are only significant if there is a significant chance of successfully building FAI before some UFAI project takes off. Maybe our disagreement just boils down to different intuitions about that? But Nesov agrees this chance is “tiny” and still wants to push open research, so I’m still confused.
The benefits are only significant if there is a significant chance of successfully building FAI before some UFAI project takes off. … But Nesov agrees this chance is “tiny” and still wants to push open research, so I’m still confused.
I want to make it bigger, as much as I can. It doesn’t matter how small a chance of winning there is, as long as our actions improve it. Giving up doesn’t seem like a strategy that leads to winning. The strategy of navigating the WBE transition (or some more speculative intelligence improvement tool) is a more complicated question, and I don’t see in what way the background catastrophic risk matters for it.
This also came up in a previous discussion we had: it’s necessary to distinguish between the risk within a given interval of years and the eventual risk (i.e. the risk of never building a FAI). The same action can make immediate risk worse, but the probability of eventually winning higher. I think encouraging an open effort to research metaethics through decision theory is like that; also, better acceptance of the problem might be leveraged to offset the hypothetical increase in UFAI risk.
It doesn’t matter how small a chance of winning there is, as long as our actions improve it.
Yes, if we’re talking about the overall chance of winning, but I was talking about the chance of winning through a specific scenario (directly building FAI). If the chance of that is tiny, why did your cost/benefit analysis of the proposed course of action (encouraging open FAI research) focus completely on it? Shouldn’t we be thinking more about how the proposal affects other ways of winning? ETA: To spell it out, encouraging open FAI research decreases the probability that we win through the WBE race or through intelligence amplification, by increasing the probability that UFAI happens first.
Giving up doesn’t seem like a strategy that leads to winning.
Nobody is saying “let’s give up”. If we don’t encourage open FAI research, we can still push for a positive Singularity in other ways, some of which I’ve posted about recently in discussion.
The strategy of navigating the WBE transition (or some more speculative intelligence improvement tool) is a more complicated question, and I don’t see in what way the background catastrophic risk matters for it.
What do you mean? What aren’t you seeing?
The same action can make immediate risk worse, but the probability of eventually winning higher.
Yes, of course. I am talking about the probability of eventually winning.
Yes, if we’re talking about the overall chance of winning, but I was talking about the chance of winning through a specific scenario (directly building FAI). If the chance of that is tiny, why did your cost/benefit analysis of the proposed course of action (encouraging open FAI research) focus completely on it?
(Another thread of this conversation is here.) I see, I’m guessing you view the “second round” (post-WBE/human intelligence improvement) as not being similarly unlikely to eventually win. I agree that if the first round (working on FAI now, pre-WBE) has only a tiny chance of winning, while the second has a non-tiny chance (taking into account the probability of no catastrophe before the second round, and of its being dominated by a FAI project rather than random AGI), then it’s better to sacrifice the first round to make the second round healthier. But I also see only a tiny chance of winning the second round, mostly because of the increasing UFAI risk and the difficulty of winning the race in a way that grants you the advantages of the second round, rather than just producing a UFAI really fast.
The same action can make immediate risk worse, but the probability of eventually winning higher.
Near/Far. Long-term effects aren’t predictable and shouldn’t be bought with more predictable short-term losses. In my experience this kind of tradeoff fails the Predictable Retrospective Stupidity test: even when you try to factor in structural uncertainty, you still end up getting burned. And even if you still want to make such a tradeoff, you should halt all research until you’ve come to agreement or a natural stopping point with Wei Dai or others who have reservations. Stop, melt, catch fire, don’t destroy the world.
(Disclaimer: This comment is fueled by a strong emotional reaction due to contingent personal details that might or might not upon further reflection deserve to be treated as substantial evidence for the policy I recommend.)
Just to make clear what specific idea this is about: Wei points out that researching FAI might increase UFAI risk, and suggests that therefore FAI shouldn’t be researched. My reply is to the effect that while FAI research might increase UFAI risk within any given number of years, it also decreases the risk of never solving FAI (which IIRC I put at something like 95% if we research it pre-WBE, and 97% if we don’t).
There seems to be a tradeoff here. An open project has a better chance of developing the necessary theory quickly, but having such a project in the open looks like a clearly bad idea towards the endgame. So on one hand, an open project shouldn’t be cultivated (and becomes harder to hinder) as we get closer to the endgame, but on the other, a closed project will probably not get off the ground, and fueling it with an initial open effort is one way to make it stronger. So there’s probably some optimal point to stop encouraging open development, and given the current state of the theory (nil) I believe the time hasn’t come yet.
The open effort could help the subsequent closed project in two related ways: by gauging the point where the understanding of what to actually do in the closed project is sufficiently clear (for some sense of “sufficiently”), and by building up enough background theory to be able to convince enough young Conways (with the necessary training) to work on the problem in the closed stage.
So there’s probably some optimal point to stop encouraging open development, and given the current state of the theory (nil) I believe the time hasn’t come yet.
Your argument seems premised on the assumption that there will be an endgame. If we assume some large probability that we end up deciding not to have an endgame at all (i.e., not to try to actually build FAI with unenhanced humans), then it’s no longer clear “the time hasn’t come yet”.
Even if we assume that with probability ~1 there will be an effort to directly build FAI, given the slippery slope effects, we have to stop encouraging open research well before the closed project starts. The main deciding factors for “when” must be how large the open research community has gotten, how strong the slippery slope effects are, and how much “pull” SingInst has against those effects. The “current state of the theory” seems to have little to do with it. (Edit: No that’s too strong. Let me amend it to “one consideration among many”.)
If we assume some large probability that we end up deciding not to have an endgame at all (i.e., not to try to actually build FAI with unenhanced humans), then it’s no longer clear “the time hasn’t come yet”.
This is something we’ll know better further down the road, so as long as it’s possible to defer this decision (i.e. while the downside is not too great, however that should be estimated), deferring it is the right thing to do. I still can’t rule out that there might be a preference definition procedure (one that refers to humans) simple enough to be implemented pre-WBE, and decision theory seems to be an attack on this possibility (by clarifying why it is naive, for example, in which case it’ll also serve as an argument to the powerful in the WBE race).
The “current state of the theory” seems to have little to do with it. (Edit: No that’s too strong. Let me amend it to “one consideration among many”.)
Well, maybe not the current state specifically, but what can eventually be expected, for the closed project to benefit from, which does seem to me like a major consideration in its chances of success.
When I have analyzed this problem previously my reasoning matched that listed by Nesov here.
Yeah, we’ll come back to this in the strategy series. There are lots of details to consider.