So, the workshop discussion (plus your “Intelligence Explosion” paper) leads to three possible approaches:
1. differentially push for WBE vs. neuromorphic AI (e.g., research topics within WBE that contribute least to neuromorphic AI)
2. differentially push for FAI vs. general de novo AI
3. push for intelligence amplification
It seems really hard to differentially push for FAI. For example I’ve mostly stopped working on decision theory because it seems to help UFAI as much as FAI. The only safe topics within FAI that I can see are ethics (normative and meta) and meta-philosophy, which are not really things you can throw resources at. I’m much less familiar with WBE, but naively I would think that there are more opportunities for research in WBE that don’t contribute too much to neuromorphic AI. Has anyone been working on these questions?
> For example I’ve mostly stopped working on decision theory because it seems to help UFAI as much as FAI.
I think there are potential avenues of development of decision theory that might help FAI more than UFAI; I think maybe you should talk to Steve Rayhawk to see if he has any thoughts about this.
Anyway, I praise your prudence, especially as it seems like a real logical possibility that AGI can’t be engineered without first solving self-reference and logical uncertainty.
> For example I’ve mostly stopped working on decision theory because it seems to help UFAI as much as FAI. The only safe topics within FAI that I can see are ethics (normative and meta) and meta-philosophy, which are not really things you can throw resources at.
I see many examples where ideas associated with decision theory (and surfacing from thinking about it) clarify philosophical or metaethical questions. For example: the meaning of beliefs, of decisions, and of observations; what influences decisions (and so where to look for preference); and the way preference could be accessed through beliefs about it. All of these arise in the context of certain setups of control and interpretation. I don’t think it’s possible to separate decision theory from FAI theory, and if we all stop developing FAI theory, we lose automatically.
The way I see it, even if we completely solve decision theory, there are so many other problems involved with building an FAI that the success probability (unless we first develop WBE or intelligence amplification) is still tiny. So working on decision theory is counterproductive if it raises the probability of UFAI coming before WBE/IA by even a small delta.
> I don’t think it’s possible to separate decision theory from FAI theory, and if we all stop developing FAI theory, we lose automatically.
Of course, I’m not suggesting we stop such work permanently, only until WBE or IA arrives (or until they are close enough that the probability of UFAI coming first is no longer significant, or until it becomes clear that de novo AI is much easier than WBE and IA and we have no choice but to push for FAI directly).
What changes with WBE? Waiting for it already forsakes a significant part of the expected future, but then what? It accelerates the physical timeline towards both FAI and catastrophe, and squeezing more FAI than catastrophe out of it (compared to the pre-WBE ratio) requires rather unlikely circumstances.
> The way I see it, even if we completely solve decision theory, there are so many other problems involved with building an FAI that the success probability (unless we first develop WBE or intelligence amplification) is still tiny.
Yes. (With the caveat that I’m regarding decision theory as a currently salient instrumental focus in pursuing the overall FAI problem, not a goal in itself.)
> So working on decision theory is counterproductive if it raises the probability of UFAI coming before WBE/IA by even a small delta.
There is overall probability, and then there is probability within a given period of time. Raising the probability of catastrophe before WBE doesn’t necessarily raise it overall. For example, suppose catastrophe is inevitable: then moving it closer doesn’t change the probability of it eventually occurring, and introducing a bit of probability of no catastrophe in exchange for moving the catastrophe branch closer does reduce the long-term risk.
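To make the difference between overall probability and probability within a period concrete, here is a minimal Python sketch. The function name `eventual_catastrophe`, the two-period split (pre-WBE and post-WBE), and every number in it are made up for illustration; none of them are estimates from the workshop report or from anyone in this thread. It also assumes that an implemented FAI ends the risk permanently.

```python
def eventual_catastrophe(p_early_catastrophe, p_early_fai, p_late_catastrophe):
    """Total probability of catastrophe over two periods (pre-WBE, post-WBE).

    All inputs are hypothetical illustration values, not estimates.
    Assumption: an FAI built in the early period removes all later risk.
    """
    # Probability that the early period ends with neither catastrophe nor FAI.
    p_reach_late = 1.0 - p_early_catastrophe - p_early_fai
    return p_early_catastrophe + p_reach_late * p_late_catastrophe


# Made-up case: catastrophe is inevitable once the later period is reached.
# "wait": less early risk, but no chance of solving FAI early.
wait = eventual_catastrophe(0.10, 0.00, 1.0)
# "work_now": more early risk, plus a small chance of solving FAI early.
work_now = eventual_catastrophe(0.15, 0.02, 1.0)

print(f"wait: {wait:.2f}, work now: {work_now:.2f}")
```

With these made-up inputs, working now puts more catastrophe probability into the early period (0.15 vs. 0.10) yet lowers the eventual total (about 0.98 vs. 1.0). The comparison is sensitive to the inevitability assumption: if the later-period risk is set to 0.5 instead of 1.0, the same inputs favor waiting.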
Lack of a catastrophe is not a stable state; it can be followed by a catastrophe, while implemented FAI is stable. You seem to be considering two alternatives: (1) a reduction in the obscure, more immediate risk of UFAI, achieved by a few people deciding not to think about decision theory and hence not talking about it in public (this decision, as it’s being made, doesn’t take many out of the pool of those who would make progress towards UFAI); and (2) taking a small chance to get FAI right. Estimating the probability of FAI ever being solved as low, I think choosing (2) is the correct play. Alternatively, don’t talk about the results openly, but work anyway (this is an important decision, but the community is too weak right now to close itself off, and our present ideas are too feeble to pose significant risk; we likely should turn to secrecy later).
(I notice that I don’t understand this clearly enough; I will have to reflect in more detail.)
Did you read the part of the workshop report that talked about this?
> this decision, as it’s being made, doesn’t take many out of the pool of those who would make progress towards UFAI
Getting decision theory “right enough” could be important for building a viable UFAI (or at least certain types of it, e.g., non-neuromorphic). There’s reason to think, for example, that AIXI would fail due to incorrect decision theory (but people trying to make AIXI practical do not seem to realize this yet). Given that we seem to constitute a large portion of all people trying to get decision theory right for AI purposes, the effect of our decisions might be larger than you think.
> Alternatively, don’t talk about the results openly, but work anyway
Yes, but of course that reduces the positive effects of working on decision theory, so you might decide that you should do something else instead. For example, I think that thinking about strategy and meta-philosophy might be better uses of my time. (Also, I suggest that keeping secrets is very hard, so even this alternative of working in secret may be a net negative.)
> Did you read the part of the workshop report that talked about this?
Yes, and I agree, but it’s not what I referred to. The essential part of the claim (as I accept it) is that given WBE, there exist scenarios where FAI can be developed much more reliably than in any feasible pre-WBE scenario. At the very least, dominating WBE theoretically allows one to spend thousands of subjective years working on the problem, while in pre-WBE mode we have at most 150 and more likely about 50-80 years.
What I was talking about is probability of success. FAI (and FAI theory in particular, as a technology-independent component) is a race against AGI and other disasters, which become more likely as technology develops. In any given time interval, all else equal, completion of an AGI seems significantly more likely than completion of FAI. It’s this probability of winning vs. losing the race in any given time that I don’t expect to change with the WBE transition. Just as FAI research gets more time, so is AGI research expected to get more time, unless somehow FAI researchers outrun everyone else for WBE resources, what I called “dominating WBE” above, but that’s an unlikely feat, I don’t have reasons for seeing that as more likely than just solving FAI pre-WBE.
In other words, we have two different low-probability events that bound success pre-WBE and post-WBE: solving the likely-too-difficult problem of FAI in a short time (pre-WBE), and outrunning “competing” AGI projects (post-WBE). If AGI is easy, pre-WBE is more important, because the probability of surviving to post-WBE is then low. If AGI is hard, then FAI is hard too, and so we must rely on the post-WBE stage.
The gamble is on uncertainty about how hard FAI and AGI are. If they are very hard, we’ll probably get to the WBE race. Otherwise, it’s worth trying now, just in case it’s possible to solve FAI earlier, or perhaps to develop the theory well enough to gain a high-profile claim on dominating WBE and finishing the project before competing risks.
> Just as FAI research gets more time, so is AGI research expected to get more time, unless somehow FAI researchers outrun everyone else for WBE resources, what I called “dominating WBE” above, but that’s an unlikely feat, I don’t have reasons for seeing that as more likely than just solving FAI pre-WBE.
In order for FAI to win pre-WBE, FAI has to get more resources than AGI (e.g., more, smarter researchers, computing power), but because FAI is much harder than AGI, it needs a large advantage. The “race for WBE” is better because it’s a fairer one and you may only need to win by a small margin.
Also, if someone (who isn’t necessarily an FAI group to start with) dominates WBE, they have no strong reason to immediately aim for AGI. What does it buy them that they don’t already have? They can take the (subjective) time to think over the situation, and perhaps decide that FAI would be the best way to move forward.
> In order for FAI to win pre-WBE, FAI has to get more resources than AGI (e.g., more, smarter researchers, computing power), but because FAI is much harder than AGI, it needs a large advantage. The “race for WBE” is better because it’s a fairer one and you may only need to win by a small margin.
If FAI is much harder, the WBE race has more potential for winning than the pre-WBE race, but still a low probability (getting more resources than all AI efforts is unlikely, and by the time the WBE race even begins, a lot is already lost).
> Also, if someone (who isn’t necessarily an FAI group to start with) dominates WBE, they have no strong reason to immediately aim for AGI. What does it buy them that they don’t already have? They can take the (subjective) time to think over the situation, and perhaps decide that FAI would be the best way to move forward.
No strong reason but natural stupidity. This argues for developing enough theory pre-WBE to make a deliberate delay in developing AGI respectable and likely to get traction.
The workshop report gave “a roughly 14% chance of win if de novo AI (Friendly AI or not) came first” so the two of us seem to be much more pessimistic than average. Do you think we should be updating in their direction, or vice versa? (Unless the workshop is counting “partial wins” like an AI that fills the universe with orgasmium, or what I called “Instrumentally Friendly AI”?)