In my view, in practice, the pivotal acts framing actually pushes people to consider a narrower space of discrete powerful actions: “sharp turns”, “events that have a game-changing impact on astronomical stakes”.
My objection to Critch’s post wasn’t ‘you shouldn’t talk about pivotal processes, just pivotal acts’. On the contrary, I think bringing in pivotal processes is awesome.
My objection (more so to “Pivotal Act” Intentions, but also to the new one) is specifically to the idea that we should socially shun the concept of “pivotal acts”, and socially shun people who say they think humanity needs to execute a pivotal act, or people who say positive things about some subset of pivotal acts.
This seems unwise to me, because it amounts to giving up on humanity’s future in the worlds where it turns out humanity does need to execute a pivotal act. Suppose you have this combination of beliefs:
1. Humanity probably won’t need to execute any pivotal acts in order to avoid existential catastrophe.
2. … But there’s a non-tiny chance (e.g., 10%) that at least one pivotal act will in fact be necessary.
3. A decent number of people currently misunderstand the idea of “pivotal acts” as evil/adversarial/“villainous”, in spite of the fact that there’s a decent chance humanity will need someone to commit this “villainy” in order to prevent the death of every human on Earth.
I personally think that a large majority of humanity’s hope lies in someone executing a pivotal act. But I assume Critch disagrees with this, and holds a view closer to 1+2+3.
If so, then I think he shouldn’t go “well, pivotal acts sound weird and carry some additional moral hazards, so I will hereby push for pivotal acts to become more stigmatized and hard to talk about, in order to slightly increase our odds of winning in the worlds where pivotal acts are unnecessary”.
Rather, I think hypothetical-Critch should promote the idea of pivotal processes, and try to reduce any existing stigma around the idea of pivotal acts, so that humanity is better positioned to evade destruction if we do end up needing to do a pivotal act. We should try to set ourselves up to win in more worlds.
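To make the “set ourselves up to win in more worlds” point concrete, here’s a toy expected-value sketch (the 10% is the hypothetical figure from belief 2 above; every other number is made up purely for illustration):

```python
# Toy expected-value sketch of the "win in more worlds" argument.
# All win probabilities are made up for illustration; only the structure matters.

p_pivotal_needed = 0.10  # hypothetical belief 2: chance a pivotal act is in fact necessary

# (P(win | pivotal act not needed), P(win | pivotal act needed)) under each policy
policies = {
    "stigmatize pivotal acts":       (0.50, 0.02),
    "keep both options discussable": (0.49, 0.20),
}

for name, (win_if_not_needed, win_if_needed) in policies.items():
    p_win = (1 - p_pivotal_needed) * win_if_not_needed + p_pivotal_needed * win_if_needed
    print(f"{name}: P(win) = {p_win:.3f}")

# Even if stigmatizing pivotal acts buys a small edge in the ~90% of worlds where
# they turn out to be unnecessary, it can cost more in the ~10% where they're needed.
```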
(Where things in this category get straw-manned as “Rube-Goldberg-machine-like”)
If you’re referring to my comment, then this is itself straw-manning me!
Rube-Goldberg-ishness is a matter of degree: as you increase the complexity of a plan, it becomes harder to analyze, and tends to accumulate points of failure that reduce the probability of success. This obviously doesn’t mean we should pick the simplest possible plan with no consideration for anything else; but it’s a cost to keep in mind, like any other.
I mentioned this as a quantitative cost to keep in mind; “things in this category get straw-manned as ‘Rube-Goldberg-machine-like’” seems to either be missing the fact that this is a real cost, or treating me as making some stronger and more specific claim.
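A minimal sketch of the quantitative point (the step counts and per-step success rates below are made up; the point is just that independent points of failure compound):

```python
# How points of failure compound: a plan with n independent steps,
# each succeeding with probability p, succeeds overall with probability p**n.

for n_steps in (3, 10, 30):
    for p_step in (0.90, 0.95):
        print(f"{n_steps:>2} steps at p={p_step}: P(plan succeeds) = {p_step ** n_steps:.2f}")
```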
As often happens, one of the actual cruxes is in continuity assumptions, where basically you have a low prior on “smooth trajectory changes by many acts” and a high prior on “sharp turns left or right”.
This seems wrong to me, in multiple respects:
Continuity assumptions are about what’s likely to happen, not about what’s desirable. It would be a separate assumption to say “continuity is always good”, and I worry that a reasoning error is occurring if this is being conflated with “continuity tends to occur”.
Why this matters here: My claim is that pivotal acts are likely to be necessary for good outcomes, not that they’re necessarily likely to occur. If your choices are “execute a pivotal act, or die”, then insofar as you’re confident this is the case, the base rate of continuous events just isn’t relevant.
The primary argument for hard takeoff isn’t “stuff tends to be discontinuous”; it’s “AGI is a powerful invention, and e.g. GPT-3 isn’t a baby AGI”. The discontinuity of hard takeoff is not a primitive; it’s an implication of the claim that AGI is different from current AI tech, that it contains a package of qualitatively new kinds of cognition that aren’t just ‘what GPT-3 is currently doing, but scaled up’.
No one claims that AlphaGo needs to be continuous with theorem-proving AI systems, or that a washing machine needs to be continuous with a chariot. The core disagreement here is about whether X and Y are the same kind of thing, not about whether incremental tweaks to a given kind of thing tend to produce small improvements.
I think you should be more of a fox with respect to continuity, and less of a hedgehog. The reason hard takeoff is very likely true isn’t some grand, universal Discontinuity Narrative. It’s just that different things work differently. Sometimes you get continuities; sometimes you don’t. To figure out which is which, you need to actually analyze the specific phenomenon under discussion, not just consult the universal cosmic base rate of continuity.
(And indeed, I think Paul is doing a lot more ‘analyze the specific phenomenon under discussion’ than you seem to give him credit for. I think it’s straw-manning Paul and Eliezer to reduce their disagreement to a flat ‘we have different priors about how many random things tend to be continuous’.)
The second crux, as you note, is doom-by-default probability: if you have a very high doom probability, you may be in favour of variance-increasing acts.
I agree with this in general, but I think this is a wrong lens for thinking about pivotal acts. On my model, a pivotal act isn’t a hail mary that you attempt because you want to re-roll the dice; it’s more like a very specific key that is needed in order to open a very specific lock. Achieving good outcomes is a very constrained problem, and you need to do a lot of specific things in order to make things go well.
We may disagree about variance-increasing tactics in other domains, but our disagreement about pivotal acts is about whether some subset of the specific class of keys called ‘pivotal acts’ is necessary and/or sufficient to open the lock.
Given these deep prior differences, it seems reasonable to assume this discussion will lead nowhere in particular. (I have a draft with a more explicit argument as to why.)
I’m feeling much more optimistic than you about trying to resolve these points, in part because I feel that you’ve misunderstood almost every aspect of my view and of my comment above! If you’re that far from passing my ITT, then there’s a lot more hope that we may converge in the course of incrementally changing that.
(Or non-incrementally changing that. Sometimes non-continuous things do happen! ‘Gaining understanding of a topic’ being a classic example of a domain with many discontinuities.)
On the last point: I think I can roughly pass your ITT—we can try that, if you are interested.
So, here is what I believe your beliefs are:
With pretty high confidence, you expect a sharp left turn to happen (in almost all trajectories).
This is to a large extent based on the belief that at some point “systems start to work really well in domains really far beyond the environments of their training”, which is roughly the same as “discovering a core of generality” and a few other formulations. These systems will be in some meaningful sense fundamentally different from e.g. Gato.
From your perspective, this is based on thinking deeply about the nature of such systems (note that this is mostly based on hypothetical systems, and on an analogy with evolution).
My claim, roughly, is that this is only part of what’s going on, and that the actual thing is: people start with a deep prior on “continuity in the space of intelligent systems”. Looking into a specific question about hypothetical systems, their search in argument space is guided by this prior, and they end up mostly sampling arguments supporting their prior. (This is not to say the arguments are wrong.)
You probably don’t agree with the above point, but notice the correlations:
You expect a sharp left turn due to discontinuity in the “architectures” dimension (which is the crux, according to you).
But you also expect jumps in the capabilities of individual systems (at least I think so).
Also, you expect the majority of hope to lie in “sharp right turn” histories (in contrast to smooth right turn histories).
And more
In my view, your (or rather MIRI-esque) views on the above dimensions are correlated more than expected, which suggests the existence of a hidden variable/hidden model explaining the correlation.
I personally think that a large majority of humanity’s hope lies in someone executing a pivotal act. But I assume Critch disagrees with this, and holds a view closer to 1+2+3.
If so, then I think he shouldn’t go “well, pivotal acts sound weird and carry some additional moral hazards, so I will hereby push for pivotal acts to become more stigmatized and hard to talk about, in order to slightly increase our odds of winning in the worlds where pivotal acts are unnecessary”.
Rather, I think hypothetical-Critch should promote the idea of pivotal processes, and try to reduce any existing stigma around the idea of pivotal acts, so that humanity is better positioned to evade destruction if we do end up needing to do a pivotal act. We should try to set ourselves up to win in more worlds.
I can’t speak for Critch, but my view is that pivotal acts planned as pivotal acts, in the way most people in the LW community think about them, have only a very small chance of being the solution (my guess is one or two bits more extreme: more like 2-5% than 10%).
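To spell out the “one or two bits” arithmetic (a quick sketch; a bit against halves the odds, and I’m taking 10% as the reference point):

```python
# "One or two bits more extreme": each bit against halves the odds.

def shift_bits_against(p, bits):
    odds = p / (1 - p)        # probability -> odds
    odds /= 2 ** bits         # each bit against halves the odds
    return odds / (1 + odds)  # odds -> probability

p0 = 0.10
for bits in (1, 2):
    print(f"{bits} bit(s) more extreme than {p0:.0%}: {shift_bits_against(p0, bits):.1%}")
# ~5.3% and ~2.7%, i.e. roughly the stated 2-5% range.
```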
I’m not sure I agree with you re: the stigma. My impression is that while the broader world doesn’t think in terms of pivotal acts, if it paid more attention then, yes, many proposals would be viewed with suspicion. On the other hand, I think on LW it’s the opposite: many people share the orthodox views about sharp turns, pivotal acts, etc., and proposals to steer the situation more gently are viewed as unworkable, or as engaging in thinking with “too optimistic assumptions”, etc.
Note that I advocate for considering much weirder solutions, and also thinking about much weirder world states, when talking with the “general world”. In contrast, on LW and AF, I’d like to see more discussion of various “boring” solutions on which the world can roughly agree.
Continuity assumptions are about what’s likely to happen, not about what’s desirable. It would be a separate assumption to say “continuity is always good”, and I worry that a reasoning error is occurring if this is being conflated with “continuity tends to occur”.
Basically, no. Continuity assumptions are about what the space looks like. Obviously, forecasting questions (“what’s likely to happen”) often depend on ideas about what the space looks like.
My claim is that pivotal acts are likely to be necessary for good outcomes, not that they’re necessarily likely to occur. If your choices are “execute a pivotal act, or die”, then insofar as you’re confident this is the case, the base rate of continuous events just isn’t relevant.
Yes, but your other claim is that a “sharp left turn” is likely and leads to bad outcomes. So if we partition the space of outcomes into good/bad, in both branches you assume the outcome very likely comes about because of sharp turns.
The primary argument for hard takeoff isn’t “stuff tends to be discontinuous”; it’s “AGI is a powerful invention, and e.g. GPT-3 isn’t a baby AGI”. The discontinuity of hard takeoff is not a primitive; it’s an implication of the claim that AGI is different from current AI tech, that it contains a package of qualitatively new kinds of cognition that aren’t just ‘what GPT-3 is currently doing, but scaled up’.
This is maybe becoming repetitive, but I’ll try to paraphrase again. Consider the option that the “continuity assumptions” I’m talking about are not grounded in “takeoff scenarios”, but in “how you think about hypothetical points in the abstract space of intelligent systems”.
Thinking about features of this highly abstract space, in regions which don’t exist yet, is epistemically tricky (I hope we can at least agree on that).
It probably seems to you that you have many strong arguments giving you reliable insights about how the space works somewhere around “AGI”.
My claim is: “Yes, but the process which generated the arguments is based on a black-box neural net, which has a strong prior on things like ‘stuff like math is discontinuous’.” (I suspect this “taste and intuition” box is located more in Eliezer’s mind, and some other people updated “on the strength of arguments”.) This isn’t to imply that various people haven’t done a lot of thinking and generated a lot of arguments and intuitions about this. Unfortunately, given other epistemic constraints, in my view the “taste and intuition” differences sort of “propagate” into “conclusion” differences.
With pretty high confidence, you expect a sharp left turn to happen (in almost all trajectories).
This is to a large extent based on the belief that at some point “systems start to work really well in domains really far beyond the environments of their training”, which is roughly the same as “discovering a core of generality” and a few other formulations. These systems will be in some meaningful sense fundamentally different from e.g. Gato.
That’s right, though the phrasing “discovering a core of generality” here sounds sort of mystical and mysterious to me, which makes me wonder whether you can see the perspective from which this is a very obvious and normal belief. I get a similar vibe when people talk about a “secret sauce” and say they can’t understand why MIRI thinks there might be a secret sauce—treating generalizability as a sort of occult property.
The way I would phrase it is in very plain, concrete terms:
If a machine can multiply two-digit numbers together as well as four-digit numbers together, then it can probably multiply three-digit numbers together. The structure of these problems is similar enough that it’s easier to build a generalist that can handle ‘multiplication’ than to solve two-digit and four-digit multiplication using fundamentally different techniques.
Similarly, it’s easier to teach a human or AI how to navigate physical environments in general, than to teach them how to navigate all physical environments except parking garages. Parking garages aren’t different enough from other physical environments, and the techniques for modeling and navigating physical spaces work too well, when they work at all.
Similarly, it’s easier to build an AI that is an excellent physicist and has the potential to be a passable or great chemist and/or biologist, than to build an excellent physicist that just can’t do chemistry or biology, no matter how many chemistry experiments or chemistry textbooks it sees. The problems have too much overlap.
We can see that the latter is true just by reflecting on what kinds of mental operations go into generating hypotheses about ontologies/carvings of the world, generating hypotheses about the state of the world given some ontology, fitting hypotheses about different levels/scales into a single cohesive world-model, calculating value of information, strategically directing attention toward more fruitful directions of thought, coming up with experiments, thinking about possible experimental outcomes, noticing anomalies, deducing implications and logical relationships, coming up with new heuristics and trying them out, etc. These clearly overlap enormously across the relevant domains.
We can also observe that this is in fact what happened with humans. We have zero special-purpose brain machinery for any science, or indeed for science as a category; we just evolved to be able to model physical environments well, and this generalized to all sciences once it generalized to any.
For things to not go this way would be quite weird.
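Returning to the multiplication example, here’s a toy sketch of the point (illustrative code only, not a claim about how any AI system works internally): grade-school long multiplication is one general procedure, and nothing about it cares how many digits the inputs have, so there’s no natural way to get the two-digit and four-digit cases without also getting the three-digit case.

```python
# Grade-school long multiplication: one general procedure covers every digit count.

def long_multiply(a: int, b: int) -> int:
    digits_b = [int(d) for d in str(b)][::-1]  # least-significant digit first
    total = 0
    for place, digit in enumerate(digits_b):
        total += a * digit * 10 ** place       # shift-and-add partial products
    return total

assert long_multiply(42, 73) == 42 * 73          # two-digit inputs
assert long_multiply(123, 456) == 123 * 456      # three-digit inputs
assert long_multiply(1234, 5678) == 1234 * 5678  # four-digit inputs
```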
From your perspective, this is based on thinking deeply about the nature of such systems (note that this is mostly based on hypothetical systems, and on an analogy with evolution).
Doesn’t seem to pass my ITT. Like, it’s true in a sense that I’m ‘thinking about hypothetical systems’, because I only care about human cognition inasmuch as it seems likely to generalize to AGI cognition. But this still seems like it’s treating generality as a mysterious occult property, and not as something coextensive with all our observations of general intelligences.
My claim, roughly, is that this is only part of what’s going on, and that the actual thing is: people start with a deep prior on “continuity in the space of intelligent systems”. Looking into a specific question about hypothetical systems, their search in argument space is guided by this prior, and they end up mostly sampling arguments supporting their prior. (This is not to say the arguments are wrong.)
Seems to me that my core intuition is about there being common structure shared between physics research, biology research, chemistry research, etc.; plus the simple observation that humans don’t have specialized evolved modules for chemistry vs physics vs biology. Discontinuity is an implication of those views, not a generator of those views.
Like, sure, if I had a really incredibly strong prior in favor of continuity, then maybe I would try really hard to do a mental search for reasons not to accept those prima facie sources of discontinuity. And since I don’t have a super strong prior like that, I guess you could call my absence of a super-continuity assumption a ‘discontinuity assumption’.
But it seems like a weird and unnatural way of trying to make sense of my reasoning: I don’t have an extremely strong prior that everything must be continuous, but I also don’t have an extremely strong prior that everything must be spherical, or that everything must be purple. I’m not arriving at any particular conclusions via a generator that keeps saying ‘not everything is spherical!’ or ‘not everything is purple!’; I’m not a non-sphere-ist or an anti-purple-ist; the deep secret heart and generator for all my views is not that I have a deep and abiding faith in “there exist non-spheres”. And putting me in a room with some weird person who does think everything is a sphere doesn’t change any of that.
You probably don’t agree with the above point, but notice the correlations:
You expect a sharp left turn due to discontinuity in the “architectures” dimension (which is the crux, according to you).
But you also expect jumps in the capabilities of individual systems (at least I think so).
Also, you expect the majority of hope to lie in “sharp right turn” histories (in contrast to smooth right turn histories).
I would say that there are two relevant sources of discontinuity here:
1. AGI is an invention, and inventions happen at particular times. This inherently involves a 0-to-1 transition when the system goes from ‘not working’ to ‘working’. Paul and I believe equally in discontinuities like this, though we may disagree about whether AGI has already been ‘invented’ (such that we just need to iterate and improve on it), vs. whether the invention lies in the future.
2. General intelligence is powerful and widely applicable. This is another category of discontinuity Paul believes can happen (e.g., washing machines are allowed to have capabilities that non-washing-machines lack; nukes are allowed to have capabilities that non-nukes lack), though Paul may be somewhat less impressed than me with general intelligence overall (resulting in a smaller gap/discontinuity). Separately, Paul’s belief in AGI development predictability, AI research efficiency, and ‘AGI is already solved’ (see 1, above), each serve to reduce the importance of this discontinuity.
‘AGI is an invention’ and ‘General intelligence is powerful’ aren’t weird enough beliefs, I think, to call for some special explanation like ‘Rob B thinks the world is very discontinuous’. Those are obvious first-pass beliefs to have about the domain, regardless of whether they shake out as correct on further analysis.
‘We need a pivotal act’ is a consequence of 1 and 2, not a separate discontinuity. If AGI is a sudden huge dangerous deal (because 1 and 2 are true), then we’ll need to act fast or we’ll die, and there are viable paths to quickly ending the acute risk period. The discontinuity in the one case implies the discontinuity in this new case. There’s no need for a further explanation.
Note that I advocate for considering much more weird solutions, and also thinking much more weird world states when talking with the “general world”. While in contrast, on LW and AF, I’d like to see more discussion of various “boring” solutions on which the world can roughly agree.
Can I get us all to agree to push for including pivotal acts and pivotal processes in the Overton window, then? :) I’m happy to publicly talk about pivotal processes and encourage people to take them seriously as options to evaluate, while flagging that I’m ~2-5% on them being how the future is saved, if it’s saved. But I’ll feel more hopeful about this saving the future if you, Critch, etc. are simultaneously publicly talking about pivotal acts and encouraging people to take them seriously as options to evaluate, while flagging that you’re ~2-5% on them being how the future is saved.