Thanks for taking the time to explain your reasoning, Mark. I’m sorry to hear you won’t be continuing the discussion group! Is anyone else here interested in leading that project, out of curiosity? I was getting a lot out of seeing people’s reactions.
I think John Maxwell’s response to your core argument is a good one. Since we’re talking about the Sequences, I’ll note that this dilemma is the topic of the Science and Rationality sequence:
In any case, right now you’ve got people dismissing cryonics out of hand as “not scientific”, like it was some kind of pharmaceutical you could easily administer to 1000 patients and see what happened. “Call me when cryonicists actually revive someone,” they say; which, as Mike Li observes, is like saying “I refuse to get into this ambulance; call me when it’s actually at the hospital”. Maybe Martin Gardner warned them against believing in strange things without experimental evidence. So they wait for the definite unmistakable verdict of Science, while their family and friends and 150,000 people per day are dying right now, and might or might not be savable—
—a calculated bet you could only make rationally [i.e., using your own inference skills, without just echoing data from an experimental study, and without just echoing established, expert-verified scientific conclusions].
The drive of Science is to obtain a mountain of evidence so huge that not even fallible human scientists can misread it. But even that sometimes goes wrong, when people become confused about which theory predicts what, or bake extremely-hard-to-test components into an early version of their theory. And sometimes you just can’t get clear experimental evidence at all.
Either way, you have to try to do the thing that Science doesn’t trust anyone to do—think rationally, and figure out the answer before you get clubbed over the head with it.
(Oh, and sometimes a disconfirming experimental result looks like: “Your entire species has just been wiped out! You are now scientifically required to relinquish your theory. If you publicly recant, good for you! Remember, it takes a strong mind to give up strongly held beliefs. Feel free to try another hypothesis next time!”)
This is why there’s a lot of emphasis on hard-to-test (“philosophical”) questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions—because sometimes (e.g., in the case of cryonics and existential risk) the answer matters a lot for our decision-making, long before we have a definitive scientific answer. That doesn’t mean we should despair of empirically investigating these questions, but it does mean that our decision-making needs to be high-quality even during periods where we’re still in a state of high uncertainty.
The Sequences talk about the Many Worlds Interpretation precisely because it’s an unusually-difficult-to-test topic. The idea isn’t that this is a completely typical example, or that it’s a good idea to disregard evidence when it is available; the idea, rather, is that we sometimes do need to predicate our decisions on our best guess in the absence of perfect tests.
Its placement in Rationality: From AI to Zombies immediately after the ‘zombies’ sequence (which, incidentally, is an example of how and why we should reject philosophical thought experiments, no matter how intuitively compelling they are, when they don’t accord with established scientific theories and data) is deliberate. Rather than reading either sequence as an attempt to defend a specific fleshed-out theory of consciousness or of physical law, they should primarily be read as attempts to show that extreme uncertainty about a domain doesn’t always bleed over into ‘we don’t know anything about this topic’ or ‘we can’t rule out any of the candidate solutions’.
We can effectively rule out epiphenomenalism as a candidate solution to the hard problem of consciousness even if we don’t know the answer to the hard problem (which we don’t), and we can effectively rule out ‘consciousness causes collapse’ and ‘there is no objective reality’ as candidate solutions to the measurement problem in QM even if we don’t know the answer to the measurement problem (which, again, we don’t). Just advocating ‘physicalism’ or ‘many worlds’ is a promissory note, not a solution.
In discussions of EA and x-risk, we likewise need to be able to prioritize more promising hypotheses over less promising ones long before we’ve answered all the questions we’d like answered. Even deciding what studies to fund presupposes that we’ve ‘philosophized’, in the sense of mentally aggregating, heuristically analyzing, and drawing tentative conclusions from giant complicated accumulated-over-a-lifetime data sets.
You wrote:
The human capacity for abstract reasoning over mathematical models is in principle a fully general intelligent behaviour, as the scientific revolution has shown: there is no aspect of the natural world which has remained beyond the reach of human understanding, once a sufficient amount of evidence is available.
That’s true, and it’s one of the basic assumptions behind MIRI research: that understanding agents smarter than us isn’t obviously hopeless, because our human capacity for abstract reasoning makes it possible for us to model systems even when they’re extremely complex and dynamic. MIRI’s research is intended to make this likelier to happen.
It’s not the default that we’re always able to predict what our inventions will do before we run them to see what happens; and there are some basic limits on our ability to do so when the system we’re predicting is smarter than the predictor. But with enough intellectual progress we may become able to model abstract safety-relevant features of AGI behavior, even though we can’t predict in detail the exact decisions the AGI will make. (If we could predict the exact decisions of the AGI, we’d have to be at least as smart as the AGI.)
If it isn’t possible to learn a variety of generalizations about smarter autonomous systems, then, interestingly, that also undermines the case for intelligence explosion. Both ‘humans trying to make superintelligent AI safe’ and ‘AI undergoing a series of recursive self-improvements’ are cases where less intelligent agents are trying to reliably generate agents that meet various abstract criteria (including superior intelligence). The orthogonality thesis, likewise, simultaneously supports the claim ‘many possible AI systems won’t have humane goals’ and ‘it is possible for an AI system to have humane goals’. This is why Bostrom/Yudkowsky-type arguments don’t uniformly inspire pessimism.
Are you familiar with MIRI’s technical agenda? You may also want to check out the AI Impacts project, if you think we should be prioritizing forecasting work at this point rather than object-level mathematical research.
No, MWI is not unusually difficult to test. It is untestable.
That’s not true. (Or, at best, it’s misleading for present purposes.)
First, it’s important to keep in mind that if MWI is “untestable” relative to non-MWI, then non-MWI is also “untestable” relative to MWI. To use this as an argument against MWI, you’d need to talk specifically about which hypothesis MWI is untestable relative to; and you would then need to cite some other reason to reject MWI (e.g., its complexity relative to the other hypothesis, or its failures relative to some third hypothesis that it is testable relative to).
With that in mind:
1 - MWI is testable insofar as QM itself is testable. We normally ignore this fact because we’re presupposing QM, but it’s important to keep in mind if we’re trying to make a general claim like ‘MWI is unscientific because it’s untestable and lacks evidential support’. MWI is at least as testable as QM, and has at least as much supporting evidence.
2 - What I think people really mean to say (or what a steel-manned version of them would say) is that multiverse-style interpretations of QM are untestable relative to each other. This looks likely to be true, for practical purposes, when we’re comparing non-collapse interpretations: Bohmian Mechanics doesn’t look testable relative to Many Threads, for example. (And therefore Many Threads isn’t testable relative to Bohmian Mechanics, either.)
(Of course, many of the things we call “Many Worlds” are not fully fleshed out interpretations, so it’s a bit risky to make a strong statement right now about what will turn out to be testable in the real world. But this is at least a commonly accepted bit of guesswork on the part of theoretical physicists and philosophers of physics.)
3 - But, importantly, collapse interpretations generally are empirically distinguishable from non-collapse interpretations. So even though non-collapse interpretations are generally thought to be ‘untestable’ relative to each other, they are testable relative to collapse interpretations. (And collapse interpretations as a rule are falsifiable relative to each other.)
To date, attempts to test collapse interpretations have falsified the relevant interpretations. It is not technologically possible yet to test the most popular present-day ones, but it is possible for collapse theorists to argue ‘our views should get more attention because they’re easier to empirically distinguish’, and it’s also possible for anti-collapse theorists to try to make inductive arguments from past failures to the likelihood of future failures, with varying amounts of success.
I don’t have an argument against MWI specifically, no.
and you would then need to cite some other reason to reject MWI
No, that is not how it works: I don’t need to either accept or reject MWI. I can also treat it as a causal story lacking empirical content. Nothing wrong with such stories, they are quite helpful for understanding systems. But not a part of science.
MWI is testable insofar as QM itself is testable.
By that logic, if I invent any crazy hypothesis in addition to an empirically testable theory, then it inherits testability just on those grounds. You can do that with the word “testability” if you want, but that does not seem to be how people use words.
If some smart Catholic says that evolution is how God unfolds creation when it comes to living systems, then any specific claims we can empirically check pertaining to evolution (including those that did not pan out, and required repairs of evolutionary theory) also somehow are relevant to the Catholic’s larger hypothesis? I suppose that is literally true, but silly. There is no empirical content to what this hypothetical Catholic is saying, over and above the actual empirical stuff he is latching his baggage onto. I am not super interested in having Catholic theologians read about minimum descriptive complexity, and then weaving a yarn about their favorite hypotheses based on that.
so it’s a bit risky to make a strong statement right now about what will turn out to be testable in the real world
I like money! I am happy to discuss bet terms on this.
collapse interpretations generally are empirically distinguishable from non-collapse interpretations
Yes, if you have an interpretation that gives different predictions than QM, then of course that will render the interpretation falsifiable (and indeed some were falsified). That is super boring, though, and not what this argument is about. But also, I don’t see what falsifiability of X has to do with falsifiability of Y, if X and Y are different. Newtonian mechanics is both falsifiable and falsified, but that has little to do with falsifiability of any story fully consistent with QM predictions.
My personal take on MWI is I want to waste as little energy as possible on it and arguments about it, and actually go read Feynman instead. (This is not a dig at you, I am just explaining where I am coming from when it comes to physics).
No, that is not how it works: I don’t need to either accept or reject MWI. I can also treat it as a causal story lacking empirical content.
To say that MWI lacks empirical content is also to say that the negation of MWI lacks empirical content. So this doesn’t tell us, for example, whether to assign higher probability to MWI or to the disjunction of all non-MWI interpretations.
Suppose your ancestors sent out a spaceship eons ago, and by your calculations it recently traveled so far away that no physical process could ever cause you and the spaceship to interact again. If you then want to say that ‘the claim the spaceship still exists lacks empirical content,’ then OK. But you will also have to say ‘the claim the spaceship blipped out of existence when it traveled far enough away lacks empirical content’.
And there will still be some probability, given the evidence, that the spaceship did vs. didn’t blip out of existence; and just saying ‘it lacks empirical content!’ will not tell you whether to design future spaceships so that their life support systems keep operating past the point of no return.
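To make the decision-relevance concrete, here is a minimal sketch (in Python, with made-up numbers; the values are purely illustrative) of how a credence in the ‘empirically inaccessible’ claim still enters the expected-value comparison for the life-support design decision:

```python
# Toy expected-value comparison for the spaceship example.
# All numbers are made-up placeholders; only the structure matters.

p_ship_exists = 0.99          # credence that the ship persists past the point of no return
value_crew_survives = 1.0     # utility if the crew is alive with working life support
cost_extra_engineering = 0.01 # cost of designing life support to run past that point

def expected_value(keep_life_support_running: bool) -> float:
    if keep_life_support_running:
        # The crew survives only if the ship still exists.
        return p_ship_exists * value_crew_survives - cost_extra_engineering
    # Shutting life support off at the point of no return kills the crew if the ship exists.
    return 0.0

print(expected_value(True))   # 0.98
print(expected_value(False))  # 0.0
```

Whatever number you plug in for p_ship_exists, the design choice depends on it; calling the claim ‘empirically inaccessible’ doesn’t make that dependence go away.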
By that logic, if I invent any crazy hypothesis in addition to an empirically testable theory, then it inherits testability just on those grounds. You can do that with the word “testability” if you want, but that does not seem to be how people use words.
There’s no ambiguity if you clarify whether you’re talking about the additional crazy hypothesis, vs. talking about the conjunction ‘additional crazy hypothesis + empirically testable theory’. Presumably you’re imagining a scenario where the conjunction taken as a whole is testable, though one of the conjuncts is not. So just say that.
Sean Carroll summarizes collapse-flavored QM as the conjunction of these five claims:
1. Quantum states are represented by wave functions, which are vectors in a mathematical space called Hilbert space.
2. Wave functions evolve in time according to the Schrödinger equation.
3. The act of measuring a quantum system returns a number, known as the eigenvalue of the quantity being measured.
4. The probability of getting any particular eigenvalue is equal to the square of the amplitude for that eigenvalue.
5. After the measurement is performed, the wave function “collapses” to a new state in which the wave function is localized precisely on the observed eigenvalue (as opposed to being in a superposition of many different possibilities).
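(For concreteness, claims 2 and 4 correspond to the standard equations; here is a bare-bones rendering, leaving aside the finer points of what counts as a ‘measurement’:)

```latex
% Claim 2: unitary time evolution under the Schrödinger equation
i\hbar \, \frac{\partial}{\partial t} \lvert \psi(t) \rangle = \hat{H} \, \lvert \psi(t) \rangle

% Claim 4: the Born rule for the probability of observing eigenvalue a_i
P(a_i) = \lvert \langle a_i \mid \psi \rangle \rvert^{2}
```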
Many-worlds-flavored QM, on the other hand, is the conjunction of 1 and 2, plus the negation of 5 -- i.e., it’s an affirmation of wave functions and their dynamics (which effectively all physicists agree about), plus a rejection of the ‘collapses’ some theorists add to keep the world small and probabilistic. (If you’d like, you could supplement ‘not 5’ with ‘not Bohmian mechanics’; but for present purposes we can mostly lump Bohm in with multiverse interpretations, because Eliezer’s blog series is mostly about rejecting collapse rather than about affirming a particular non-collapse view.)
If we want ‘QM’ to be the neutral content shared by all these interpretations, then we can say that QM is simply the conjunction of 1 and 2. You are then free to say that we should assign 50% probability to claim 5, and maintain agnosticism between collapse and non-collapse views. But realize that, logically, either collapse or its negation does have to be true. You can frame denying collapse as ‘positing invisible extra worlds’, but you can equally frame denying collapse as ‘skepticism about positing invisible extra causal laws’.
Since every possible way the universe could be adds something ‘extra’ on top of what we observe—either an extra law (e.g., collapse) or extra ontology (because there are no collapses occurring to periodically annihilate the ontology entailed by the Schrödinger equation) -- it’s somewhat missing the point to attack any given interpretation for the crime of positing something extra. The more relevant question is just whether simplicity considerations or indirect evidence helps us decide which ‘something extra’ (a physical law, or more ‘stuff’, or both) is the right one. If not, then we stick with a relatively flat prior.
Claims 1 and 2 are testable, which is why we were able to acquire evidence for QM in the first place. Claim 5 is testable for pretty much any particular ‘collapse’ interpretation you have in mind; which means the negation of claim 5 is also testable. So all parts of bare-bones MWI are testable (though it may be impractical to run many of the tests), as long as we’re comparing MWI to collapse and not to Bohmian Mechanics.
(You can, of course, object that affirming 3-5 as fundamental laws has the advantage of getting us empirical adequacy. But ‘MWI (and therefore also ‘bare’ QM) isn’t empirically adequate’ is a completely different objection from ‘MWI asserts too many unobserved things’, and in fact the two arguments are in tension: it’s precisely because Eliezer isn’t willing to commit himself to a mechanism for the Born probabilities in the absence of definitive evidence that he’s sticking to ‘bare’ MWI and leaving almost entirely open how these relate to the Born rule. In the one case you’d be criticizing MWI theorists for refusing to stick their neck out and make some guesses about which untested physical laws and ontologies are the real ones; in the other case you’d be criticizing MWI theorists for making guesses about which untested physical laws and ontologies are the real ones.)
I am not super interested in having Catholic theologians read about minimum descriptive complexity, and then weaving a yarn about their favorite hypotheses based on that.
Are you kidding? I would love it if theologians stopped hand-waving about how their God is ‘ineffably simple no really we promise’ and started trying to construct arguments that God (and, more importantly, the package deal ‘God + universe’) is information-theoretically simple, e.g., by trying to write a simple program that outputs Biblical morality plus the laws of physics. At best, that sort of precision would make it much clearer where the reasoning errors are; at worst, it would be entertainingly novel.
To say that MWI lacks empirical content is also to say that the negation of MWI lacks empirical content.
Yes.
So this doesn’t tell us, for example, whether to assign higher probability to MWI or to the disjunction of all non-MWI interpretations.
Right. But I think it’s a waste of energy to assign probabilities to assertions lacking empirical content, because you will not be updating anyways, and a prior without possibility of data is just a slightly mathier way to formulate “your taste.” I don’t argue about taste.
[spaceship example]
One can assume a reasonable model here (e.g. leave a copy of the spaceship in earth orbit, or have it travel in a circle in the solar system, and assume similar degradation of modules). Yes, you will have indirect evidence only applicable due to your model. But I think the model here would have teeth.
But realize that, logically, either collapse or its negation does have to be true.
Or we are thinking about the problem incorrectly and those are not exhaustive/mutually exclusive. Compare: “logically the electron must either be a particle or must not be a particle.”
it’s somewhat missing the point to attack any given interpretation for the crime of positing something extra.
I am not explicitly attacking MWI, as I think I said multiple times. I am not even attacking having interpretations or preferring one over another for reasons such as “taste” or “having an easier time thinking about QM.” I am attacking the notion that there is anything more to the preference for MWI than this.
To summarize my view: “testability” is about “empirical claims,” not about “narratives.” MWI is, by its very nature, a narrative about empirical claims. The list of empirical claims it is a narrative about can certainly differ from another list of empirical claims with another narrative. For example, we can imagine some sort of “billiard ball Universe” narrative around Newtonian physics.
But I would not say “MWI is testable relative to the Newtonian narrative”; I would say “the list of empirical claims ‘QM’ is testable relative to the list of empirical claims ‘Newtonian physics.’”
The problem with the former statement is, first, that it is a “type error,” and second, that there are infinitely many narratives around any list of empirical claims. You may prefer MWI for [reasons] over [infinitely long list of other narratives], but it seems like “argument about taste.” What’s even the point of the argument?
Let’s say I prefer some other interpretation of ‘QM’ than MWI. What does that say about me? Does it say anything bad? Am I a ‘bad rationalist?’ Do I have ‘bad taste?’ Does it matter what my favorite interpretation is? I think this lesswrongian MWI thing is belief-as-attire.
Full disclaimer: I am not a professional philosopher, and do not think about testability for a living. I reviewed a paper about testability once.
Could you restate your response to the spaceship example? This seems to me to be an entirely adequate response to:
it’s a waste of energy to assign probabilities to assertions lacking empirical content, because you will not be updating anyways, and a prior without possibility of data is just a slightly mathier way to formulate “your taste.” I don’t argue about taste.
Favoring simpler hypotheses matters, because if you’re indifferent to added complexity when it makes no difference to your observations (e.g., ‘nothing outside the observable universe exists’) you may make bad decisions that impact agents that you could never observe, but that might still live better or worse lives based on what you do.
This matters when you’re making predictions about agents far from you in space and/or time. MWI is a special case of the same general principle, so it’s a useful illustrative example even if it isn’t as important as those other belief-in-the-implied-invisible scenarios.
I am not even attacking having interpretations or preferring one over another for reasons such as “taste” or “having an easier time thinking about QM.” I am attacking the notion that there is anything more to the preference for MWI than this.
Collapse and non-collapse interpretations are empirically distinguishable from each other. I’ve been defining ‘QM’ in a way that leaves it indifferent between collapse and non-collapse—in which case you can’t say that the distinction between bare QM and MWI is just a matter of taste, because MWI adds the testable claim that collapse doesn’t occur. If you prefer to define ‘QM’ so that it explicitly rejects collapse, then yes, MWI (or some versions of MWI) is just a particular way of talking about QM, not a distinct theory. But in that case collapse interpretations of QM are incompatible with QM itself, which seems like a less fair-minded way of framing a foundations-of-physics discussion.
Collapse and non-collapse interpretations are empirically distinguishable from each other.
You are not engaging with my claim that testability is a property of empirical claims, not narratives. Not sure there is a point to continue until we resolve the disagreement about the possible category error here.
There is another weird thing where you think we test claims against other claims, but actually we test against Nature. If Nature says your claim is wrong, it’s falsified. If there is a possibility of Nature saying that, it’s falsifiable. You don’t need a pair of claims here. Testability is not a binary relation between claims. But that’s not central to the disagreement.
Why do you think collapse interpretations are ‘narratives’, and why do you think they aren’t empirical claims?
Regarding testability: if you treat testability as an intrinsic feature of hypotheses, you risk making the mistake of thinking that if there is no test that would distinguish hypothesis A from hypothesis B, then there must be no test that could distinguish hypothesis A from hypothesis C. It’s true that you can just speak of a test that’s better predicted by hypothesis ‘not-A’ than by hypothesis A, but the general lesson that testability can vary based on which possibilities you’re comparing is an important one, and directly relevant to the case we’re considering.
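Here is a minimal numerical sketch of that point (the hypotheses and likelihoods are hypothetical placeholders): a single observation can leave the odds between A and B untouched while shifting the odds between A and C.

```python
# Toy Bayesian update showing testability as relative to the comparison hypothesis.
# Priors and likelihoods are illustrative placeholders only.

priors = {"A": 1/3, "B": 1/3, "C": 1/3}
# Probability each hypothesis assigns to the observed experimental result E:
likelihood_of_E = {"A": 0.8, "B": 0.8, "C": 0.1}  # A and B predict E equally well

evidence = sum(priors[h] * likelihood_of_E[h] for h in priors)
posteriors = {h: priors[h] * likelihood_of_E[h] / evidence for h in priors}

print(posteriors["A"] / posteriors["B"])  # 1.0: E does not distinguish A from B
print(posteriors["A"] / posteriors["C"])  # ~8.0: E does distinguish A from C
```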
There are two issues, what I view as non-standard language use, and what I view as a category error.
You can use the word ‘testability’ to signify a binary relation, but that’s not what people typically mean when they use that word. They typically mean “possibility Nature can tell you that you are wrong.”
So when you responded many posts back with a claim “MWI is hard to test” you are using the word “test” in a way probably no one else in the thread is using. You are not wrong, but you will probably miscommunicate.
An empirical claim has this form: “if we do experiment A, we will get result B.” Nature will sometimes agree, and sometimes not, and give you result C instead. If you have a list of such claims, you can construct a “story” about them, like MWI, or something else. But adding the “story” is an extra step, and what Nature is responding to is not the story but the experiment.
The mapping from stories to lists of claims is always always always many to one. If you have [story1] about [list1] and [story2] about [list2], and Nature agrees with [list1], and disagrees with [list2], then you will say:
“story1 was falsified, story2 was falsifiable but not falsified.”
I will say:
“list1 was falsified, list2 was falsifiable but not falsified.”
What’s relevant here isn’t the details of story1 or story2, but what’s in the lists.
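A minimal sketch of that framing (the claim lists and story names are hypothetical stand-ins): falsification is computed from the claim list alone, and the story-to-list mapping is many to one.

```python
# Toy model of "stories" vs. "lists of empirical claims."
# Claim lists and story names are illustrative stand-ins only.

qm_claims = frozenset({
    "double-slit experiment shows interference",
    "Bell-test correlations violate the classical bound",
})
newtonian_claims = frozenset({
    "double-slit experiment shows no interference",
})

# Many-to-one: several stories wrap the same claim list.
story_to_claims = {
    "many-worlds": qm_claims,
    "spaghetti-monster interpretation": qm_claims,
    "billiard-ball universe": newtonian_claims,
}

def falsified(claim_list, claims_nature_rejects):
    # Falsification is a property of the claim list, not of the story wrapped around it.
    return bool(claim_list & claims_nature_rejects)

nature_rejects = {"double-slit experiment shows no interference"}
for story, claims in story_to_claims.items():
    print(story, falsified(claims, nature_rejects))
# -> many-worlds False; spaghetti-monster interpretation False; billiard-ball universe True
```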
When I say “MWI is untestable” what I mean is:
“There is a list of empirical claims called ‘quantum mechanics.’ There is a set of stories about this list, one of which is MWI. There is no way to tell these stories apart empirically, so you pick the one you like best for non-empirical reasons.”
When you say “MWI is testable” what I think you mean is:
“There are two lists of empirical claims, called ‘quantum mechanics’ and ‘quantum mechanics prime,’ a story ‘story 1’ about the former, and a story ‘story 2’ about the latter. Nature will agree with the list ‘quantum mechanics’ and disagree with the list ‘quantum mechanics prime.’ Therefore, ‘story 1’ is testable relative to ‘story 2.’”
That’s fine, I understand what you mean, and I think you are right, up to the last sentence. But I think the last sentence is a category error.
Because you are equating lists of claims with stories, you are carrying over the testability property of the list ‘quantum mechanics’ to your favorite story about this list, ‘MWI.’ But there is an infinite list of stories consistent with ‘quantum mechanics’. I can replace ‘MWI’ in your argument with any other consistent story, including those involving the flying spaghetti monster, etc.
Then you get unintuitive statements like ‘the flying spaghetti interpretation of quantum mechanics is testable relative to X.’ This is a sufficiently weird use of the word “testable” that I think we should not use the word “testable” in this way. And indeed, I believe the standard usage of the word “testable” is not this.
At one point I started developing a religious RPG character who applied theoretical computer science to his faith.
I forget the details, but among other things he believed that although the Bible prescribed the best way to live, the world is far too complex for any finite set of written rules to cover every situation. The same limitation applies to human reason: cognitive science and computational complexity theory have shown all the ways in which we are bounded reasoners, and can only ever hope to comprehend a small part of the whole world. Reason works best when it can be applied to constrained problems where a clear objective answer can be found, but it easily fails once the number of variables grows.
Thus, because science has shown that both the written word of the Bible and human reason are fallible and easily lead us astray (though the word of the Bible is less likely to do so), the rational course of action for one who believes in science is to pray to God for guidance and trust the Holy Spirit to lead us to the right choices.
Plus 6: There is a preferred basis.
In so far as I understand what the “preferred basis problem” is actually supposed to be, the existence of a preferred basis seems to me to be not an assumption necessary for Everettian QM to work but an empirical fact about the world; if it were false then the world would not, as it does, appear broadly classical when one doesn’t look too closely. Without a preferred basis, you could still say “the wavefunction just evolves smoothly and there is no collapse”; it would no longer be a useful approximation to describe what happens in terms of “worlds”, but for the same reason you could not e.g. adopt a “collapse” interpretation in which everything looks kinda-classical on a human scale apart from random jumps when “observations” or “measurements” happen. The world would look different in the absence of a preferred basis.
But I am not very expert on this stuff. Do you think the above is wrong, and if so how?
First, it’s important to keep in mind that if MWI is “untestable” relative to non-MWI, then non-MWI is also “untestable” relative to MWI. To use this as an argument against MWI,
I think it’s being used as an argument against beliefs paying rent.
MWI is testable insofar as QM itself is testable.
Since there is more than one interpretation of QM, empirically testing QM does not prove any one interpretation over the others.
Whatever extra arguments are used to support a particular interpretation over the others are not going to be, and have not been, empirical.
But, importantly, collapse interpretations generally are empirically distinguishable from non-collapse interpretations.
No, they are not, because of the meaning of the word “interpretation”; but collapse theories, such as GRW, might be.
This is why there’s a lot of emphasis on hard-to-test (“philosophical”) questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions—because sometimes [..] the answer matters a lot for our decision-making,
Which is one of the ways in which beliefs that don’t pay rent do pay rent.
Are you familiar with MIRI’s technical agenda? You may also want to check out the AI Impacts project, if you think we should be prioritizing forecasting work at this point rather than object-level mathematical research.
Yes I’m familiar with the technical agenda. What do you mean by “forecasting work”—AI impacts? That seems to be of near-zero utility to me.
What MIRI should be doing, and what I’ve advocated MIRI do from the start, is this: build artificial general intelligence and study it. (I can’t get a straight answer on why they are not doing this that does not in some way terminate in referencing the more speculative sections of the Sequences I take issue with.) Not a provably-safe-from-first-principles-before-we-touch-a-single-line-of-code AGI. Just a regular, run-of-the-mill AGI using any one of the architectures presently being researched in the artificial intelligence community. Build it and study it.
The closer we get to AGI, the more profitable further improvements in AI capabilities become. This means that the more we move the clock toward AGI, the more likely we are to engender an AI arms race between different nations or institutions, and the more (apparent) incentives there are to cut corners on safety and security. At the same time, AGI is an unusual technology in that it can potentially be used to autonomously improve on our AI designs—so that the more advanced and autonomous AI becomes, the likelier it is to undergo a speed-up in rates of improvement (and the likelier these improvements are to be opaque to human inspection). Both of these facts could make it difficult to put the brakes on AI progress.
Both of these facts also make it difficult to safely ‘box’ an AI. First, different groups in an arms race may simply refuse to stop reaping the economic or military/strategic benefits of employing their best AI systems. If there are many different projects that are near or at AGI-level when your own team suddenly stops deploying your AI algorithms and boxes them, it’s not clear there is any force on earth that can compel all other projects to freeze their work too, and to observe proper safety protocols. We are terrible at stopping the flow of information, and we have no effective mechanisms in place to internationally halt technological progress on a certain front. It’s possible we could get better at this over time, but the sooner we get AGI, the less intervening time we’ll have to reform our institutions and scientific protocols.
A second reason speed-ups make it difficult to safely box an AGI is that we may not arrest its self-improvement in the (narrow?) window between ‘too dumb to radically improve on our understanding of AGI’ and ‘too smart to keep in a box’. We can try to measure capability levels, but only using imperfect proxies; there is no actual way to test how hard it would be for an AGI to escape a box beyond ‘put the AGI in the box and see what happens’, which means we can’t get much of a safety assurance until after we’ve done the research you’re talking about doing on the boxed AI. If you aren’t clear on exactly how capable the AI is, or how well measures of its apparent capabilities in other domains transfer to measures of its capability at escaping boxes, there are limits to how confident you can be that the AI is incapable of finding clever methods to bridge air gaps, or of simply adjusting its software in such a way that the methods we’re using to inspect and analyze it compromise the box.
‘AGI’ is not actually a natural kind. It’s just an umbrella term for ‘any mind we could build that’s at least as powerful as a human’. Safe, highly reliable AI in particular is likely to be an extremely special and unusual subcategory. Studying a completely arbitrary AGI may tell us about as much about how to build a safe AGI as studying nautilus ecology would tell us about how to safely keep bees and farm their honey. Yes, they’re both ‘animals’, and we probably could learn a lot, but not as much as if we studied something a bit more bee-like. But in this case that presupposes that we understand AI safety well enough to build an AGI that we expect to look at least a little like our target safe AI. And our understanding just isn’t there yet.
We already have seven billion general intelligences we can study in the field, if we so please; it’s not obvious that a rushed-to-completion AGI would resemble a highly reliable safe AGI in all that much more detail than humans resemble either of those two hypothetical AGIs.
(Of course, our knowledge would obviously improve! Knowing about a nautilus and a squirrel really does tell us a lot more about beekeeping than either of those species would on its own, assuming we don’t have prior experience with any other animals. But if the nautilus is a potential global catastrophic risk, we need to weigh those gains against the risk and promise of alternative avenues of research.)
Thanks for taking the time to explain your reasoning, Mark. I’m sorry to hear you won’t be continuing the discussion group! Is anyone else here interested in leading that project, out of curiosity? I was getting a lot out of seeing people’s reactions.
I think John Maxwell’s response to your core argument is a good one. Since we’re talking about the Sequences, I’ll note that this dilemma is the topic of the Science and Rationality sequence:
This is why there’s a lot of emphasis on hard-to-test (“philosophical”) questions in the Sequences, even though people are notorious for getting those wrong more often than scientific questions—because sometimes (e.g., in the case of cryonics and existential risk) the answer matters a lot for our decision-making, long before we have a definitive scientific answer. That doesn’t mean we should despair of empirically investigating these questions, but it does mean that our decision-making needs to be high-quality even during periods where we’re still in a state of high uncertainty.
The Sequences talk about the Many Worlds Interpretation precisely because it’s an unusually-difficult-to-test topic. The idea isn’t that this is a completely typical example, or that it’s a good idea to disregard evidence when it is available; the idea, rather, is that we sometimes do need to predicate our decisions on our best guess in the absence of perfect tests.
Its placement in Rationality: From AI to Zombies immediately after the ‘zombies’ sequence (which, incidentally, is an example of how and why we should reject philosophical thought experiments, no matter how intuitively compelling they are, when they don’t accord with established scientific theories and data) is deliberate. Rather than reading either sequence as an attempt to defend a specific fleshed-out theory of consciousness or of physical law, they should primarily be read as attempts to show that extreme uncertainty about a domain doesn’t always bleed over into ‘we don’t know anything about this topic’ or ‘we can’t rule out any of the candidate solutions’.
We can effectively rule out epiphenomenalism as a candidate solution to the hard problem of consciousness even if we don’t know the answer to the hard problem (which we don’t), and we can effectively rule out ‘consciousness causes collapse’ and ‘there is no objective reality’ as candidate solutions to the measurement problem in QM even if we don’t know the answer to the measurement problem (which, again, we don’t). Just advocating ‘physicalism’ or ‘many worlds’ is a promissory note, not a solution.
In discussions of EA and x-risk, we likewise need to be able to prioritize more promising hypotheses over less promising ones long before we’ve answered all the questions we’d like answered. Even deciding what studies to fund presupposes that we’ve ‘philosophized’, in the sense of mentally aggregating, heuristically analyzing, and drawing tentative conclusions from giant complicated accumulated-over-a-lifetime data sets.
You wrote:
That’s true, and it’s one of the basic assumptions behind MIRI research: that understanding agents smarter than us isn’t obviously hopeless, because our human capacity for abstract reasoning makes it possible for us to model systems even when they’re extremely complex and dynamic. MIRI’s research is intended to make this likelier to happen.
It’s not the default that we’re always able to predict what our inventions will do before we run them to see what happens; and there are some basic limits on our ability to do so when the system we’re predicting is smarter than the predictor. But with enough intellectual progress we may become able to model abstract safety-relevant features of AGI behavior, even though we can’t predict in detail the exact decisions the AGI will make. (If we could predict the exact decisions of the AGI, we’d have to be at least as smart as the AGI.)
If it isn’t possible to learn a variety of generalizations about smarter autonomous systems, then, interestingly, that also undermines the case for intelligence explosion. Both ‘humans trying to make superintelligent AI safe’ and ‘AI undergoing a series of recursive self-improvements’ are cases where less intelligent agents are trying to reliably generate agents that meet various abstract criteria (including superior intelligence). The orthogonality thesis, likewise, simultaneously supports the claim ‘many possible AI systems won’t have humane goals’ and ‘it is possible for an AI system to have human goals’. This is why Bostrom/Yudkowsky-type arguments don’t uniformly inspire pessimism.
Are you familiar with MIRI’s technical agenda? You may also want to check out the AI Impacts project, if you think we should be prioritizing forecasting work at this point rather than object-level mathematical research.
No, MWI is not unusually difficult to test. It is untestable.
That’s not true. (Or, at best, it’s misleading for present purposes.)
First, it’s important to keep in mind that if MWI is “untestable” relative to non-MWI, then non-MWI is also “untestable” relative to MWI. To use this as an argument against MWI, you’d need to talk specifically about which hypothesis MWI is untestable relative to; and you would then need to cite some other reason to reject MWI (e.g., its complexity relative to the other hypothesis, or its failures relative to some third hypothesis that it is testable relative to).
With that in mind:
1 - MWI is testable insofar as QM itself is testable. We normally ignore this fact because we’re presupposing QM, but it’s important to keep in mind if we’re trying to make a general claim like ‘MWI is unscientific because it’s untestable and lacks evidential support’. MWI is at least as testable as QM, and has at least as much supporting evidence.
2 - What I think people really mean to say (or what a steel-manned version of them would say) is that multiverse-style interpretations of QM are untestable relative to each other. This looks likely to be true, for practical purposes, when we’re comparing non-collapse interpretations: Bohmian Mechanics doesn’t look testable relative to Many Threads, for example. (And therefore Many Threads isn’t testable relative to Bohmian Mechanics, either.)
(Of course, many of the things we call “Many Worlds” are not fully fleshed out interpretations, so it’s a bit risky to make a strong statement right now about what will turn out to be testable in the real world. But this is at least a commonly accepted bit of guesswork on the part of theoretical physicists and philosophers of physics.)
3 - But, importantly, collapse interpretations generally are empirically distinguishable from non-collapse interpretations. So even though non-collapse interpretations are generally thought to be ‘untestable’ relative to each other, they are testable relative to collapse interpretations. (And collapse interpretations as a rule are falsifiable relative to each other.)
To date, attempts to test collapse interpretations have falsified the relevant interpretations. It is not technologically possible yet to test the most popular present-day ones, but it is possible for collapse theorists to argue ‘our views should get more attention because they’re easier to empirically distinguish’, and it’s also possible for anti-collapse theorists to try to make inductive arguments from past failures to the likelihood of future failures, with varying amounts of success.
I don’t have an argument against MWI specifically, no.
No, that is not how it works: I don’t need to either accept or reject MWI. I can also treat it as a causal story lacking empirical content. Nothing wrong with such stories, they are quite helpful for understanding systems. But not a part of science.
By that logic, if I invent any crazy hypothesis in addition to an empirically testable theory, then it inherits testability just on those grounds. You can do that with the word “testabiity” if you want, but that seems to be not how people use words.
If some smart catholic says that evolution is how God unfolds creation when it comes to living systems, then any specific claims we can empirically check pertaining to evolution (including those that did not pan out, and required repairs of evolutionary theory) also somehow are relevant to the catholic’s larger hypothesis? I suppose that is literally true, but silly. There is no empirical content to what this hypothetical catholic is saying, over and above the actual empirical stuff he is latching his baggage onto. I am not super interested in having catholic theologians read about minimum descriptive complexity, and then weaving a yarn about their favorite hypotheses based on that.
I like money! I am happy to discuss bet terms on this.
Yes if you have an interpretation that gives different predictions than QM, then yes that will render that interpretation falsifiable of course (and indeed some were). That is super boring, though, and not what this argument is about. But also, I don’t see what falsifiability of X has to do with falsifiabiliity of Y, if X and Y are different. Newtonian mechanics is both falsifiable and falsified, but that has little to do with falsifiability of any story fully consistent with QM predictions.
My personal take on MWI is I want to waste as little energy as possible on it and arguments about it, and actually go read Feynman instead. (This is not a dig at you, I am just explaining where I am coming from when it comes to physics).
To say that MWI lacks empirical content is also to say that the negation of MWI lacks empirical content. So this doesn’t tell us, for example, whether to assign higher probability to MWI or to the disjunction of all non-MWI interpretations.
Suppose your ancestors sent out a spaceship eons ago, and by your calculations it recently traveled so far away that no physical process could ever cause you and the spaceship to interact again. If you then want to say that ‘the claim the spaceship still exists lacks empirical content,’ then OK. But you will also have to say ‘the claim the spaceship blipped out of existence when it traveled far enough away lacks empirical content’.
And there will still be some probability, given the evidence, that the spaceship did vs. didn’t blip out of existence; and just saying ‘it lacks empirical content!’ will not tell you whether to design future spaceships so that their life support systems keep operating past the point of no return.
There’s no ambiguity if you clarify whether you’re talking about the additional crazy hypothesis, vs. talking about the conjunction ‘additional crazy hypothesis + empirically testable theory’. Presumably you’re imagining a scenario where the conjunction taken as a whole is testable, though one of the conjuncts is not. So just say that.
Sean Carroll summarizes collapse-flavored QM as the conjunction of these five claims:
Many-worlds-flavored QM, on the other hand, is the conjunction of 1 and 2, plus the negation of 5 -- i.e., it’s an affirmation of wave functions and their dynamics (which effectively all physicists agree about), plus a rejection of the ‘collapses’ some theorists add to keep the world small and probabilistic. (If you’d like, you could supplement ‘not 5’ with ‘not Bohmian mechanics’; but for present purposes we can mostly lump Bohm in with multiverse interpretations, because Eliezer’s blog series is mostly about rejecting collapse rather than about affirming a particular non-collapse view.)
If we want ‘QM’ to be the neutral content shared by all these interpretations, then we can say that QM is simply the conjunction of 1 and 2. You are then free to say that we should assign 50% probability to claim 5, and maintain agnosticism between collapse and non-collapse views. But realize that, logically, either collapse or its negation does have to be true. You can frame denying collapse as ‘positing invisible extra worlds’, but you can equally frame denying collapse as ‘skepticism about positing invisible extra causal laws’.
Since every possible way the universe could be adds something ‘extra’ on top of what we observe—either an extra law (e.g., collapse) or extra ontology (because there are no collapses occurring to periodically annihilate the ontology entailed by the Schrodinger equation) -- it’s somewhat missing the point to attack any given interpretation for the crime of positing something extra. The more relevant question is just whether simplicity considerations or indirect evidence helps us decide which ‘something extra’ (a physical law, or more ‘stuff’, or both) is the right one. If not, then we stick with a relatively flat prior.
Claims 1 and 2 are testable, which is why we were able to acquire evidence for QM in the first place. Claim 5 is testable for pretty much any particular ‘collapse’ interpretation you have in mind; which means the negation of claim 5 is also testable. So all parts of bare-bones MWI are testable (though it may be impractical to run many of the tests), as long as we’re comparing MWI to collapse and not to Bohmian Mechanics.
(You can, of course, object that affirming 3-5 as fundamental laws has the advantage of getting us empirical adequacy. But ‘MWI (and therefore also ‘bare’ QM) isn’t empirically adequate’ is a completely different objection from ‘MWI asserts too many unobserved things’, and in fact the two arguments are in tension: it’s precisely because Eliezer isn’t willing to commit himself to a mechanism for the Born probabilities in the absence of definitive evidence that he’s sticking to ‘bare’ MWI and leaving almost entirely open how these relate to the Born rule. In the one case you’d be criticizing MWI theorists for refusing to stick their neck out and make some guesses about which untested physical laws and ontologies are the real ones; in the other case you’d be criticizing MWI theorists for making guesses about which untested physical laws and ontologies are the real ones.)
Are you kidding? I would love it if theologians stopped hand-waving about how their God is ‘ineffably simple no really we promise’ and started trying to construct arguments that God (and, more importantly, the package deal ‘God + universe’) is information-theoretically simple, e.g., by trying to write a simple program that outputs Biblical morality plus the laws of physics. At best, that sort of precision would make it much clearer where the reasoning errors are; at worst, it would be entertainingly novel.
Yes.
Right. But I think it’s a waste of energy to assign probabilities to assertions lacking empirical content, because you will not be updating anyways, and a prior without possibility of data is just a slightly mathier way to formulate “your taste.” I don’t argue about taste.
One can assume a reasonable model here (e.g. leave a copy of the spaceship in earth orbit, or have it travel in a circle in the solar system, and assume similar degradation of modules). Yes, you will have indirect evidence only applicable due to your model. But I think the model here would have teeth.
Or we are thinking about the problem incorrectly and those are not exhaustive/mutually exclusive. Compare: “logically the electron must either be a particle or must not be a particle.”
I am not explicitly attacking MWI, as I think I said multiple times. I am not even attacking having interpretations or preferring one over another for reasons such as “taste” or “having an easier time thinking about QM.” I am attacking the notion that there is anything more to the preference for MWI than this.
To summarize my view: “testability” is about “empirical claims,” not about “narratives.” MWI is, by its very nature, a narrative about empirical claims. The list of empirical claims it is a narrative about can certainly differ from another list of empirical claims with another narrative. For example, we can imagine some sort of “billiard ball Universe” narrative around Newtonian physics.
But I would not say “MWI is testable relative to the Newtonian narrative”, I would say “the list of empirical claims ‘QM’ is testable relative to the list of empirical claims ’Newtonian physics.”
The problem with the former statement is first it is a “type error,” and second, there are infinitely many narratives around any list of empirical claims. You may prefer MWI for [reasons] over [infinitely long list of other narratives], but it seems like “argument about taste.” What’s even the point of the argument?
Let’s say I prefer some other interpretation of ‘QM’ than MWI. What does that say about me? Does it say anything bad? Am I a ‘bad rationalist?’ Do I have ‘bad taste?’ Does it matter what my favorite interpretation is? I think this lesswrongian MWI thing is belief-as-attire.
Full disclaimer: I am not a professional philosopher, and do not think about testability for a living. I reviewed a paper about testability once.
Could you restate your response to the spaceship example? This seems to me to be an entirely adequate response to
Favoring simpler hypotheses matters, because if you’re indifferent to added complexity when it makes no difference to your observations (e.g., ‘nothing outside the observable universe exists’) you may make bad decisions that impact agents that you could never observe, but that might still live better or worse lives based on what you do.
This matters when you’re making predictions about agents far from you in space and/or time. MWI is a special case of the same general principle, so it’s a useful illustrative example even if it isn’t as important as those other belief-in-the-implied-invisible scenarios.
Collapse and non-collapse interpretations are empirically distinguishable from each other. I’ve been defining ‘QM’ in a way that leaves it indifferent between collapse and non-collapse—in which case you can’t say that the distinction between bare QM and MWI is just a matter of taste, because MWI adds the testable claim that collapse doesn’t occur. If you prefer to define ‘QM’ so that it explicitly rejects collapse, then yes, MWI (or some versions of MWI) is just a particular way of talking about QM, not a distinct theory. But in that case collapse interpretations of QM are incompatible with QM itself, which seems like a less fair-minded way of framing a foundations-of-physics discussion.
You are not engaging with my claim that testability is a property of empirical claims, not narratives. Not sure there is a point to continue until we resolve the disagreement about the possible category error here.
There is another weird thing where you think we test claims against other claims, but actually we test against Nature. If Nature says your claim is wrong, it’s falsified. If there is a possibility of Nature saying that, it’s falsifiable. You don’t need a pair of claims here. Testability is not a binary relation between claims. But that’s not central to the disagreement.
Why do you think collapse interpretations are ‘narratives’, and why do you think they aren’t empirical claims?
Regarding testability: if you treat testability as an intrinsic feature of hypotheses, you risk making the mistake of thinking that if there is no test that would distinguish hypothesis A from hypothesis B, then there must be no test that could distinguish hypothesis A from hypothesis C. It’s true that you can just speak of a test that’s better predicted by hypothesis ‘not-A’ than by hypothesis A, but the general lesson that testability can vary based on which possibilities you’re comparing is an important one, and directly relevant to the case we’re considering.
There are two issues, what I view as non-standard language use, and what I view as a category error.
You can use the word ‘testability’ to signify a binary relation, but that’s not what people typically mean when they use that word. They typically mean “possibility Nature can tell you that you are wrong.”
So when you responded many posts back with a claim “MWI is hard to test” you are using the word “test” in a way probably no one else in the thread is using. You are not wrong, but you will probably miscommunicate.
An empirical claim has this form: “if we do experiment A, we will get result B.” Nature will sometimes agree, and sometimes not, and give you result C instead. If you have a list of such claims, you can construct a “story” about them, like MWI, or something else. But adding the “story” is an extra step, and what Nature is responding to is not the story but the experiment.
The mapping from stories to lists of claims is always always always many to one. If you have [story1] about [list1] and [story2] about [list2], and Nature agrees with [list1], and disagrees with [list2], then you will say:
“story1 was falsified, story2 was falsifiable but not falsified.”
I will say:
“list1 was falsified, list2 was falsifiable but not falsified.”
What’s relevant here isn’t the details of story1 or story2, but what’s in the lists.
When I say “MWI is untestable” what I mean is:
“There is a list of empirical claims called ‘quantum mechanics.’ There is a set of stories about this list, one of which is MWI. There is no way to tell these stories apart empirically, so you pick the one you like best for non-empirical reasons.”
When you say “MWI is testable” what I think you mean is:
“There are two lists of empirical claims, called ‘quantum mechanics’ and ‘quantum mechanics prime,’ a story ‘story 1’ about the former, and a story ‘story 2’ about the latter. Nature will agree with the list ‘quantum mechanics’ and disagree with the list ‘quantum mechanics prime.’ Therefore, ‘story 1’ is testable relative to ‘story 2.’”
That’s fine, I understand what you mean, and I think you are right, up to the last sentence. But I think the last sentence is a category error.
Because you are equating lists of claims with stories, you are carrying over the testability property of the list ‘quantum mechanics’ to your favorite story about this list, ‘MWI.’ But there are infinitely many stories consistent with ‘quantum mechanics’. I can replace ‘MWI’ in your argument with any other consistent story, including ones involving the Flying Spaghetti Monster, etc.
Then you get unintuitive statements like ‘the flying spaghetti interpretation of quantum mechanics is testable relative to X.’ That is a sufficiently weird use of the word “testable” that I don’t think we should use it this way, and indeed I don’t believe it is the standard usage.
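If it helps, here is a toy formalization of the list/story picture (my own construction, not something anyone in the thread has committed to): falsification is computed from a list of experiment-to-prediction claims, and many stories can share the same list, so whatever testability the list has does not single out any one story.

```python
# Toy model (illustrative names only): a 'list' maps experiments to predicted
# results; a 'story' is just a label attached to a list. Falsification is a
# property of the list, judged against Nature's actual results.

def falsified(claims, nature):
    """A list of claims is falsified if Nature's result for any experiment
    it predicts differs from the predicted result."""
    return any(exp in nature and nature[exp] != result
               for exp, result in claims.items())

qm       = {"double_slit": "interference", "bell_test": "violates_inequality"}
qm_prime = {"double_slit": "interference", "bell_test": "satisfies_inequality"}
nature   = {"double_slit": "interference", "bell_test": "violates_inequality"}

# Many stories, one list: every story attached to `qm` shares its fate.
stories = {"MWI": qm, "collapse": qm, "spaghetti_monster": qm, "story_2": qm_prime}

for name, claims in stories.items():
    print(name, "falsified" if falsified(claims, nature) else "not falsified")
# Only 'story_2' (attached to qm_prime) comes out falsified; the stories
# attached to `qm` all come out the same, and nothing here tells them apart.
```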
At one point I started developing a religious RPG character who applied theoretical computer science to his faith.
I forget the details, but among other things he believed that although the Bible prescribed the best way to live, the world is far too complex for any finite set of written rules to cover every situation. The same limitation applies to human reason: cognitive science and computational complexity theory have shown all the ways in which we are bounded reasoners, and can only ever hope to comprehend a small part of the whole world. Reason works best when it can be applied to constrained problems where a clear objective answer can be found, but it easily fails once the number of variables grows.
Thus, because science has shown that both the written word of the Bible and human reason are fallible and easily lead us astray (though the word of the Bible is less likely to do so), the rational course of action for one who believes in science is to pray to God for guidance and trust the Holy Spirit to lead us to the right choices.
Plus 6: There is a preferred basis.
In so far as I understand what the “preferred basis problem” is actually supposed to be, the existence of a preferred basis seems to me to be not an assumption necessary for Everettian QM to work but an empirical fact about the world; if it were false then the world would not, as it does, appear broadly classical when one doesn’t look too closely. Without a preferred basis, you could still say “the wavefunction just evolves smoothly and there is no collapse”; it would no longer be a useful approximation to describe what happens in terms of “worlds”, but for the same reason you could not e.g. adopt a “collapse” interpretation in which everything looks kinda-classical on a human scale apart from random jumps when “observations” or “measurements” happen. The world would look different in the absence of a preferred basis.
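(To make the point slightly more concrete, here is the standard decoherence sketch, added purely as illustration; nothing above depends on its details. If a system becomes correlated with its environment,

$$|\psi\rangle = a\,|0\rangle|E_0\rangle + b\,|1\rangle|E_1\rangle,$$

then the system’s reduced density matrix is

$$\rho_S = \mathrm{Tr}_E\,|\psi\rangle\langle\psi| = |a|^2\,|0\rangle\langle 0| + |b|^2\,|1\rangle\langle 1| + a\,b^{*}\langle E_1|E_0\rangle\,|0\rangle\langle 1| + a^{*}b\,\langle E_0|E_1\rangle\,|1\rangle\langle 0|.$$

When the environment states are nearly orthogonal, $\langle E_0|E_1\rangle \approx 0$, the off-diagonal terms are suppressed in the $\{|0\rangle, |1\rangle\}$ basis but not in, say, the $\{|+\rangle, |-\rangle\}$ basis; that dynamically singled-out ‘pointer’ basis is what lets the world look broadly classical at everyday scales, and its existence is the kind of empirical fact referred to above.)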
But I am not very expert on this stuff. Do you think the above is wrong, and if so how?
I think it’s being used as an argument against beliefs paying rent.
Since there is more than one interpretation of QM, empirically testing QM does not prove any one interpretation over the others. Whatever extra arguments are used to support a particular interpretation over the others are not going to be, and have not been, empirical.
No, they are not, because of the meaning of the word “interpretation”; but collapse theories, such as GRW, might be.
Which is one of the ways in which beliefs that don’t pay rent do pay rent.
Yes, I’m familiar with the technical agenda. What do you mean by “forecasting work”? AI Impacts? That seems to be of near-zero utility to me.
What MIRI should be doing, what I’ve advocated MIRI do from the start, and what I can’t get a straight answer about (one that doesn’t in some way terminate in referencing the more speculative sections of the Sequences I take issue with) as to why they are not doing it, is this: build artificial general intelligence and study it. Not a provably-safe-from-first-principles-before-we-touch-a-single-line-of-code AGI. Just a regular, run-of-the-mill AGI using any one of the architectures presently being researched in the artificial intelligence community. Build it and study it.
A few quick concerns:
The closer we get to AGI, the more profitable further improvements in AI capabilities become. This means that the more we move the clock toward AGI, the more likely we are to engender an AI arms race between different nations or institutions, and the more (apparent) incentives there are to cut corners on safety and security. At the same time, AGI is an unusual technology in that it can potentially be used to autonomously improve on our AI designs—so that the more advanced and autonomous AI becomes, the likelier it is to undergo a speed-up in rates of improvement (and the likelier these improvements are to be opaque to human inspection). Both of these facts could make it difficult to put the brakes on AI progress.
Both of these facts also make it difficult to safely ‘box’ an AI. First, different groups in an arms race may simply refuse to stop reaping the economic or military/strategic benefits of employing their best AI systems. If there are many different projects that are near or at AGI-level when your own team suddenly stops deploying your AI algorithms and boxes them, it’s not clear there is any force on earth that can compel all other projects to freeze their work too, and to observe proper safety protocols. We are terrible at stopping the flow of information, and we have no effective mechanisms in place to internationally halt technological progress on a certain front. It’s possible we could get better at this over time, but the sooner we get AGI, the less intervening time we’ll have to reform our institutions and scientific protocols.
A second reason speed-ups make it difficult to safely box an AGI is that we may not arrest its self-improvement in the (narrow?) window between ‘too dumb to radically improve on our understanding of AGI’ and ‘too smart to keep in a box’. We can try to measure capability levels, but only using imperfect proxies; there is no actual way to test how hard it would be for an AGI to escape a box beyond ‘put the AGI in the box and see what happens’. Which means we can’t get much of a safety assurance until after we’ve done the research you’re talking about us doing on the boxed AI. If you aren’t clear on exactly how capable the AI is, or how well measures of its apparent capabilities in other domains transfer to measures of its capability at escaping boxes, there are limits to how confident you can be that the AI is incapable of finding clever methods to bridge air gaps, or of simply adjusting its software in such a way that the very methods we’re using to inspect and analyze the AI compromise the box.
‘AGI’ is not actually a natural kind. It’s just an umbrella term for ‘any mind we could build that’s at least as powerful as a human’. Safe, highly reliable AI in particular is likely to be an extremely special and unusual subcategory. Studying a completely arbitrary AGI may tell us about as much about how to build a safe AGI as studying nautilus ecology would tell us about how to safely keep bees and farm their honey. Yes, they’re both ‘animals’, and we probably could learn a lot, but not as much as if we studied something a bit more bee-like. But in this case that presupposes that we understand AI safety well enough to build an AGI that we expect to look at least a little like our target safe AI. And our understanding just isn’t there yet.
We already have seven billion general intelligences we can study in the field, if we so please; it’s not obvious that a rushed-to-completion AGI would resemble a highly reliable safe AGI in all that much more detail than humans resemble either of those two hypothetical AGIs.
(Of course, our knowledge would obviously improve! Knowing about a nautilus and a squirrel really does tell us a lot more about beekeeping than either of those species would on its own, assuming we don’t have prior experience with any other animals. But if the nautilus is a potential global catastrophic risk, we need to weigh those gains against the risk and promise of alternative avenues of research.)
Was any of that unclear?