Thanks for engaging with my post. From my perspective you seem simply very optimistic on what kind of data can be extracted from unspecific measurements. Here is another good example of Eliezer making some pretty out-there claims about what might be possible to infer from very little data: https://www.lesswrong.com/posts/ALsuxpdqeTXwgEJeZ/could-a-superintelligence-deduce-general-relativity-from-a. I wonder what your intuition says about this?

But maybe your intuitions are wrong (or maybe both). I think a desirable property of plans/strategies for alignment would be robustness to either of us being wrong about this 🙂
Generally it is a good idea to be robust with plans. However, in this specific instance, the way Eliezer phrases it, any iterative plan for alignment would be excluded. Since I also believe that this is the only realistic plan (there will simply never be a design that has the properties that Eliezer thinks guarantee alignment), the only realistic remaining path would be a permanent freeze (which I actually believe comes with large risks as well: unenforceability and thus worse actors making ASI first, biotech in the wrong hands becoming a larger threat to humanity, etc.).
What I would agree to is that it is good to plan for the eventuality that a lot less data could be needed by an ASI to do something like 'create nanobots'. For example, we could conclude that it's for now simply a bad idea if AI is used in biotech labs, because these are the places where it could easily gather a lot of data and maybe even influence experiments so that they let it learn the things it needs to create nanobots. Similarly, we could try to create worldwide warning systems around technologies that seem likely to be necessary for an AI takeover, and watch these closely, so that we would notice any specific experiments. However, there is no way to scale this to a one-shot scenario.
Eliezer's scenario does assume the involvement of human labs (he describes a scenario where DNA is ordered online).
His claim is that an ASI will order some DNA and get some scientists in a lab to mix it together with some substances and create nanobots. That is what I describe as a one-shot scenario. Even if it were 10,000 shots in parallel I simply don't think it is possible, because I don't think the data itself is out there. Just as you need accelerators to work out how physical laws behave in high-energy regimes (random noise from other measurements tells you nothing about them), if you are planning to design a completely new type of molecular machinery you will need to do measurements on those specific molecules. So there will need to be a feedback loop, where the AI can learn detailed outcomes from experiments to gain more data.
I agree with you here (although I would hope that much of this iteration can be done in quick succession, and hopefully in a low-risk way) 🙂
It's great that we agree on this :) And I do agree on finding ways to make this lower risk, and I think taking into account few-shot learning scenarios in biotech would be a good idea. And don't get me wrong: there may be biotech scenarios available today that, with very few shots, kill a lot of humans, probably even without any ASI (humans can do it). I just think if an AI executed it today it would have no way of surviving and expanding.
Engineers, however, can constrain and master this sort of unpredictability. A pipe carrying turbulent water is unpredictable inside (despite being like a shielded box), yet can deliver water reliably through a faucet downstream. The details of this turbulent flow are beyond prediction, yet everything about the flow is bounded in magnitude, and in a robust engineering design the unpredictable details won't matter.
This is absolutely what engineers do. But finding the right design patterns that do this involves a lot of experimentation (not for a pipe, but for constructing e.g. a reliable transistor). If someone eventually constructs non-biological self-replicating nanobots, it will probably involve high-reliability design patterns around certain molecular machinery. However, finding the right molecules that reliably do what you want, as well as how to put them together, etc., is a lot of research that I am pretty certain will involve actually producing those molecules and doing experiments with them.
That protein folding is 'solved' does not disprove this IMO. Biological molecules are, after all, made from simple building blocks (amino acids) with some very predictable properties (how they stick together) so it's already vastly simplified the problem. And solving protein folding (as far as I know) does not solve the question of understanding what molecules actually do; I believe understanding protein function is still vastly less developed (correct me if I'm wrong here, I haven't followed it in detail).
Likewise :)

Also, sorry about the length of this reply. As the adage goes: 'If I had more time, I would have written a shorter letter.'
From my perspective you seem simply very optimistic on what kind of data can be extracted from unspecific measurements.
That seems to be one of the relevant differences between us, although I don't think it is the only difference that causes us to see things differently.
Other differences (I guess some of these overlap):
It seems I have wider error-bars than you on the question we are discussing now. You seem more comfortable treating the availability heuristic (whether you can think of approaches by which something could be done) as conclusive evidence.
Compared to me, it seems that you see experimentation as more inseparably linked with needing to build extensive infrastructure / having access to labs, and spending lots of serial time (with much back-and-forth).
You seem more pessimistic about the impressiveness/reliability of engineering that can be achieved by a superintelligence that lacks knowledge/data about lots of stuff.
The probability of having a single plan work and the probability of having one of several plans (carried out in parallel) work seem more closely linked in your mind than in mine.
You seem more dismissive than me of the possibility that conclusions could be reached from first-principles thinking (about how universes might work).
I seem to be more optimistic about approaches to thinking that are akin to (a more efficient version of) 'think of lots of ways the universe might work, do Monte Carlo simulations of how those conjectures would affect the probability of lots of aspects of lots of different observations, and take notice if some theories about the universe seem unusually consistent with the data we see' (a toy sketch of this kind of loop follows after this list).
I wonder if you maybe think of computability in a different way from me. Like, you may think that it's computationally intractable to predict the properties of complex molecules based on knowledge of the standard model / quantum physics. And my perspective would be that this is extremely contingent on the molecule, what the AI needs to know about it, etc., and that an AGI, unlike us, isn't forced to approach this sort of thing in an extremely crude manner.
The AI only needs to find one approach that works (from an extremely vast space of possible designs/approaches). I suspect you of having fewer qualms about playing fast and loose with the distinction between 'an AI will often/mostly be prevented from doing x due to y' and 'an AI will always be prevented from doing x due to y'.
It's unclear if you share my perspective that it's an extremely important factor that an AGI could be much better than us at reasoning with a low error-rate (in terms of logical flaws in reasoning-steps, etc.).
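To make the bullet about 'Monte Carlo simulations over conjectures' concrete, here is a minimal toy sketch in Python. Everything in it (the two candidate laws, the noise levels, the tolerance) is a made-up stand-in of my own, not anything from the discussion; it only illustrates the shape of the loop: simulate observations under each conjecture, and notice which conjecture makes the actual data unsurprising.

```python
import random

rng = random.Random(0)

# Candidate "laws" for how far an object has fallen after t seconds.
# These are made-up stand-ins for "ways the universe might work".
def constant_speed(t):   # wrong law
    return 4.9 * t

def constant_accel(t):   # right law: d = g * t^2 / 2
    return 0.5 * 9.8 * t ** 2

CANDIDATES = {"constant speed": constant_speed,
              "constant acceleration": constant_accel}

# "Observed" frames: (time, measured distance) pairs with sensor noise.
observed = [(t, constant_accel(t) + rng.gauss(0, 0.02))
            for t in (0.1, 0.2, 0.3)]

def monte_carlo_score(law, n_sims=5000, noise=0.02, tol=0.05):
    """Approximate-Bayesian-computation style score: the fraction of
    simulated noisy datasets under `law` that land close to the data."""
    hits = 0
    for _ in range(n_sims):
        sim = [law(t) + rng.gauss(0, noise) for t, _ in observed]
        if all(abs(s - d) < tol for s, (_, d) in zip(sim, observed)):
            hits += 1
    return hits / n_sims

for name, law in CANDIDATES.items():
    print(f"{name:22s} score ~ {monte_carlo_score(law):.3f}")
```

The constant-acceleration conjecture scores far above zero while the constant-speed one scores essentially zero; the point is only that 'unusually consistent with the data' can be operationalized as a simulation loop, not that a superintelligence would do anything this crude.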
From my perspective, I don't see how your reasoning is qualitatively distinct from saying in the 1500s: 'We will for sure never be able to know what the sun is made out of, since we won't be able to travel there and take samples.'
Even if we didn't have e.g. the standard model, my perspective would still be roughly what it is (with some adjustments to credences, but not qualitatively so). So to me, us having the standard model is 'icing on the cake'.
Eliezer says 'A Bayesian superintelligence, hooked up to a webcam, would invent General Relativity as a hypothesis (...)'. I might add more qualifiers (replacing 'would' with 'might', etc.). I think I have wider error-bars than Eliezer, but similar intuitions when it comes to this kind of thing.
Speaking of intuitions, one question that maybe gets at deeper intuitions is 'could AGIs find out how to play theoretically perfect chess / solve the game of chess?'. At 5:1 odds, this is a claim that I myself would bet neither for nor against (I wouldn't bet large sums at 1:1 odds either). Meanwhile, I think people of a certain mindset will think 'that is computationally intractable [when using the crude methods I have in mind]', and leave it at that.
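For calibration on why the brute-force framing feels hopeless, the commonly cited order-of-magnitude figures (Shannon's game-tree estimate, Tromp's count of legal positions) can be jotted down; the numbers below are those standard estimates, nothing more precise:

```python
import math

avg_branching = 35        # typical number of legal moves per position
avg_game_plies = 80       # typical game length in half-moves
game_tree = avg_branching ** avg_game_plies      # Shannon-style estimate

legal_positions = 4.8e44  # Tromp's estimate of legal chess positions
atoms_in_universe = 1e80  # common order-of-magnitude estimate

print(f"game tree       ~ 10^{math.log10(game_tree):.0f}")   # ~10^124
print(f"legal positions ~ 10^{math.log10(legal_positions):.0f}")
print(f"atoms, universe ~ 10^{math.log10(atoms_in_universe):.0f}")
```

On the other hand, checkers (roughly 5×10^20 positions) was weakly solved in 2007 by pruning away almost all of that space, which is exactly the gap between 'crude methods' and clever ones that the disagreement seems to hinge on.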
As to my credences that a superintelligence could 'oneshot' nanobots[1] (without being able to design and run experiments prior to designing this plan): I would bet neither 'yes' nor 'no' on that at 1:1 odds (but if I had to bet, I would bet 'yes').
Upon seeing three frames of a falling apple and with no other information, a superintelligence would assign a high probability to Newtonian mechanics, including Newtonian gravity. [from the post you reference]
But it would have other information. Insofar as it can reason about the reasoning-process that it itself consists of, that's a source of information (some ways by which the universe could work would be more/less likely to produce that very reasoning-process). And among the ways that reality might work, which the AI might hypothesize about (in the absence of data), some will be more likely than others in a 'Kolmogorov complexity' sort of way.

How far (or short) a superintelligence could get with this sort of reasoning, I dunno.
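For what it's worth, the 'Kolmogorov complexity sort of way' has a standard formalization (Solomonoff-style scoring), which I'd write roughly as:

$$P(H \mid D) \;\propto\; \underbrace{2^{-K(H)}}_{\text{simplicity prior}} \cdot \underbrace{P(D \mid H)}_{\text{fit to the data}}$$

where K(H) is the length of the shortest program implementing hypothesis H. K itself is uncomputable, so any real reasoner would have to substitute some computable proxy (e.g. compressed description length); that caveat is part of why I keep my error-bars wide here.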
Here is an excerpt from a TED talk by Stephen Wolfram (the creator of Wolfram Alpha) that feels a bit relevant (I find the sort of methodology that he outlines deeply intuitive):
"Well, so, that leads to kind of an ultimate question: Could it be that someplace out there in the computational universe we might find our physical universe? Perhaps there's even some quite simple rule, some simple program for our universe. Well, the history of physics would have us believe that the rule for the universe must be pretty complicated. But in the computational universe, we've now seen how rules that are incredibly simple can produce incredibly rich and complex behavior. So could that be what's going on with our whole universe? If the rules for the universe are simple, it's kind of inevitable that they have to be very abstract and very low level; operating, for example, far below the level of space or time, which makes it hard to represent things. But in at least a large class of cases, one can think of the universe as being like some kind of network, which, when it gets big enough, behaves like continuous space in much the same way as having lots of molecules can behave like a continuous fluid. Well, then the universe has to evolve by applying little rules that progressively update this network. And each possible rule, in a sense, corresponds to a candidate universe.

Actually, I haven't shown these before, but here are a few of the candidate universes that I've looked at. Some of these are hopeless universes, completely sterile, with other kinds of pathologies like no notion of space, no notion of time, no matter, other problems like that. But the exciting thing that I've found in the last few years is that you actually don't have to go very far in the computational universe before you start finding candidate universes that aren't obviously not our universe. Here's the problem: Any serious candidate for our universe is inevitably full of computational irreducibility. Which means that it is irreducibly difficult to find out how it will really behave, and whether it matches our physical universe. A few years ago, I was pretty excited to discover that there are candidate universes with incredibly simple rules that successfully reproduce special relativity, and even general relativity and gravitation, and at least give hints of quantum mechanics."
invent General Relativity as a hypothesis [from the post you reference]
As I understand it, the original experiment humans did to test for general relativity (not to figure out that general relativity probably was correct, mind you, but to test it 'officially') was to measure gravitational redshift.
And I guess redshift is an example of something that will affect many photos. And a superintelligent mind might be able to use such data better than us (we, having 'pathetic' mental abilities, will have a much greater need to construct experiments where we only test one hypothesis at a time, and to gather the Bayesian evidence we need relating to that hypothesis from one or a few experiments).

It seems that any photo that contains lighting stemming from the sun (even if the picture itself doesn't include the sun) can be a source of Bayesian evidence relating to general relativity.

It seems that GPS must account for gravitational time dilation (the same physics as gravitational redshift) in its timing system. This could maybe mean that some internet logs (where info can be surmised about how long it takes to send messages via satellite) could be another potential source of Bayesian evidence.
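The GPS point can be made quantitative with textbook numbers: the net relativistic offset for a GPS satellite clock is about +38 microseconds per day (a gravitational speed-up minus a smaller special-relativistic slow-down), which is an enormous, systematic signal by timing-log standards. A quick check of that figure (ignoring the rotation of the ground observer, which is a standard simplification):

```python
# Relativistic clock offsets for a GPS satellite (standard textbook numbers).
GM = 3.986004418e14   # Earth's gravitational parameter, m^3/s^2
c  = 299_792_458.0    # speed of light, m/s
r_earth = 6.371e6     # mean Earth radius, m
r_orbit = 2.6561e7    # GPS orbital radius (~20,200 km altitude), m
v = (GM / r_orbit) ** 0.5   # circular orbital speed, ~3.87 km/s

grav = GM / c**2 * (1 / r_earth - 1 / r_orbit)  # clock runs fast (weaker gravity)
vel  = -v**2 / (2 * c**2)                       # clock runs slow (orbital speed)
net_per_day = (grav + vel) * 86_400

print(f"orbital speed:   {v:.0f} m/s")
print(f"net clock drift: {net_per_day * 1e6:.1f} microseconds/day")  # ~ +38
```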
I don't know exactly what and how much data a superintelligence would need to surmise general relativity (if any!). How much/little evidence it could gather from a single picture of an apple, I dunno.
There is just absolutely no reason to consider general relativity at all when simpler versions of physics explain absolutely all observations you have ever encountered (which in this case is 2 frames). [from the post you reference]
I disagree with this.
First off, it makes sense to consider theories that explain more observations than just the ones you've encountered.
Secondly, simpler versions of physics do not explain your observations when you see 2 webcam-frames of a falling apple. In particular, the colors you see will be affected by non-Newtonian physics.
The existence of apples and digital cameras also bears on which theories of physics are likely/plausible. The same goes for the resolution of the video, etc.
However, there is no way to scale this to a one-shot scenario.
You say that so definitively. Almost as if you aren't really imagining an entity that is orders of magnitude more capable/intelligent than humans. Or as if you have ruled out large swathes of the possibility-space that I would not rule out.
I just think if an AI executed it today it would have no way of surviving and expanding.
If an AGI is superintelligent and malicious, then surviving/expanding (if it gets onto the internet) seems quite clearly feasible to me.
We even have a hard time getting coronaviruses back in the box! That's a fairly different sort of thing, but it does show how feeble we are. Another example is illegal images/videos, etc. (where the people sharing those are humans).
An AGI could plant itself onto lots of different computers, and there are lots of different humans it could try to manipulate (a low success rate would not necessarily be prohibitive). Many humans fall for pretty simple scams, and AGIs would be able to pull off much more impressive scams.
This is absolutely what engineers do. But finding the right design patterns that do this involves a lot of experimentation (not for a pipe, but for constructing e.g. a reliable transistor).
Here you speak about how humans work, and in such an absolutist way. Being feeble and error-prone reasoners, it makes sense that we need to rely heavily on experiments (and have a hard time making effective use of data not directly related to the thing we're interested in).
That protein folding is 'solved' does not disprove this IMO.
I think protein folding being 'solved' exemplifies my perspective, but I agree that it doesn't 'prove' or 'disprove' that much.
Biological molecules are, after all, made from simple building blocks (amino acids) with some very predictable properties (how they stick together) so it's already vastly simplified the problem.
When it comes to predictable properties, I think there are other molecules where this is more the case than for biological ones (DNA-stuff needs to be 'messy' in order for the mutations that make evolution work to occur). I'm no chemist, but this is my rough impression.
are, after all, made from simple building blocks (amino acids) with some very predictable properties (how they stick together)
Ok, so you acknowledge that there are molecules with very predictable properties.
It's OK for much/most stuff not to be predictable to an AGI, as long as the subset of stuff that can be predicted is sufficient for the AGI to make powerful plans/designs.
finding the right molecules that reliably do what you want, as well as how to put them together, etc., is a lot of research that I am pretty certain will involve actually producing those molecules and doing experiments with them.
Even IF that is the case (an assumption that I don't share but also don't rule out), design-plans may be made to have experimentation built into them. It wouldn't necessarily need to be like this:
experiments being run
data being sent to the AI so that it can reason about it
then having the AI think a bit and construct new experiments
more experiments being run
data being sent to the AI so that it can reason about it
etc
I could give specific examples of ways to avoid having to do it that way, but any example I gave would be impoverished, and understate the true space of possible approaches.
His claim is that an ASI will order some DNA and get some scientists in a lab to mix it together with some substances and create nanobots.
I read the scenario he described as:
involving DNA being ordered from a lab
having some gullible person elsewhere carry out instructions, where the DNA is involved somehow
being meant as one example of a type of thing that was possible (but not ruling out that there could be other ways for a malicious AGI to go about it)
I interpreted him as pointing to a larger possibility-space than the one you present. I don't think the more specific scenario you describe would appear prominently in his mind, and it doesn't in mine either (you talk about getting 'some scientists in a lab to mix it together', while I don't think this would need to happen in a lab).
Here is an excerpt from here (written in 2008), with bolding of text done by me:
"1. Crack the protein folding problem, to the extent of being able to generate DNA strings whose folded peptide sequences fill specific functional roles in a complex chemical interaction.
2. Email sets of DNA strings to one or more online laboratories which offer DNA synthesis, peptide sequencing, and FedEx delivery. (Many labs currently offer this service, and some boast of 72-hour turnaround times.)
3. Find at least one human connected to the Internet who can be paid, blackmailed, or fooled by the right background story, into receiving FedExed vials and mixing them in a specified environment.
4. The synthesized proteins form a very primitive 'wet' nanosystem which, ribosomelike, is capable of accepting external instructions; perhaps patterned acoustic vibrations delivered by a speaker attached to the beaker.
5. Use the extremely primitive nanosystem to build more sophisticated systems, which construct still more sophisticated systems, bootstrapping to molecular nanotechnology, or beyond."
Btw, here are excerpts from a TED talk by Dan Gibson from 2018:
"Naturally, with this in mind, we started to build a biological teleporter. We call it the DBC. That's short for digital-to-biological converter. Unlike the BioXp, which starts from pre-manufactured short pieces of DNA, the DBC starts from digitized DNA code and converts that DNA code into biological entities, such as DNA, RNA, proteins or even viruses. You can think of the BioXp as a DVD player, requiring a physical DVD to be inserted, whereas the DBC is Netflix. To build the DBC, my team of scientists worked with software and instrumentation engineers to collapse multiple laboratory workflows, all in a single box. This included software algorithms to predict what DNA to build, chemistry to link the G, A, T and C building blocks of DNA into short pieces, Gibson Assembly to stitch together those short pieces into much longer ones, and biology to convert the DNA into other biological entities, such as proteins.

This is the prototype. Although it wasn't pretty, it was effective. It made therapeutic drugs and vaccines. And laboratory workflows that once took weeks or months could now be carried out in just one to two days. And that's all without any human intervention and simply activated by the receipt of an email which could be sent from anywhere in the world. We like to compare the DBC to fax machines.

(...)

Here's what our DBC looks like today. We imagine the DBC evolving in similar ways as fax machines have. We're working to reduce the size of the instrument, and we're working to make the underlying technology more reliable, cheaper, faster and more accurate.

(...)

The DBC will be useful for the distributed manufacturing of medicine starting from DNA. Every hospital in the world could use a DBC for printing personalized medicines for a patient at their bedside. I can even imagine a day when it's routine for people to have a DBC to connect to their home computer or smart phone as a means to download their prescriptions, such as insulin or antibody therapies. The DBC will also be valuable when placed in strategic areas around the world, for rapid response to disease outbreaks. For example, the CDC in Atlanta, Georgia could send flu vaccine instructions to a DBC on the other side of the world, where the flu vaccine is manufactured right on the front lines."
I believe understanding protein function is still vastly less developed (correct me if I'm wrong here, I haven't followed it in detail).
I'm no expert on this, but what you say here seems in line with my own vague impression of things. As you maybe noticed, I also put 'solved' in quotation marks.
However, in this specific instance, the way Eliezer phrases it, any iterative plan for alignment would be excluded.
As touched upon earlier, I am myself optimistic when it comes to iterative plans for alignment. But I would prefer such iteration to be done with caution that errs on the side of paranoia (rather than being 'not paranoid enough').
It would be OK if (many of) the people doing this iteration thought it unlikely that intuitions like Eliezer's or mine are correct. But it would be preferable for them to carry out plans that would be likely to have positive results even if they are wrong about that.
Like, you expect that since something seems hopeless to you, a superintelligent AGI would be unable to do it? OK, fine. But let's try to minimize the number of assumptions like that which are load-bearing in our alignment strategies. Especially for assumptions where smart people who have thought about the question extensively disagree strongly.
As a sidenote:
If I lived in the stone age, I would assign low credence to us going step by step from stone-age technology to things akin to iPhones, the International Space Station, and 'IBM' being written with xenon atoms.
If I lived prior to complex life (but my own existence didn't factor into my reasoning), I would assign low credence to anything like mammals evolving.
It's interesting to note that even though many people (such as yourself) have a 'conservative' way of thinking (about things such as this) compared to me, I am still myself 'conservative' in the sense that there are several things that have happened that would have seemed too 'out there' to appear realistic to me.
Another sidenote:
One question we might ask ourselves is: 'how many rules by which the universe could work would be consistent with e.g. the data we see on the internet?'. And by rules here, I don't mean rules that can be derived from other rules (like e.g. the weight of a helium atom), but the parameters that most fundamentally determine how the universe works. If we...
Rank rules by (1) how simple/elegant they are and (2) how likely the data we see on the internet would be to occur under those rules
Consider rules 'different from each other' if there are differences between them with regard to the predictions they make about which nanotechnology-designs would work
...my (possibly wrong) guess is that there would be a 'clear winner'.
Even if my guess is correct, that leaves the question of whether finding/determining the 'winner' is computationally tractable. With crude/naive search-techniques it isn't tractable, but we don't know the specifics of the techniques that a superintelligence might use; it could maybe develop very efficient methods for ruling out large swathes of search-space.
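As a toy illustration of the ranking I have in mind (and only that; the 'rules' and the bit-counting below are crude stand-ins I made up), one can score each candidate rule by a two-part description length: bits to state the rule, plus bits to patch wherever it disagrees with the data. The simplest rule that actually fits wins by a wide margin, which is the 'clear winner' shape I'm gesturing at:

```python
observed = [i % 7 for i in range(200)]  # stand-in for "data we see"

# Candidate "fundamental rules". A rule of None means "just memorize the data".
candidates = {
    "i % 7": lambda i: i % 7,   # simple and correct
    "i % 5": lambda i: i % 5,   # simple but wrong
    "lookup-table": None,       # "complex": hard-codes every datum
}

BITS_PER_SYMBOL = 3  # enough bits to encode a value in 0..6

def description_bits(name, rule):
    """Crude simplicity proxy: memorizing costs ~the data itself,
    a formula costs ~8 bits per character of its source text."""
    if rule is None:
        return len(observed) * BITS_PER_SYMBOL
    return 8 * len(name)

def misfit_bits(rule):
    """Crude fit term: pay to patch every symbol the rule gets wrong."""
    if rule is None:
        return 0
    wrong = sum(1 for i, d in enumerate(observed) if rule(i) != d)
    return wrong * BITS_PER_SYMBOL

scores = {name: description_bits(name, rule) + misfit_bits(rule)
          for name, rule in candidates.items()}
for name, bits in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} ~{bits} bits")
```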
And a third sidenote (the last one, I promise):
Speculating about this feels sort of analogous to reasoning about a powerful chess engine (although there are also many disanalogies). I know that I can beat an arbitrarily powerful chess engine if I start from a sufficiently advantageous position. But I find it hard to predict where that 'line' is (looking at a specific board position, and guessing if an optimal chess-player could beat me). Like, for some board positions the answer will be a clear 'yes' or a clear 'no', but for other board positions, it will not be clear.
I don't know how much info and compute a superintelligence would need to make nanotechnology-designs that work in a 'one-shot'-ish sort of way. I'm fairly confident that the amount of computational resources used for the initial moon-landing would be far too little (I'm picking an extreme example here, since I want plenty of margin for error). But I don't know where the 'line' is.
Although keep in mind that 'oneshotting' does not exclude being able to run experiments (nor does it rule out fairly extensive experimentation). As I touched upon earlier, it may be possible for a plan to have experimentation built into itself. Needing to do experimentation ≠ needing access to a lab and lots of serial time.
This tweet from Eliezer seems relevant btw. I would give similar answers to all of the questions he lists that relate to nanotechnology (but I'd be somewhat more hedged/guarded, e.g. replacing 'YES' with 'PROBABLY' for some of them).