A crux here seems to be the question of how well the AGI can simulate physical systems. If it can simulate them perfectly, there’s no need for real-world R&D. If its simulations are below some (high) threshold fidelity, it’ll need actors in the physical world to conduct experiments for it, and that takes human time-scales.
A big point in favor of “sufficiently good simulation is possible” is that we know the relevant laws of physics for anything the AGI might need to take over the world. We do real-world experiments because we haven’t managed to write simulation software that implements these laws at sufficiently high fidelity and for sufficiently complex systems, and because the compute cost of doing so is enormous. But in 20 years, an AGI running on a giant compute farm might both write more efficient simulation codes and have enough compute to power them.
You’re making two separate assumptions here, one about computability and one about computational complexity:
(1) That an AGI can simulate physical systems perfectly, i.e. that physical systems are computable processes.
(2) That an AGI can simulate physical systems efficiently, i.e. that either P = NP, or that for some reason none of the interesting problems the AGI must solve are equivalent to known NP-hard problems.
For (1), it might suffice to show that some physical system can be approximated by some algorithm, even if the true system is not known to be computable. Computability is a property of formal systems.
It is an open question whether all real-world physical processes are computable. Turing machines are defined over natural numbers and discrete time steps, while real-world phenomena involve real numbers and continuous time. Arguments that all physical processes are computable are based on discretizing everything down to the Planck scale (time, length, mass, temperature) and then assuming determinism.
This assumption tends to be taken as a given if you already believe the “computable universe” hypothesis, “digital physics”, or the Church-Turing-Deutsch principle. In those hypotheses, the entire universe can be simulated by a Turing machine.
There are also problems that are provably not computable in the general case, such as undecidable problems in logic like Hilbert’s tenth problem, and physical problems like the chaotic N-body problem, where sensitive dependence on initial conditions defeats long-horizon prediction at any fixed precision.
For (2), there is a lack of appreciation for the “exponential blowup” often seen when trying to solve many interesting problems optimally. When we say it is not efficient to solve some problem of size N optimally, there’s a wide range of ways to interpret that statement (a rough calculation after the list below makes these tiers concrete):
It’s not feasible to solve the problem on your desktop within 1 year.
It’s not feasible to solve the problem on the local university’s super-computing cluster within 1 year.
It’s not feasible to solve the problem on the USA’s best super-computing clusters within 1 year.
It’s not feasible to solve the problem on the world’s compute resources within 1 year.
It’s not feasible to solve the problem with compute resources on the scale of the solar system within 1 year.
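Here is that rough calculation, a minimal back-of-the-envelope sketch in Python. The operations-per-second figures are illustrative assumptions of mine, not measurements, and checking 2^N candidates stands in for any brute-force search with exponential blowup:

```python
# Rough estimate: how long does exhaustively checking 2**n candidate
# solutions take at different compute scales? The ops/second numbers are
# illustrative assumptions, deliberately rounded in the optimist's favor.
SECONDS_PER_YEAR = 3.15e7

compute_scales = {
    "desktop (~1e9 ops/s)": 1e9,
    "university cluster (~1e15 ops/s)": 1e15,
    "national supercomputers (~1e18 ops/s)": 1e18,
    "all of Earth's compute (~1e21 ops/s)": 1e21,
    "solar-system-scale computer (~1e33 ops/s)": 1e33,
}

for n in (60, 100, 200):
    print(f"\nbrute force over 2^{n} candidates:")
    for label, ops_per_sec in compute_scales.items():
        years = (2.0 ** n) / ops_per_sec / SECONDS_PER_YEAR
        print(f"  {label}: {years:.2e} years")
```

Even the science-fiction scale at the bottom of the list drowns once the exponent grows modestly, which is the point of the Erdős quote below.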
Here, for instance, is the famous quote by Paul Erdős regarding Ramsey numbers:

Suppose aliens invade the earth and threaten to obliterate it in a year’s time unless human beings can find the Ramsey number for red five and blue five. We could marshal the world’s best minds and fastest computers, and within a year we could probably calculate the value. If the aliens demanded the Ramsey number for red six and blue six, however, we would have no choice but to launch a preemptive attack.
If I had to rank the beliefs by most to least suspect, this would be near the top of the “most suspect” section. We are already aware of many problems that are computable, but not efficiently. Complexity classes like NP-hard or EXPTIME show up in many real-world problems, such as generalized chess, Go, the traveling salesman problem, and so forth. We also have problems that we suspect are not NP-complete but for which we still haven’t found a polynomial-time algorithm, like graph isomorphism or integer factorization.
Most measures we can use for ranking people by their respective intelligence, e.g. math proofs, skill at competitive games, mental math, finding solutions to problems with tricky constraints, and so forth, are examples of solving problems that we know are as hard as NP-hard or EXPTIME-complete problems. So it seems to follow that a hypothetical algorithm (let’s call it “general intelligence”) that could solve all of the above in some optimal manner would be problematic for a digital computer.
When we write a program to solve those types of problems, we have two options.

The first option is to try to solve for the optimal solution. This is where the complexity classes bite us. We don’t know how to do this efficiently for many real-world problems. If we could do it efficiently for any NP-hard problem, we’d have a proof that P = NP, and whoever came up with that proof could collect a one-million-dollar check from the Clay Mathematics Institute.
The second option, and the one we actually use, is to run an algorithm that can find any solution, and then terminate it once a solution is “good” enough. For some algorithms, called “approximation algorithms”, we can even prove a bound on how close our “good” solution is to the optimal solution, i.e. an approximation guarantee. When we train neural networks, those networks learn heuristics that often find “good” solutions, but such random or probabilistic algorithms carry no guarantees on optimality. This last point is often confused with the claim that a training process can be shown to have reached an optimal configuration of a neural network. That is a subtle and different point. Saying that a neural network has the optimal configuration to decide A → B is not the same as saying that B is the optimal solution for A.
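As a toy sketch of the difference between the two options, consider a tiny traveling-salesman instance in Python. The points are arbitrary, and greedy nearest-neighbor stands in here for “any fast heuristic without an optimality guarantee”:

```python
# Option 2 in miniature: a fast greedy heuristic (no optimality guarantee)
# vs. exhaustive search for the true optimum, on a toy TSP instance.
import itertools, math, random

random.seed(0)
pts = [(random.random(), random.random()) for _ in range(9)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(order):
    return sum(dist(pts[order[i]], pts[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def nearest_neighbor():
    # Greedy heuristic: O(n^2), fast, usually "good", sometimes far from optimal.
    unvisited, order = set(range(1, len(pts))), [0]
    while unvisited:
        nxt = min(unvisited, key=lambda j: dist(pts[order[-1]], pts[j]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

# Option 1: check all (n-1)! orderings -- only feasible for toy sizes.
best = min(itertools.permutations(range(1, len(pts))),
           key=lambda p: tour_length((0,) + p))

print("heuristic tour length:", tour_length(nearest_neighbor()))
print("optimal tour length:  ", tour_length((0,) + best))
```

The heuristic finishes instantly at any problem size; the exact search stops being runnable long before the problem gets interesting.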
This is basically arguing that the concept of “bounded rationality” is a necessary precondition for efficiently running the hypothetical algorithm of “general intelligence”. If true, all of the normal limits and weaknesses we see in human rationality, like biases, heuristics, or fallacies, would be expected within any hypothetical AI/ML system that demonstrates “general intelligence” under reasonable time constraints.
How much this matters depends on how strongly you believe in the power of simulation-focused “general intelligence”. For example, suppose you believe that the key distinguishing factor of an AI/ML system is that it will be able to simulate the universe and thus predict outcomes with near-perfect accuracy, or do the same for protein folding such that the simulation of a proposed protein is 100% accurate when that same protein is synthesized in real life. Then you’ve implicitly subscribed to some form of this belief, where the AI/ML system returns results quickly that are not just “good”, but genuinely “optimal”.
We can look at AlphaFold, a monumental advance in protein folding, for an example of what this looks like in practice. I’ll refer to the post[1] by Dr. AlQuraishi discussing the accuracy and limitations of AlphaFold. The important point is that while AlphaFold can predict protein structures with high accuracy, it is not perfect. Sometimes the calculated fold is still wildly different from how the protein actually folds. This is exactly what we’d expect from fast, efficient, reasonably good solvers that have no guarantee on optimality. I don’t want this point to be misunderstood. AI/ML systems are the cause of huge leaps in our understanding and capabilities within various domains. But they have not replaced the need for actual experimentation.
One could argue that an AI/ML system doesn’t need to approximate solutions, and will simply perform the inefficient computation required for optimality, because it has the hardware or compute resources to do so. This is fine. We can write optimal solvers for NP-hard problems. We can also calculate how long they will take to run on certain inputs, given certain hardware, and we can ask how parallelizable the algorithm is by looking at Amdahl’s law. What we’ll discover if we do these calculations is that computing the optimal solution for these NP-hard or EXPTIME problems becomes infeasible with any realistic amount of compute power. By realistic amount, I mean “a solar system’s worth of material devoted solely to computing still cannot provide optimal solutions within the time-span of a human life, even on distressingly small inputs for NP-hard problems”.
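Since Amdahl’s law came up, here is a minimal sketch of it in Python. The parallel fractions and processor counts are illustrative values I picked, not measurements of any particular solver:

```python
# Amdahl's law: if a fraction p of a computation is parallelizable and the
# rest is strictly serial, the speedup on n processors is 1 / ((1-p) + p/n).
def amdahl_speedup(p, n):
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.50, 0.90, 0.99):
    for n in (10, 1_000, 1_000_000):
        print(f"parallel fraction {p:.2f}, {n:>9} processors: "
              f"speedup {amdahl_speedup(p, n):8.1f}x")
```

Even a 99%-parallelizable workload saturates around a 100x speedup, and a constant-factor speedup does nothing against a search space that doubles with every increment of the input size.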
There is another argument that gets made for this belief, one where we replace the digital computer or Turing machine with a hypothetical quantum computer. I don’t have a problem with this argument either, except to note that if we accept this rewording, many downstream assumptions about the AI/ML system’s ability to scale in hardware go out of the window. The existence of the AWS cloud or billions of desktop home computers networked to the Internet is irrelevant if your AI/ML system is only efficient when run on a quantum computer. The counter-argument is that a Turing machine can simulate a quantum computer. Yes, but can it do so efficiently? This is the whole point of the “quantum supremacy” argument. If “quantum supremacy” is true, then a Turing machine, even though it can simulate a quantum computer, cannot do so efficiently. In that world, the AI/ML system is still bottlenecked on a real quantum computing platform. Furthermore, a quantum computer is not magically capable of solving arbitrary problems faster than a classical computer. While there are problems, such as integer factorization, that a quantum computer is believed to solve exponentially faster than any known classical algorithm, it is not true that a quantum computer can solve arbitrary NP-hard problems efficiently.
Other motivating examples we can look at here are weather prediction and circuit simulation. In both scenarios, we have a very good understanding of the physical laws that the systems obey, and we have very good simulations of those laws. The reason it’s computationally expensive for a digital computer to simulate these systems is that the real processes are continuous, while the simulated process uses some time delta (usually referred to as dT in code) between discrete steps of the simulation.
For complicated physical processes, the simulation can be made arbitrarily close to the real-world measurement of that process, but only by reducing dT, and with each reduction in dT the algorithm takes longer to run. If we want to avoid the increase in wall time, we need to find a way to break the problem into sub-problems that can be parallelized on independent compute units, and then reduce dT on those sub-problems. If we can’t do that, because the problem has some tricky global behavior like the N-body problem alluded to above, we have to accept limits on how accurate our results are for the wall time we can afford.
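Here is a minimal sketch of that tradeoff in Python, using the textbook forward-Euler method on a deliberately trivial equation (dy/dt = -y) whose exact answer we know. The numbers are toy values; real weather or circuit solvers are far more sophisticated, but they face the same accuracy-versus-step-count trade:

```python
# Forward Euler on dy/dt = -y, y(0) = 1. Smaller dT means smaller error
# but proportionally more steps (and hence more wall time).
import math

def euler(dt, t_end=5.0):
    n_steps = round(t_end / dt)
    y = 1.0
    for _ in range(n_steps):
        y += dt * (-y)          # one discrete step of the continuous process
    return y, n_steps

exact = math.exp(-5.0)
for dt in (0.5, 0.1, 0.01, 0.001):
    y, n_steps = euler(dt)
    print(f"dT={dt:<6} steps={n_steps:<6} error={abs(y - exact):.2e}")
```

Each reduction in dT buys a roughly proportional reduction in error at the cost of proportionally more steps; for systems with millions of coupled variables, that cost is the whole story.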
In electrical engineering, this is the tradeoff an engineer makes when designing hardware. They can run quick, coarse simulations for sanity checks during iterative development; the fast simulations might use higher dT values or simplifying assumptions about circuit behavior. Then, prior to sending a board to be manufactured, they can run a much longer and more detailed simulation, which might take hours or days. For this reason, the EE might only run the detailed simulation on the most critical part of a piece of hardware, and assume that the rest will work as predicted by other simulation or design work. This tradeoff means there is a risk that the real hardware, once manufactured, does not work as intended, e.g. due to an unexpected RF effect.
For a concrete example of this tradeoff, Nvidia recently announced that they’ve been able to speed up a certain calculation that takes 3 hours with a traditional tool to only 3 seconds using an AI/ML system. The AI/ML system is 94% accurate. This is a huge win for day-to-day iterative work, but you’d still want to run the most accurate simulation you can (even if it takes hours or days!) before going to manufacturing, where the lead time on manufacturing might be weeks or months.
The final point I’d like to discuss is the argument that a superintelligent AI/ML system can predict far into the future and therefore exploit the actions of an arbitrary number of other actors.
I have referred to this point in discussion among friends as the “strong psychohistory assumption”. Psychohistory is a fictional science from Isaac Asimov’s Foundation novels. The idea is that a sufficiently intelligent actor can predict the actions of others (tangent: in the context of the novels it only applies to large groups) far into the future, at an accuracy that allows it to run circles around them in any competitive scenario. In Foundation, the fictional stakes play out over thousands of years. This problem, and others like it, reduce to the same complexity concerns above. Unless “general intelligence” can be computed efficiently, there’s no reason to expect a superintelligent AI/ML system to make near-perfect predictions far into the future. Without that assumed trait, many takeoff scenarios for intelligent AI/ML systems are not sound, because they rely, implicitly or explicitly, on the AI/ML system being able to control individual humans and the society around them the way a puppeteer controls puppets.
The conclusion I’d like you to consider is that AI/ML systems will produce astonishingly efficient, reasonably good solutions to arbitrary tasks, but at no point will that eliminate the need for actual, iterative experimentation in the real world, unless we accept an expensive trade between the wall time required for simulation and the accuracy of that simulation.
Or, to try and phrase this another way, an AI/ML system that is 99% accurate at throwing a twenty-sided die to achieve a desired result can still roll a critical failure.
[1] https://moalquraishi.wordpress.com/2020/12/08/alphafold2-casp14-it-feels-like-ones-child-has-left-home
It seems like you’re relying on the existence of exponentially hard problems to mean that taking over the world is going to BE an exponentially hard problem. But you don’t need to solve every problem. You just need to take over the world.
Like, okay, the three body problem is ‘incomputable’ in the sense that it has chaotically sensitive dependence on initial conditions in many cases. So… don’t rely on specific behavior in those cases on long time horizons without the ability to do small adjustments to keep things on track.
If the AI can detect most of the hard cases and avoid relying on them, and include robustness by having multiple alternate mechanisms and backup plans, even just 94% success on arbitrary problems could translate into better than that on an overall solution.
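A quick sanity check of that claim, under the strong (and here unstated) assumption that the backup mechanisms fail independently rather than sharing failure modes:

```python
# If each of k alternative mechanisms succeeds independently with
# probability 0.94, the chance at least one works is 1 - 0.06**k.
# Independence is an optimistic assumption; correlated failures would
# pull these numbers back down.
p_single = 0.94
for k in (1, 2, 3, 4):
    overall = 1 - (1 - p_single) ** k
    print(f"{k} independent mechanism(s): overall success {overall:.4%}")
```

Whether real-world failure modes are actually independent is, of course, part of what is being argued here.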
This was specifically responding to the claim that an AI could solve problems without trial and error by perfectly simulating them, which I think it does a pretty reasonable job of shooting down.
This deserves to be a whole post. You brought up points I, at least, had never considered before.
You should make a separate post on “Can AGI just simulate the physical world?”. Will make it easier to find and reference in the future.
That was extracted from a much larger work I’ve been writing for the past 2 months. The above is less than ~10% of what I’ve written on the topic, and it goes much further than simulation problems. I am also trying to correct misunderstandings around distributed computation, hardware vs software inefficiency, improvements in performance from algorithmic gains, this community’s accepted definition for “intelligence”, the necessity or inevitability of self-improving systems, etc.
I’ll post it when done but in the meantime I’m just tossing various bits & pieces of it into debate wherever I see an opening to do so.
Proteins and other chemical interactions are governed by quantum mechanics, so the AGI would probably need a quantum computer to do a faithful simulation. And that’s for a single, local interaction of chemicals; for a larger system, there are too many particles to simulate, so some systems will be as unpredictable as the weather in 3 weeks.
The distribution of outcomes is much more achievable and much more useful than determining the one true way some specific thing will evolve. Like, it’s actually in-principle achievable, unlike making a specific pointlike prediction of where a molecular ensemble is going to be given a starting configuration (QM dependency? Not merely a matter of chaos). And it’s actually useful, in that it shows which configurations have tightly distributed outcomes and which don’t, unlike that specific pointlike prediction.
What does “the distribution of outcomes” mean? I feel like you’re just not understanding the issue.
The interaction of chemical A with chemical B might always lead to chemical C; the distribution might be a fixed point there. Yet you may need a quantum computer to tell you what chemical C is. If you just go “well I don’t know what chemical it’s gonna be, but I have a Bayesian probability distribution over all possible chemicals, so everything is fine”, then you are in fact simulating the world extremely poorly. So poorly, in fact, that it’s highly unlikely you’ll be able to design complex machines. You cannot build a machine out of building blocks you don’t understand.
Maybe the problem is that you don’t understand the computational complexity of quantum effects? Using a classical computer, it is not possible to efficiently calculate the “distribution of outcomes” of a quantum process. (Not the true distribution, anyway; you could always make up a different distribution and call it your Bayesian belief, but this borders on the tautological.)
Not an expert at all here, so please correct me if I am wrong, but I think that quantum systems are routinely simulated with non-quantum computers. Nothing to argue against the second part.
You are correct (QM-based simulation of materials is what I do). The caveat is that exact simulations are so slow that they are effectively impossible; that would not be the case with quantum computing, I think. Fortunately, we have different levels of approximation for different purposes that work quite well. And you can use QM results to fit faster atomistic potentials.
You are wrong in the general case—quantum systems cannot be, and are not, routinely simulated with non-quantum computers.
Of course, since all of the world is quantum, you are right that many systems can be simulated classically (e.g. classical computers are technically “quantum” because the entire world is technically quantum). But on the nano level, the quantum effects do tend to dominate.
IIRC some well-known examples where we don’t know how to simulate anything (due to quantum effects) are the search for a better catalyst in nitrogen fixation and the search for room-temperature superconductors. For both of these, humanity has basically gone “welp, these are quantum effects, I guess we’re just trying random chemicals now”. I think that’s also the basic story for the design of efficient photovoltaic cells.
Quick search found this:

Simulating a quantum computer on a classical one does indeed require a phenomenal amount of resources when no approximations are made and noise is not considered. Exact algorithms for simulating quantum computers require time or memory that grows exponentially with the number of qubits or other physical resources. Here, we discuss a class of algorithms that has attracted little attention in the context of quantum-computing simulations. These algorithms use quantum state compression and are exponentially faster than their exact counterparts. In return, they possess a finite fidelity (or finite compression rate, similar to that in image compression), very much as in a real quantum computer.
This paper is about simulating current (very weak, very noisy) quantum computers using (large, powerful) classical computers. It arguably improves the state of the art for this task.
Virtually no expert believes you can efficiently simulate actual quantum systems (even approximately) using a classical computer. There are some billion-dollar bounties on this (e.g. if you could efficiently simulate any quantum system of your choice, you could run Shor’s algorithm, break RSA, break the signature scheme of bitcoin, and steal arbitrarily many bitcoins).
Simulating a nanobot is a lower bar than simulating a quantum computer. Also a big unsolved problem like P=NP might be easy to an ASI.
It remains to be seen whether it’s easier. It could also be harder (the nanobot interacts with a chaotic environment which is hard to predict).
“Also a big unsolved problem like P=NP might be easy to an ASI.”
I don’t understand what this means. The ASI may indeed be good at proving that P does not equal NP, in which case it has successfully proven that it itself cannot solve certain tasks efficiently (the NP-complete ones). Similarly, if the ASI is really good at complexity theory, it could prove that BQP is not equal to BPP, at which point it has proven that it itself cannot efficiently simulate quantum computation on a classical computer. But that still does not let it simulate quantum computation on a classical computer!
The reason for the exponential term is that a quantum computer uses a superposition of exponentially many states. A well-functioning nanomachine doesn’t need to be in a superposition of exponentially many states.
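For a sense of where that exponential comes from in a naive classical simulation: a brute-force state-vector simulator stores one complex amplitude per basis state, i.e. 2^n amplitudes for n qubits. A minimal sketch (the 16-bytes-per-amplitude figure assumes double-precision complex numbers; compressed or approximate simulators, like the ones in the paper quoted above, can do better):

```python
# Memory for a naive state-vector simulation of n qubits:
# 2**n complex amplitudes at 16 bytes each (complex128).
BYTES_PER_AMPLITUDE = 16

for n_qubits in (10, 30, 50, 100):
    n_bytes = (2 ** n_qubits) * BYTES_PER_AMPLITUDE
    print(f"{n_qubits:>3} qubits: {float(n_bytes):.3e} bytes")
```

Around 50 qubits this already exceeds the RAM of the largest classical machines, which is why the debate above centers on whether a nanomachine actually requires tracking that kind of state.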
For that matter, the AI can make its first nanomachines using designs that are easy to reason about. This is a big hole in any complexity theory based argument. Complexity theory only applies in the worst case. The AI can actively optimize its designs to be easy to simulate.
It’s possible the AI shows P!=NP, but also possible the AI shows P=NP and finds a fast algorithm. Maybe the AI realizes that BQP=BPP.
Maybe the AI can make its first nanomachines easy to reason about… but maybe not. We humans cannot predict the outcome of even relatively simple chemical interactions (resorting to the lab to see what happens). That’s because these chemical interactions are governed by the laws of quantum mechanics, and yes, they involve superpositions of a large number of states.
“It’s possible the AI shows P!=NP, but also possible the AI shows P=NP and finds a fast algorithm. Maybe the AI realizes that BQP=BPP.”
It’s also possible the AI finds a way to break the second law of thermodynamics and to travel faster than light, if we’re just gonna make things up. (I have more confidence in P!=NP than in just about any physical law.) If we only have to fear an AI in a world where P=NP, then I’m personally not afraid.
Not sure why you are quite so confident P!=NP. But that doesn’t really matter.
Consider bond strength. Let’s say the energy needed to break a C-C bond varies by ±5% based on all sorts of complicated considerations involving the surrounding chemical structure. An AI designing a nanomachine can just apply 10% more energy than nominally needed.
A quantum computer doesn’t just have a superposition of many states; it’s a superposition carefully chosen so that all the pieces destructively and constructively interfere in exactly the right way. Not that the AI needs exact predictions anyway.
Also, the AI can cheat. As well as fundamental physics, it has access to a huge dataset of experiments conducted by humans. It doesn’t need to deduce everything from QED, it can deduce things from random chemistry papers too.