I was talking about a variety of reasons for simulation and arguing that simulating a single entity seems as reasonable as many—but you seem only to be concerned with historical recreation.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
Power efficiency, in terms of ops/joule, increases directly with transistor density.
To my knowledge, this is incorrect. Increases in transistor density have dramatically increased circuit leakage (because of bumping into quantum tunneling), requiring more power per transistor in order to accurately distinguish one path from another.
If that was actually the case, then there would be no point to moving to a new technology node!
Yes, leakage is a problem at the new tech nodes, but of course power per transistor cannot possibly be increasing. I think you mean power per surface area has increased.
Shrinking a circuit by half in each dimension makes the wires shorter and their capacitance smaller, decreasing power use per transistor just as you’d think. Leakage makes this decrease somewhat less than the shrinkage rate, but it doesn’t reverse the entire trend.
There are also other design trends that can compensate and overpower this to an extent, which is why we have a plethora of power efficient circuits in the modern handheld market.
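To put rough numbers on the shrinkage point above, here is a back-of-the-envelope sketch of ideal (Dennard-style) scaling; the baseline capacitance and voltage are purely illustrative assumptions, and leakage is deliberately left out.

```python
# Idealized Dennard-style scaling sketch (leakage deliberately ignored).
# Shrinking linear dimensions by a factor s scales capacitance C and voltage V
# by roughly s, so switching energy E = C * V^2 scales by roughly s^3.

def switching_energy(C, V):
    """Dynamic switching energy of one transistor, in joules."""
    return C * V ** 2

s = 0.5                          # linear shrink factor: half in each dimension
C0, V0 = 1.0e-15, 1.0            # illustrative baseline: 1 fF, 1 V (assumed, not measured)

E0 = switching_energy(C0, V0)
E1 = switching_energy(C0 * s, V0 * s)
print(E1 / E0)                   # ~0.125: ideally ~8x less energy per switch per full shrink
# Real leakage makes the improvement smaller than this ideal, but does not reverse it.
```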
“which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density”
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
I recall seeing some slides from NVidia claiming their next GPU architecture will also cut power use per transistor dramatically, at several times the rate of shrinkage.
You propose that not existing would be a terrible evil. But how much better, for all the trillions upon trillions you’re proposing must suffer for the creator’s whims, would it be to have that computational substrate be used to host entities that have amazingly positive, productive, maximally Fun lives?
Even if the goal is maximizing fun, creating some historical sims for the purpose of resurrecting the dead may serve that goal. But I really doubt that current-human-fun-maximization is an evolutionary stable goal system.
I imagine that future posthuman morality and goals will evolve into something quite different.
Knowledge is a universal feature of intelligence. Even the purely mathematical hypothetical superintelligence AIXI would end up creating tons of historical simulations—and that might be hopelessly brute force, but nonetheless superintelligences with a wide variety of goal systems would find utility in various types of simulation.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
Much of the information from the past is probably irretrievably lost to us. If the information input into the simulation were not precisely the same as the actual information from that point in history, the differences would quickly propagate so that the simulation would bear little resemblance to the history. Supposing the individuals in question did have access to all the information they’d need to simulate the past, they’d have no need for the simulation, because they’d already have complete informational access to the past. It suffers similar problems to your sandboxed anthropomorphic AI proposal; provided you have all the resources necessary to actually do it, it ceases to be a good idea.
There are other possible motivations, but it’s not clear that there are any others that are as good or better, so we have little reason to suppose it will ever happen.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
This seems to be overly restrictive, but I don’t mind confining the discussion to this hypothesis.
I think you mean power per surface area has increased.
Yes, you are correct.
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
The roundtable was at SC′08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
I really doubt that current-human-fun-maximization is an evolutionary stable goal system. I imagine that future posthuman morality and goals will evolve into something quite different.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge. Does this really not bother you in the slightest?
The roundtable was at SC′08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
While the leakage issue is important and I want to read a little more about this reference, I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon; the question is more one of the increase in engineering difficulty as we progress vs. the increase in our effective intelligence and simulation capacity.
In the specific case of leakage, even if it is a problem that persists far into the future, it just slightly lowers the growth exponent as we just somewhat lower the clock speeds. And even if leakage can never be fully prevented, eventually it itself can probably be exploited for computation.
I really doubt that current-human-fun-maximization is an evolutionary stable goal system. I imagine that future posthuman morality and goals will evolve into something quite different.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge.
As a child I liked McDonald’s, bread, plain pizza and nothing more—all other foods were poisonous. I was convinced that my parents’ denial of my right to eat these wonderful foods, condemning me to terrible suffering as a result, was a sure sign of their utter lack of goodness.
Imagine if I could go back and fulfill that child’s wish to reduce its suffering. It would never then evolve into anything like my current self, and in fact might evolve into something that would suffer more or at the very least wish that it could be me.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
Our current existence is completely contingent on a great evolutionary epic of suffering on an astronomical scale. But suffering itself is just one little component of that vast mechanism, and forms no basis from which to judge the totality.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current time-line has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
At the end of the day, morality is governed by evolution. There is an entire landscape of paradises that could exist; the question is what fitness advantage do they provide their creator? The more they diverge from reality, the less utility they have in advancing knowledge of reality towards closure.
It looks like earth will evolve into a vast planetary hierarchical superintelligence, but ultimately it will probably be just one of many, and still subject to evolutionary pressure.
In the specific case of leakage, even if it is a problem that persists far into the future, it just slightly lowers the growth exponent as we just somewhat lower the clock speeds.
I disagree; I think that problems like this, unresolved, may or may not decrease the base of our exponent, but will cap its growth earlier.
I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon; the question is more one of the increase in engineering difficulty as we progress vs. the increase in our effective intelligence and simulation capacity.
On this point, we disagree, and I may be on the unpopular side of this disagreement. I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions. I actually think it quite likely that the increase in computational power per joule will bottom out in ten to twenty years. I wouldn’t be too surprised if exponential increase lasts thirty years, but forty seems unlikely, and fifty even less likely.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time. We’re talking about repeating history umpteen times, creating suffering anew each time. It sounds to me like you are insisting that this suffering is worthwhile, even if the result of all of it will never be more than a data point in a historian’s database.
We live in a heartbreaking world. Under the assumption that we are not in a simulation, we can recognize facts like ‘suffering is decreasing over time’ and realize that it is our job to work to aid this progress. Under the assumption that we are in a simulation, we know that the capacity for this progress is already fully complete, and the agents who control it simply don’t care. If we are being simulated, it means that one or more entities have chosen to create unimaginable quantities of suffering for their own purposes—to your stated belief, for historical knowledge.
Your McDonald’s example doesn’t address this in the slightest. You were already a living, thinking being, and your parents took care of you in the right way in an attempt to make your future life better. They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former? Of course the former and latter are entirely different entities, but that serves as no argument whatsoever for choosing the former!
A person running such a simulation could create a simulated afterlife, without suffering, where each simulated intelligence would go after dying in the simulated universe. It’s like a nice version of Pascal’s Wager, since there’s no wagering involved. Such an afterlife wouldn’t last infinitely long, but it could easily be made long enough to outweigh any suffering in the simulated universe.
So far the only person who seems dedicated to making such a simulation is jacob cannell, and he already seems to be having enough trouble separating the idea from cached theistic assumptions.
The simulated afterlife wouldn’t need to outweigh the suffering in the first universe according to our value system, only according to the value system of the aliens who set up the simulation.
I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions.
Technology doesn’t really advance through ‘revolutions’, it evolves. Some aspects of that evolution appear to be rather remarkably predictable.
That aside, the current predictions do posit a slow-down around 2020 for the general lithography process, but there are plenty of labs researching alternatives. As the slow-down approaches, their funding and progress will accelerate.
But there is a much more fundamental and important point to consider, which is that circuit shrinkage is just one dimension of improvement amongst several. As that route of improvement slows down, other routes will become more profitable.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization. This route—neuromorphic hardware and its ilk—currently receives a tiny slice of the research budget, but this will accelerate as AGI advances and would accelerate even more if the primary route of improvement slowed.
Another route of improvement is exponentially reducing manufacturing cost. The bulk of the price of high-end processors pays for the vast amortized R&D cost of developing the manufacturing node within the timeframe that the node is economical. Refined silicon is cheap and getting cheaper; research is expensive. The per-transistor cost of new high-end circuitry on the latest nodes for a CPU or GPU is 100 times the per-transistor cost of bulk circuitry produced on slightly older nodes.
So if Moore’s law stopped today, the cost of circuitry would still decay down to the bulk cost. This is particularly relevant to neuromorphic AGI designs as they can use a mass of cheap repetitive circuitry, just like the brain. So we have many other factors that will kick in even as Moore’s law slows.
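To put that claimed 10^4 factor in perspective, a bit of arithmetic of my own (the doubling times are assumptions for illustration, not figures from this thread): the factor is worth about 13 doublings of hardware efficiency, and how many years that equals depends entirely on the doubling time you assume.

```python
import math

efficiency_factor = 1.0e4
doublings = math.log2(efficiency_factor)          # ~13.3 doublings of hardware efficiency

# Assumed doubling times in years (illustrative only, not measurements)
for doubling_time in (0.75, 1.0, 1.5):
    print(doubling_time, round(doublings * doubling_time, 1))
# ~10 years if efficiency doubles every ~9 months; 13-20 years at slower assumed rates.
```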
I suspect that we will hit a slow ramping wall around or by 2020, but these other factors will kick in and human-level AGI will ramp up, and then this new population and speed explosion will drive the next S-curve using a largely new and vastly more complex process (such as molecular nano-tech) that is well beyond our capability or understanding.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time.
It’s more or less equivalent from the perspective of a historical sim. A historical sim is a recreation of some branch of the multiverse near your own incomplete history that you then run forward to meet your present.
It sounds to me like you are insisting that this suffering is worthwhile
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
Likewise, I think that it is our noble duty to solve AI, morality, and control a Singularity in order to eliminate suffering and live in paradise.
I also understand that after doing that we will over time evolve into beings quite unlike what we are now and eventually look back at our prior suffering and view it from an unimaginably different perspective, just as my earlier McDonald’s-loving child-self evolved into a being with a completely different view of its prior suffering.
your parents took care of you in the right way in an attempt to make your future life better.
It was right from both their and my current perspective, it was absolutely wrong from my perspective at the time.
They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Of course! Just as we should create something better than ourselves. But ‘better’ is relative to a particular subjective utility function.
I understand that my current utility function works well now, that it is poorly tuned to evaluate the well-being of bacteria, just as poorly tuned to evaluate the well-being of future posthuman godlings, and most importantly—my utility function or morality will improve over time.
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former?
Imagine you are the creator. How do you define ‘positive’ or ‘productive’? From your perspective, or theirs?
There are an infinite variety of uninteresting paradises. In some virtual humans do nothing but experience continuous rapturous bliss well outside the range of current drug-induced euphoria. There are complex agents that just set their reward functions to infinity and loop.
There are also a spectrum of very interesting paradises, all having the key differentiator that they evolve. I suspect that future godlings will devote most of their resources to creating these paradises.
I also suspect that evolution may operate again at an intergalactic or higher level, ensuring that paradises and all simulations somehow must pay for themselves.
At some point our descendants will either discover for certain they are in a sim and integrate up a level, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
If you could create a new big bang, an entire new singularity and new universe, would you?
You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
At some point our descendants will either discover for certain they are in a sim, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
If we had enough information to create an entire constructed reality of them in simulation, we’d have much more than we needed to just go ahead and intervene.
If you could create a new big bang, an entire new singularity and new universe, would you? You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
Some people would argue that it shouldn’t (this is an extreme form of negative utilitarianism). However, since we’re in no position to decide whether the universe gets to exist or not, the dispute is fairly irrelevant. If we’re in a position to decide between creating a universe like ours, creating one that’s much better, with more happiness and productivity and less suffering, and not creating one at all, though, I would have an extremely poor regard for the morality of someone who chose the first.
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
If my descendants think that all my suffering was worthwhile so that they could be born instead of someone else, then you know what? Fuck them. I certainly have a higher regard for my own ancestors. If they could have been happier, and given rise to a world as good as or better than this one, then who am I to argue that they should have been unhappy so I could be born instead? If, as you point out
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
then why not skip the historical recreation and go straight to simulating the paradises?
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization.
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization.
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
There is another significant speedup moving into analog computation.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines—because the brain evolved to solve AI problems in an energy efficient manner.
If you know you are working in the space of probabilistic/Bayesian-like networks, exact digital computations are extremely wasteful. Using tens or hundreds of thousands of transistors to do an exact digital multiply is useful for scientific or financial calculations, but it’s a pointless waste when the algorithm just needs to do a vast number of probabilistic weighted summations, for example.
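A small numpy illustration of that point (illustrative only, not a model of any particular hardware): the core operation is just a weighted summation, and doing it at roughly 8-bit precision barely changes the answer, which is why burning a wide exact multiplier on every term is overkill for this kind of workload.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000)                 # "synaptic" weights
x = rng.normal(size=1000)                 # input activations

exact = np.dot(w, x)                      # exact (float64) weighted summation

def quantize(v, bits=8):
    """Crude uniform quantization, standing in for cheap low-precision hardware."""
    scale = np.abs(v).max() / (2 ** (bits - 1) - 1)
    return np.round(v / scale) * scale

approx = np.dot(quantize(w), quantize(x))
print(exact, approx)                      # typically agree to within about a percent
```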
Thanks. Hefty read, but this one paragraph is worth quoting:
Statistical inference algorithms involve parsing large quantities of noisy (often analog) data to extract digital meaning. Statistical inference algorithms are ubiquitous and of great importance. Most of the neurons in your brain and a growing number of CPU cycles on desktops are spent running statistical inference algorithms to perform compression, categorization, control, optimization, prediction, planning, and learning.
I had forgotten that term, “statistical inference algorithms”; I need to remember that.
Well, there’s also another quote worth quoting, and in fact the quote that is in my Mnemosyne database and which enabled me to look that thesis up so fast...
“In practice replacing digital computers with an alternative computing paradigm is a risky proposition. Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable, because Moore’s Law has consistently enabled conventional von Neumann architectures to render alternatives unnecessary. Besides Moore’s Law, digital computing also benefits from mature tools and expertise for optimizing performance at all levels of the system: process technology, fundamental circuits, layout and algorithms. Many engineers are simultaneously working to improve every aspect of digital technology, while alternative technologies like analog computing do not have the same kind of industry juggernaut pushing them forward.”
This is true in general but this particular statement appears out of date:
“Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable”
That was true perhaps circa 2000, but we hit a speed/heat wall and since then everything has been going parallel.
You may see something similar happen eventually with analog computing once the market for statistical inference computation is large enough and/or we approach other constraints similar to the speed/heat wall.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
Ok. But this prevents you from directly improving your algorithms. And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity. I do agree that given a fixed algorithm this would plausibly lead to some speed-up.
There is another significant speedup moving into analog computation.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines—because the brain evolved to solve AI problems in an energy efficient manner.
This doesn’t follow. The brain evolved through a long path of natural selection. It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
Ok. But this prevents you from directly improving your algorithms.
Yes—which is part of the reason there is a big market for CPUs.
And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity.
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
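To make the “single fixed algorithm” point concrete, here is a toy sketch (a generic Hebbian-style learner in the spirit of Oja’s rule, not a model of the actual cortical circuit): the program below is fixed once written, and all of the adaptation happens inside the weight vector.

```python
import numpy as np

class FixedLearner:
    """The algorithm is fixed at 'fabrication'; all adaptation lives in the weights."""

    def __init__(self, n_inputs, lr=0.01, seed=0):
        self.w = np.random.default_rng(seed).normal(scale=0.01, size=n_inputs)
        self.lr = lr

    def forward(self, x):
        return np.tanh(self.w @ x)

    def learn(self, x):
        y = self.forward(x)
        # Hebbian-style update with decay: the weights change, the circuit never does
        self.w += self.lr * (y * x - y * y * self.w)
        return y

learner = FixedLearner(n_inputs=16)
rng = np.random.default_rng(1)
for _ in range(1000):
    learner.learn(rng.normal(size=16))    # learning happens with zero algorithmic changes
```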
A modern CPU is a jack-of-all-trades that is designed to do many things, most of which have little or nothing to do with the computational needs of AGI.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems,
The brain has roughly 10^15 noisy synapses that can switch around 10^3 times per second and store perhaps a bit each as well. (computation and memory integrated)
My computer has about 10^9 exact digital transistors in its CPU & GPU that can switch around 10^9 times per second. It has around the same amount of separate memory and around 10^13 bits of much slower disk storage.
These systems have similar peak throughputs of about 10^18 bits/second, but they are specialized for very different types of computational problems. The brain is very slow but massively wide, the computer is very narrow but massively fast.
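Spelling out that arithmetic, using the same order-of-magnitude figures as above (rough estimates, not measurements):

```python
brain_synapses, brain_rate = 1e15, 1e3    # ~10^15 synapses switching ~10^3 times/second
cpu_transistors, cpu_rate = 1e9, 1e9      # ~10^9 transistors switching ~10^9 times/second

print(brain_synapses * brain_rate)        # ~1e18 events/second: massively wide, very slow
print(cpu_transistors * cpu_rate)         # ~1e18 events/second: very narrow, massively fast
```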
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
Our computers are specialized and extremely adept at doing the whole spectrum of computational problems brains suck at—problems that involve long complex chains of exact computations, problems that require massive speed and precision but less bulk processing and memory.
So to me, yes it’s obvious that the brain is highly efficient at doing AGI-type stuff—almost because that’s how we define AGI-type stuff—it’s all the stuff that brains are currently much better than computers at!
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
This limits the amount of modification one can do. Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of. That’s not at all the same thing. The brain’s architecture is formed by a succession of modifications to much simpler entities. This successive, blind modification has left it stuck with all sorts of holdovers from our early chordate ancestors, and a lot from our more recent ancestors.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
Easy is a misleading term in this context. I certainly can’t factor a forty digit number but for a computer that’s trivial. Moreover, some operations are only difficult because we don’t know an efficient algorithm. In any event, if your speedup is only occurring for the narrow set of tasks which humans can do decently, such as vision, then you aren’t going to get a very impressive AGI. The ability to engage in face recognition, even if it takes you only a tiny fraction of the time that it would take a person, is not an impressive ability.
The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
This limits the amount of modification one can do.
Limits it compared to what? Every circuit is equivalent to a program. The circuit of a general processor is equivalent to a program which simulates another circuit—the program which it keeps in memory.
Current von Neumann processors are not the only circuits which have this simulation-flexibility. The brain has similar flexibility using very different mechanisms.
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
Not at all true. The class of statistical inference algorithms, including Bayesian networks and the cortex, is both extremely flexible and benefits greatly from ‘hard-wiring’.
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of.
This is like saying we don’t know that Usain Bolt is extremely adept at running, he’s just better than anything else that we know of. The latter sentence in each case of course is true, but it doesn’t impinge on the former.
But my larger point was that the brain and current computers occupy two very different regions in the space of possible circuit designs, and are rather clearly optimized for a different slice over the space of computational problems.
There are some routes by which we can obviously improve on the brain at the hardware level. Electronic circuits are orders of magnitude faster, and eventually we can make them much denser and thus much more massive.
However, it is much more of an open question in computer science whether we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely—or at least reach some nearly global maximum.
The brain’s architecture is formed by a succession of modifications to much simpler entities.
Yes—this is an excellent strategy for solving complex optimization problems.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
Easy is a misleading term in this context.
Yes, and on second thought—largely mistaken. To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms have running time exponential in the number of input bits. That makes them ‘hard’ in the scalability sense. But factoring small primes is still easy in the absolute cost sense.
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short. Physics is hard in the algorithmic sense, AGI seems to be quite hard, etc.
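For what it’s worth, the sense in which factoring code can be “simple and short” is the naive one, e.g. trial division below; as the replies point out, the genuinely fast methods (quadratic sieve, number field sieve) are anything but short.

```python
def trial_division(n):
    """Naive factorization: short to write, but exponential in the bit length of n."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(trial_division(2 ** 32 + 1))        # [641, 6700417], Euler's factorization of F5
```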
In any event, if your speedup is only occurring for the narrow set of tasks which humans can do decently, such as vision, then you aren’t going to get a very impressive AGI
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of... processing visual input data.
AGI hardware could take advantage of specialized statistical inference circuitry and still be highly general.
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently, such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently. Of course, that other section of the bookstore, filled with books about things computers can do decently, is growing at an exciting pace.
The ability to engage in face recognition, even if it takes you only a tiny fraction of the time that it would take a person, is not an impressive ability.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast... it’s not impressive?
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication possibly with some non-linear stepping or a normalization. Neural nets, Bayesian nets, whatever—if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
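Concretely, here is what “a large matrix multiplication with a nonlinearity or normalization” looks like as code (a generic layer, not any particular AGI design):

```python
import numpy as np

def inference_step(W, x):
    """One generic 'deep statistical inference' step: weighted sums, then a cheap squashing."""
    a = W @ x                             # the bulk of the work: one big matrix-vector multiply
    return np.tanh(a)                     # elementwise nonlinearity (a normalization also works)

rng = np.random.default_rng(0)
W = rng.normal(size=(4096, 4096)) * 0.01
x = rng.normal(size=4096)
y = inference_step(W, x)                  # ~16.8M multiply-adds, all simple and parallelizable
```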
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Not at all true. The class of statistical inference algorithms, including Bayesian networks and the cortex, is both extremely flexible and benefits greatly from ‘hard-wiring’.
The general trend should still occur this way. I’m also not sure that you can reach that conclusion about the cortex given that we don’t have a very good understanding of how the brain’s algorithms function.
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of... processing visual input data.
That seems plausibly correct but we don’t actually know that. Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms have running time exponential in the number of input bits.
Incorrect, the best factoring algorithms are subexponential. See for example the quadratic sieve and the number field sieve, both of which have subexponential running time. This has been true since at least the early 1980s (there are other now obsolete algorithms that were around before then that may have had slightly subexponential running time. I don’t know enough about them in detail to comment.)
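For reference, the heuristic running time of the general number field sieve is usually written in L-notation; it grows faster than any polynomial in log n but slower than exp(c log n), which is the sense in which it is called subexponential:

```latex
L_n\!\left[\tfrac{1}{3},\ \sqrt[3]{64/9}\right]
  = \exp\!\left(\left(\sqrt[3]{64/9} + o(1)\right)(\ln n)^{1/3}(\ln \ln n)^{2/3}\right)
```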
But factoring small primes is still easy in the absolute cost sense.
Factoring primes is always easy. For any prime p, it has no non-trivial factorizations. You seem to be confusing factorization with primality testing. The second is much easier than the first; we’ve had Agrawal’s algorithm, which is provably polynomial time, for about a decade. Prior to that we had a lot of efficient tests that were empirically faster than our best factorization procedures. We can determine the primality of numbers much larger than those we can factor.
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short.
Really? The general number field sieve is simple and short? Have you tried to understand it or write an implementation? Simple and short compared to what exactly?
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently.
There are some tasks where we can argue that humans are doing a good job by comparison to others in the animal kingdom. Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense. Note also that most humans can’t do math very well (Apparently 10% or so of my calculus students right now can’t divide one fraction by another). And the vast majority of poetry is just awful. It isn’t even obvious to me that the “good” poetry isn’t labeled that way in part simply from social pressure.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast... it’s not impressive?
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation. Improved facial recognition isn’t going to help much with most of the interesting stuff, like recursive self-improvement, constructing new algorithms, making molecular nanotech, finding a theory of everything, figuring out how Fred and George tricked Rita, etc.
Incorrect, the best factoring algorithms are subexponential.
To clarify, subexponential does not mean polynomial, but super-polynomial.
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential—I know when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
They like keeping things invariant under polynomial transformations of the input, since that has been observed to be a somewhat “natural” class. This is one of the areas where it seems to not quite work.
Hmm, interesting. In the notation that Scott says is standard in complexity theory, my earlier statement that factoring is “subexponential” is wrong even though it is slower growing than exponential. But apparently Greg Kuperberg is perfectly happy labeling something like 2^(n^(1/2)) as subexponential.
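A rough sketch of Kalai’s procedure, as I understand it from the paper (sympy’s isprime is used here for convenience; treat the details as a paraphrase, not a definitive implementation):

```python
import random
from sympy import isprime

def random_factored_number(N, rng=random):
    """Return a uniform random r in [1, N] together with its prime factorization."""
    while True:
        # 1. Generate a random non-increasing sequence N >= s1 >= s2 >= ... down to 1.
        seq, s = [], N
        while s > 1:
            s = rng.randint(1, s)
            seq.append(s)
        # 2. The candidate r is the product of the primes in the sequence (with multiplicity).
        primes = [x for x in seq if isprime(x)]
        r = 1
        for p in primes:
            r *= p
        # 3. Rejection step: accept r with probability r / N, otherwise start over.
        if r <= N and rng.random() < r / N:
            return r, primes

print(random_factored_number(10 ** 6))
```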
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Yes, and this tradeoff exists today with some rough mix between general processors and more specialized ASICs.
I think this will hold true for a while, but it is important to point out a few subpoints:
If Moore’s law slows down this will shift the balance farther towards specialized processors.
Even most ‘general’ processors today are actually a mix of CISC and vector processing, with more and more performance coming from the less-general vector portion of the chip.
For most complex real world problems, algorithms eventually tend to have much less room for improvement than hardware—even if algorithmic improvements initially dominate. After a while algorithmic improvements settle within the best complexity class, and then further improvements are just constants and are swamped by hardware improvement.
Modern GPUs for example have 16 or more vector processors for every general logic processor.
The brain is like a very slow processor with massively wide dedicated statistical inference circuitry.
As a result of all this (and the point at the end of my last post) I expect that future AGIs will be built out of a heterogeneous mix of processors but with the bulk being something like a wide-vector processor with a lot of very specialized statistical inference circuitry.
This type of design will still have huge flexibility by having programmability at the network architecture level—it could for example simulate humanish and various types of mammalian brains as well as a whole range of radically different mind architectures all built out of the same building blocks.
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of... processing visual input data.
That seems plausibly correct but we don’t actually know that.
We have pretty good maps of the low-level circuitry in the cortex at this point and it’s clearly built out of a highly repetitive base circuit pattern, similar to how everything is built out of cells at a lower level. I don’t have a single good introductory link, but it’s called the laminar cortical pattern.
Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
Yes, there are slight variations, but slight is the keyword. The cortex is highly general—the ‘visual’ region develops very differently in deaf people, for example, creating entirely different audio processing networks much more powerful than what most people have.
The flexibility is remarkable—if you hook up electrodes to the tongue that send a rough visual signal from a camera, in time the cortical regions connected to the tongue start becoming rough visual regions and limited tongue-based vision is the result.
Incorrect, the best factoring algorithms are subexponential.
I stand corrected on prime factorization—I saw the exp(....) part and assumed exponential before reading into it more.
Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense.
This is a good point, but note the huge difference between the abilities or efficiency of an entire human mind vs the efficiency of the brain’s architecture or the efficiency of the lower level components from which it is built—such as the laminar cortical circuit.
I think this discussion started concerning your original point:
It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours)
The cortical algorithm appears to be a pretty powerful and efficient low level building block. In evolutionary terms it has been around for much longer than human brains and naturally we can expect that it is much closer to optimality in the design configuration space in terms of the components it is built from.
As we go up a level to higher level brain architectures that are more recent in evolutionary terms we should expect there to be more room for improvement.
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation.
The mammalian cortex is not specialized for particular tasks—this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
The mammalian cortex is not specialized for particular tasks—this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
I’ve read a great deal about the cortex, and my immediate reaction to your statement was “no, that’s just not how it works”. (strong priors)
About one minute later on the prosopagnosia Wikipedia article, I find the first reference to this idea (that of congenital prosopagnosia):
The idea of congenital prosopagnosia appears to be a new theory supported by one researcher and one? study:
Dr Jane Whittaker, writing in 1999, described the case of a Mr. C. and referred to other similar cases (De Haan & Campbell, 1991, McConachie, 1976 and Temple, 1992).[7] The reported cases suggest that this form of the disorder may be heritable and much more common than previously thought (about 2.5% of the population may be affected), although this congenital disorder is commonly accompanied by other forms of visual agnosia, and may not be “pure” prosopagnosia
The last part about it being “commonly accompanied by other forms of visual agnosia” gives it away—this is not anything close to what you originally thought/claimed, even if this new research is actually correct.
Known cases of true prosopagnosia are caused by brain damage—what this research is describing is probably a disorder of the higher region (V4 I believe) which typically learns to recognize faces and other complex objects.
However, there is an easy way to cause prosopagnosia during development—prevent the creature from ever seeing faces.
I don’t have the link on hand, but there have been experiments in cats where you mess with their vision using grating patterns or carefully controlled visual environments, and you can create cats that literally can’t even see vertical lines.
So even the simplest, most basic thing which nature could hard-code, a vertical line feature detector, actually develops from the same extremely flexible general cortical circuit—the same circuit which can learn to represent everything from sounds to quantum mechanics.
Humans can represent a massive number of faces, and in general the brain’s vast information storage capacity over the genome (10^15 ish vs 10^9 ish) more or less requires a generalized learning circuit.
The cortical circuits do basically nothing but fire randomly when you are born—you really are a blank slate in that respect (although obviously the rest of the brain has plenty of genetically fixed functionality).
Of course the arrangement of the brain’s regions with respect to sensory organs and its overall wiring architecture do naturally lead to the familiar specializations of brain regions, but really one should consider this a developmental attractor—information is colonizing each cortex anew, but the similar architecture and similarity of information ensures that two brains end up having largely overlapping colonizations.
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
There are all sorts of aspects of humans that are normally somewhat—or nearly entirely—hard-wired. The cortex just doesn’t tend to be. Even the parts of the cortex that are similarly specialised in most humans seem to be so due to what they are connected to. (As can be seen by looking at how the atypical cases have adapted differently.) It would surprise me if the inability to recognise faces was caused by a dysfunction in the cortex specifically.
Disclaimer: I disagree with nearly everything else Jacob has said in this thread. This position specifically appears to be well researched.
However, it is much more of an open question in computer science whether we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely—or at least reach some nearly global maximum.
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency. Humans have been selected for their ability to reproduce in a complicated environment. Efficient intelligence helps, but there’s selection for a lot of other things, such as good immune systems and decent muscle systems. A lot of the selection that was brain selection was probably simply around the fantastically complicated set of tasks involved in navigating human societies. Note that human brain size on average has decreased over the last 50,000 years. Humans are subject to a lot of different selection pressures.
(Tangent: This is related to how at a very vague level we should expect genetic algorithms to outperform evolution at optimizing tasks. Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.)
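A toy illustration of that tangent, under entirely made-up parameters: a genetic algorithm selecting directly for a single fixed, narrow objective (here, maximizing the number of 1-bits), with none of the shifting pressures of a real environment.

```python
import random

GENOME_LEN, POP, GENS, MUT = 64, 100, 200, 0.01

def fitness(genome):
    return sum(genome)                    # the fixed, narrow objective: count of 1-bits

def mutate(genome):
    return [1 - g if random.random() < MUT else g for g in genome]

pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]              # truncation selection on the single objective
    pop = parents + [mutate(random.choice(parents)) for _ in range(POP - len(parents))]

print(max(fitness(g) for g in pop))       # climbs toward the optimum of 64
```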
It is quite possible that evolution had enough time to solve that problem completely [statistical inference in the cortex] - or at least reach some nearly global maximum
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency.
I largely agree with your point about human evolution, but my point was about the laminar cortical circuit which is shared in various forms across the entire mammalian lineage and has an analog in birds.
It’s a building block pattern that appears to have a long evolutionary history.
Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.
Yes, but there is a limit to this of course. We are, after all, talking about general intelligence.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current time-line has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
It seems you’re arguing that our successors will develop a preference for simulating universes like ours over paradises. If that’s what you’re arguing, then what reason do we have to believe that this is probable?
If their preferences do not change significantly from ours, it seems highly unlikely that they will create simulations identical to our current existence. And out of the vast space of possible ways their preferences could change, selecting that direction in the absence of evidence is a serious case of privileging the hypothesis.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
If that was actually the case, then there would be no point to moving to a new technology node!
Yes leakage is a problem at the new tech nodes, but of course power per transistor can not possibly be increasing. I think you mean power per surface area has increased.
Shrinking a circuit by half in each dimension makes the wires thinner, shorter and less resistant, decreasing power use per transistor just as you’d think. Leakage makes this decrease somewhat less than the shrinkage rate, but it doesn’t reverse the entire trend.
There are also other design trends that can compensate and overpower this to an extent, which is why we have a plethora of power efficient circuits in the modern handheld market.
“which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density”
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
I recall seeing some slides from NVidia where they are claiming there next GPU architecture will cut power use per transistor dramatically as well at several times the rate of shrinkage.
Even if the goal is maximizing fun, creating some historical sims for the purpose of resurrecting the dead may serve that goal. But I really doubt that current-human-fun-maximization is an evolutionary stable goal system.
I imagine that future posthuman morality and goals will evolve into something quite different.
Knowledge is a universal feature of intelligence. Even the purely mathematical hypothetical superintelligence AIXI would end up creating tons of historical simulations—and that might be hopelessly brute force, but nonetheless superintelligences with a wide variety of goal systems would find utility in various types of simulation.
Much of the information from the past is probably irretrievably lost to us. If the information input into the simulation were not precisely the same as the actual information from that point in history, the differences would quickly propagate so that the simulation would bear little resemblance to the history. Supposing the individuals in question did have access to all the information they’d need to simulate the past, they’d have no need for the simulation, because they’d already have complete informational access to the past. It suffers similar problems to your sandboxed anthropomorphic AI proposal; provided you have all the resources necessary to actually do it, it ceases to be a good idea.
There are other possible motivations, but it’s not clear that there are any others that are as good or better, so we have little reason to suppose it will ever happen.
This seems to be overly restrictive, but I don’t mind confining the discussion to this hypothesis.
Yes, you are correct.
The roundtable was at SC′08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge. Does this really not bother you in the slightest?
ETA: still 404
While the leakage issue is important and I want to read a little more about this reference, I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon; the question is more one of how the increase in engineering difficulty as we progress compares with the increase in our effective intelligence and simulation capacity.
In the specific case of leakage, even if it is a problem that persists far into the future, it merely lowers the growth exponent slightly as we somewhat lower the clock speeds. And even if leakage can never be fully prevented, eventually it can probably itself be exploited for computation.
As a child I liked McDonald’s, bread, plain pizza and nothing more—all other foods were poisonous. I was convinced that my parents’ denial of my right to eat these wonderful foods, condemning me to terrible suffering as a result, was a sure sign of their utter lack of goodness.
Imagine if I could go back and fulfill that child’s wish to reduce its suffering. It would never then evolve into anything like my current self, and in fact might evolve into something that would suffer more, or at the very least wish that it could be me.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
Our current existence is completely contingent on a great evolutionary epic of suffering on an astronomical scale. But suffering itself is just one little component of that vast mechanism, and forms no basis from which to judge the totality.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current timeline has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
At the end of the day, morality is governed by evolution. There is an entire landscape of paradises that could exist; the question is what fitness advantage they provide their creator. The more they diverge from reality, the less utility they have in advancing knowledge of reality towards closure.
It looks like earth will evolve into a vast planetary hierarchical superintelligence, but ultimately it will probably be just one of many, and still subject to evolutionary pressure.
I disagree; I think that problems like this, unresolved, may or may not decrease the base of our exponent, but will cap its growth earlier.
On this point, we disagree, and I may be on the unpopular side of this disagreement. I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions. I actually think it quite likely that the increase in computational power per joule will bottom out in ten to twenty years. I wouldn’t be too surprised if exponential increase lasts thirty years, but forty seems unlikely, and fifty even less likely.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time. We’re talking about repeating history umpteen times, creating suffering anew each time. It sounds to me like you are insisting that this suffering is worthwhile, even if the result of all of it will never be more than a data point in a historian’s database.
We live in a heartbreaking world. Under the assumption that we are not in a simulation, we can recognize facts like ‘suffering is decreasing over time’ and realize that it is our job to work to aid this progress. Under the assumption that we are in a simulation, we know that the capacity for this progress is already fully complete, and the agents who control it simply don’t care. If we are being simulated, it means that one or more entities have chosen to create unimaginable quantities of suffering for their own purposes—to your stated belief, for historical knowledge.
Your McDonald’s example doesn’t address this in the slightest. You were already a living, thinking being, and your parents took care of you in the right way in an attempt to make your future life better. They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former? Of course the former and latter are entirely different entities, but that serves as no argument whatsoever for choosing the former!
A person running such a simulation could create a simulated afterlife, without suffering, where each simulated intelligence would go after dying in the simulated universe. It’s like a nice version of Pascal’s Wager, since there’s no wagering involved. Such an afterlife wouldn’t last infinitely long, but it could easily be made long enough to outweigh any suffering in the simulated universe.
Or you could skip the part with all the suffering. That would be a lot easier.
In general, I agree. I just wanted to offer a more creative alternative for someone truly dedicated to operating such a simulation.
So far the only person who seems dedicated to making such a simulation is jacob cannell, and he already seems to be having enough trouble separating the idea from cached theistic assumptions.
I don’t think that’s how it works.
How much future happiness would you need in order to choose to endure 50 years of torture?
That depends on whether happiness without torture is an option. The options are better/worse, not good/bad.
The simulated afterlife wouldn’t need to outweigh the suffering in the first universe according to our value system, only according to the value system of the aliens who set up the simulation.
Technology doesn’t really advance through ‘revolutions’, it evolves. Some aspects of that evolution appear to be rather remarkably predictable.
That aside, the current predictions do posit a slow-down around 2020 for the general lithography process, but there are plenty of labs researching alternatives. As the slow-down approaches, their funding and progress will accelerate.
But there is a much more fundamental and important point to consider, which is that circuit shrinkage is just one dimension of improvement amongst several. As that route of improvement slows down, other routes will become more profitable.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization. This route—neuromorphic hardware and its ilk—currently receives a tiny slice of the research budget, but this will accelerate as AGI advances, and would accelerate even more if the primary route of improvement slowed.
Another route of improvement is exponentially reducing manufacturing cost. The bulk of the price of high-end processors pays for the vast amortized R&D cost of developing the manufacturing node within the timeframe that the node is economical. Refined silicon is cheap and getting cheaper; research is expensive. The per-transistor cost of new high-end circuitry on the latest nodes for a CPU or GPU is around 100 times the per-transistor cost of bulk circuitry produced on slightly older nodes.
So if Moore’s law stopped today, the cost of circuitry would still decay down to the bulk cost. This is particularly relevant to neuromorphic AGI designs as they can use a mass of cheap repetitive circuitry, just like the brain. So we have many other factors that will kick in even as Moore’s law slows.
I suspect that we will hit a slow ramping wall around or by 2020, but these other factors will kick in and human-level AGI will ramp up, and then this new population and speed explosion will drive the next S-curve using a largely new and vastly more complex process (such as molecular nano-tech) that is well beyond our capability or understanding.
It’s more or less equivalent from the perspective of a historical sim. A historical sim is a recreation of some branch of the multiverse near your own incomplete history that you then run forward to meet your present.
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
Likewise, I think that it is our noble duty to solve AI, morality, and control a Singularity in order to eliminate suffering and live in paradise.
I also understand that after doing that we will over time evolve into beings quite unlike what we are now and eventually look back at our prior suffering and view it from an unimaginably different perspective, just as my earlier McDonald’s-loving child-self evolved into a being with a completely different view of its prior suffering.
It was right from both their and my current perspective, it was absolutely wrong from my perspective at the time.
Of course! Just as we should create something better than ourselves. But ‘better’ is relative to a particular subjective utility function.
I understand that my current utility function works well now, that it is poorly tuned to evaluate the well-being of bacteria, just as poorly tuned to evaluate the well-being of future posthuman godlings, and most importantly—my utility function or morality will improve over time.
Imagine you are the creator. How do you define ‘positive’ or ‘productive’? From your perspective, or theirs?
There are an infinite variety of uninteresting paradises. In some, virtual humans do nothing but experience continuous rapturous bliss well outside the range of current drug-induced euphoria. There are complex agents that just set their reward functions to infinity and loop.
There is also a spectrum of very interesting paradises, all having the key differentiator that they evolve. I suspect that future godlings will devote most of their resources to creating these paradises.
I also suspect that evolution may operate again at an intergalactic or higher level, ensuring that paradises and all simulations somehow must pay for themselves.
At some point our descendants will either discover for certain they are in a sim and integrate up a level, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
If you could create a new big bang, an entire new singularity and new universe, would you?
You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
If we had enough information to create an entire constructed reality of them in simulation, we’d have much more than we needed to just go ahead and intervene.
Some people would argue that it shouldn’t (this is an extreme of negative utilitarianism.) However, since we’re in no position to decide whether the universe gets to exist or not, the dispute is fairly irrelevant. If we’re in a position to decide between creating a universe like ours, creating one that’s much better, with more happiness and productivity and less suffering, and not creating one at all, though, I would have an extremely poor regard for the morality of someone who chose the first.
If my descendants think that all my suffering was worthwhile so that they could be born instead of someone else, then you know what? Fuck them. I certainly have a higher regard for my own ancestors. If they could have been happier, and given rise to a world as good as or better than this one, then who am I to argue that they should have been unhappy so I could be born instead? If, as you point out
then why not skip the historical recreation and go straight to simulating the paradises?
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
There is another significant speedup moving into analog computation.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines—because the brain evolved to solve AI problems in an energy efficient manner.
If you know you are working in the space of probabilistic/Bayesian-like networks, exact digital computations are extremely wasteful. Using tens or hundreds of thousands of transistors to do an exact digital multiply is useful for scientific or financial calculations, but it’s a pointless waste when the algorithm just needs to do a vast number of probabilistic weighted summations, for example.
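To make the point about probabilistic weighted summations concrete, here is a minimal sketch (my own illustration, not anything from the thread or the linked thesis; the array sizes and quantization scheme are arbitrary choices): an inference-style weighted sum computed with crude 8-bit weights lands very close to the exact float64 result, which is why burning huge numbers of transistors on exact multipliers buys little for this kind of workload.

```python
# Minimal sketch (my own illustration): an inference-style weighted sum
# computed with crude 8-bit weights tracks the exact float64 answer closely,
# which is why exact digital multipliers are overkill for this workload.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=10_000)            # "synaptic" weights
x = rng.random(10_000)                 # input activations in [0, 1]

exact = w @ x                          # full-precision weighted summation

scale = np.abs(w).max() / 127.0        # quantize weights to signed 8 bits
w_q = np.round(w / scale).astype(np.int8)
approx = (w_q * scale) @ x             # dequantize and sum

# The absolute difference is typically well under 1, against a sum whose
# typical magnitude is in the tens.
print(exact, approx, abs(exact - approx))
```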
Cite for last paragraph about analog probability: http://phm.cba.mit.edu/theses/03.07.vigoda.pdf
Thanks. Hefty read, but this one paragraph is worth quoting:
I had forgotten that term, ‘statistical inference algorithms’; I need to remember that.
Well, there’s also another quote worth quoting, and in fact the quote that is in my Mnemosyne database and which enabled me to look that thesis up so fast...
This is true in general but this particular statement appears out of date:
“Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable”
That was true perhaps circa 2000, but we hit a speed/heat wall and since then everything has been going parallel.
You may see something similar happen eventually with analog computing once the market for statistical inference computation is large enough and/or we approach other constraints similar to the speed/heat wall.
Ok. But this prevents you from directly improving your algorithms. And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity. I do agree that given a fixed algorithm this would plausibly lead to some speed-up.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
This doesn’t follow. The brain evolved through a long path of natural selection. It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
EDIT: why the downvotes?
Yes—which is part of the reason there is a big market for CPUs.
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
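As a toy illustration of that “single fixed algorithm” point (my own sketch, not a model of the cortical circuit; the learning rate and dimensions are arbitrary): the code below never changes after it is written, yet it adapts to data, because all of the adaptation is confined to weight adjustment inside a fixed update rule.

```python
# Toy sketch (my own, not a cortical model): one fixed algorithm whose only
# form of "learning" is weight adjustment. The code never changes after it
# is written; all adaptation lives in the weight vector w.
import numpy as np

def fixed_learner(samples, lr=0.01, dim=4):
    w = np.zeros(dim)                         # "synaptic weights"
    for x, target in samples:                 # stream of experience
        prediction = w @ x                    # fixed inference rule
        w += lr * (target - prediction) * x   # fixed update rule (delta rule)
    return w

rng = np.random.default_rng(1)
true_w = np.array([1.0, -2.0, 0.5, 3.0])
data = [(x, true_w @ x) for x in rng.normal(size=(2000, 4))]
print(fixed_learner(data))                    # converges toward true_w
```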
A modern CPU is a jack-of-all-trades that is designed to do many things, most of which have little or nothing to do with the computational needs of AGI.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
The brain has roughly 10^15 noisy synapses that can switch around 10^3 times per second and store perhaps a bit each as well (computation and memory integrated).
My computer has about 10^9 exact digital transistors in its CPU & GPU that can switch around 10^9 times per second. It has around the same amount of separate memory and around 10^13 bits of much slower disk storage.
These systems have similar peak throughputs of about 10^18 bits/second, but they are specialized for very different types of computational problems. The brain is very slow but massively wide, the computer is very narrow but massively fast.
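Spelling out the arithmetic behind those peak-throughput figures (taking the round numbers above at face value; they are obviously only order-of-magnitude estimates):

```latex
\underbrace{10^{15}\ \text{synapses} \times 10^{3}\ \text{switches/s}}_{\text{brain}}
\;\approx\; 10^{18}\ \text{bits/s}
\qquad
\underbrace{10^{9}\ \text{transistors} \times 10^{9}\ \text{switches/s}}_{\text{computer}}
\;\approx\; 10^{18}\ \text{bits/s}
```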
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
Our computers are specialized and extremely adept at doing the whole spectrum of computational problems brains suck at—problems that involve long complex chains of exact computations, problems that require massive speed and precision but less bulk processing and memory.
So to me, yes, it’s obvious that the brain is highly efficient at doing AGI-type stuff—almost because that’s how we define AGI-type stuff—it’s all the stuff that brains are currently much better than computers at!
This limits the amount of modification one can do. Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of. That’s not at all the same thing. The brain’s architecture is formed by a succession of modifications to much simpler entities. The successive, blind modification has been stuck with all sorts of holdovers from our early chordate ancestors and a lot from our more recent ancestors.
Easy is a misleading term in this context. I certainly can’t factor a forty digit number but for a computer that’s trivial. Moreover, some operations are only difficult because we don’t know an efficient algorithm. In any event, if your speedup only occurs for the narrow set of tasks which humans can do decently, such as vision, then you aren’t going to get a very impressive AGI. The ability to do face recognition in only a tiny fraction of the time it would take a person is not an impressive ability.
Limits it compared to what? Every circuit is equivalent to a program. The circuit of a general processor is equivalent to a program which simulates another circuit—the program which it keeps in memory.
Current von Neumann processors are not the only circuits which have this simulation-flexibility. The brain has similar flexibility using very different mechanisms.
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
Not at all true. The class of statistical inference algorithms including Bayesian Networks and the cortex are both extremely flexible and greatly benefit from ‘hard-wiring’ it.
This is like saying we don’t know that Usain Bolt is extremely adept at running, he’s just better than anything else that we know of. The latter sentence in each case of course is true, but it doesn’t impinge on the former.
But my larger point was that the brain and current computers occupy two very different regions in the space of possible circuit designs, and are rather clearly optimized for a different slice over the space of computational problems.
There are some routes by which we can obviously improve on the brain at the hardware level. Electronic circuits are orders of magnitude faster, and eventually we can make them much denser and thus much more massive.
However, it is much more of an open question in computer science whether we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely—or at least reach some near-global maximum.
Yes—this is an excellent strategy for solving complex optimization problems.
Yes, and on second thought—largely mistaken. To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms have running time exponential in the number of input bits. That makes them ‘hard’ in the scalability sense. But factoring small primes is still easy in the absolute cost sense.
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short. Physics is hard in the algorithmic sense, AGI seems to be quite hard, etc.
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of... processing visual input data.
AGI hardware could take advantage of specialized statistical inference circuitry and still be highly general.
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently. Of course, that other section of the bookstore, filled with books about things computers can do decently, is growing at an exciting pace.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast... it’s not impressive?
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication possibly with some non-linear stepping or a normalization. Neural nets, bayesian nets, whatever—if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
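A minimal sketch of the kind of kernel being described (my own generic illustration rather than any particular AGI design; the sizes and the ReLU-style nonlinearity are arbitrary choices): one inference step is a large matrix-vector product followed by a cheap nonlinearity and a normalization, so the cost is dominated by simple repeated multiply-adds.

```python
# Generic sketch (my own illustration, not any specific AGI design): one
# inference step as described above -- a large matrix-vector product, a cheap
# non-linear step, and a normalization. The multiply-adds dominate the cost.
import numpy as np

def inference_step(W, x):
    a = W @ x                          # dense weighted sums: the bulk of the work
    a = np.maximum(a, 0.0)             # simple non-linear stepping (ReLU-style)
    return a / (a.sum() + 1e-9)        # normalization

rng = np.random.default_rng(2)
W = rng.normal(size=(1000, 1000))      # network weights
x = rng.random(1000)                   # input activations
y = inference_step(W, x)               # roughly 10^6 multiply-adds per step
```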
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
The general trend should still occur this way. I’m also not sure that you can reach that conclusion about the cortex given that we don’t have a very good understanding of how the brain’s algorithms function.
That seems plausibly correct but we don’t actually know that. Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
Incorrect, the best factoring algorithms are subexponential. See for example the quadratic field sieve and the number field sieve both of which have subexponential running time. This has been true since at least the early 1980s (there are other now obsolete algorithms that were around before then that may have had slightly subexponential running time. I don’t know enough about them in detail to comment.)
Factoring primes is always easy. For any prime p, it has no non-trivial factorizations. You seem to be confusing factorization with primality testing. The second is much easier than the first; we’ve had Agrawal’s algorithm, which is provably polynomial time, for about a decade. Prior to that we had a lot of efficient tests that were empirically faster than our best factorization procedures. We can determine the primality of numbers much larger than those we can factor.
Really? The general number field sieve is simple and short? Have you tried to understand it or write an implementation? Simple and short compared to what exactly?
There are some tasks where we can argue that humans are doing a good job by comparison to others in the animal kingdom. Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense. Note also that most humans can’t do math very well (Apparently 10% or so of my calculus students right now can’t divide one fraction by another). And the vast majority of poetry is just awful. It isn’t even obvious to me that the “good” poetry isn’t labeled that way in part simply from social pressure.
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation. Improved facial recognition isn’t going to help much with most of the interesting stuff, like recursive self-improvement, constructing new algorithms, making molecular nanotech, finding a theory of everything, figuring out how Fred and George tricked Rita, etc.
This seems to be a good point.
To clarify, subexponential does not mean polynomial, but super-polynomial.
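For reference, the running time at issue: the general number field sieve is conjectured (heuristically) to run in

```latex
L_n\!\left[\tfrac{1}{3},\, c\right]
  \;=\; \exp\!\Big( (c + o(1))\, (\ln n)^{1/3} (\ln\ln n)^{2/3} \Big),
\qquad c = \left(\tfrac{64}{9}\right)^{1/3} \approx 1.92
```

Written in terms of the bit length b = log2(n), this is 2^(o(b)) and so “subexponential” in one common usage, while still growing faster than any polynomial in b, i.e. super-polynomial.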
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
Interesting. I had not seen that paper before. That’s very cute.
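For the curious, here is a minimal sketch of the procedure from Kalai’s paper as I understand it (the helper name and the sympy dependency are my own choices): generate a random non-increasing sequence from N down to 1, keep the prime entries, and accept their product r with probability r/N, which yields a uniformly random r in {1..N} together with its factorization.

```python
# Sketch of the procedure from Kalai's paper as I understand it; the helper
# name and the sympy dependency are my own choices.
import random
from sympy import isprime

def random_factored_number(N):
    """Return a uniform r in {1..N} together with its prime factorization."""
    while True:
        # Random non-increasing sequence N >= s1 >= s2 >= ... ending at 1.
        seq, s = [], N
        while s > 1:
            s = random.randint(1, s)
            seq.append(s)
        primes = [s for s in seq if isprime(s)]   # keep prime entries (with multiplicity)
        r = 1
        for p in primes:
            r *= p
        # Accept with probability r/N; rejection sampling makes r uniform.
        if r <= N and random.random() < r / N:
            return r, primes

print(random_factored_number(10**6))
```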
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential—I know when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
EDIT: Nope, I’m wrong about this. That seems kind of inconsistent.
They like keeping things invariant under polynomial transformations of the input, since that has been observed to be a somewhat “natural” class. This is one of the areas where that convention seems to not quite fit.
Hmm, interesting: in the notation that Scott says is standard in complexity theory, my earlier statement that factoring is “subexponential” is wrong even though it is slower growing than exponential. But apparently Greg Kuperberg is perfectly happy labeling something like 2^(n^(1/2)) as subexponential.
Yes, and this tradeoff exists today with some rough mix between general processors and more specialized ASICs.
I think this will hold true for a while, but it is important to point out a few subpoints:
If Moore’s law slows down, this will shift the balance farther towards specialized processors.
Even most ‘general’ processors today are actually a mix of CISC and vector processing, with more and more performance coming from the less-general vector portion of the chip.
For most complex real world problems, algorithms eventually tend to have much less room for improvement than hardware—even if algorithmic improvements initially dominate. After a while, algorithmic improvements converge on the best complexity class, and then further improvements are just constant factors, which are swamped by hardware improvement.
Modern GPUs for example have 16 or more vector processors for every general logic processor.
The brain is like a very slow processor with massively wide dedicated statistical inference circuitry.
As a result of all this (and the point at the end of my last post) I expect that future AGIs will be built out of a heterogeneous mix of processors, but with the bulk being something like a wide-vector processor with a lot of very specialized statistical inference circuitry.
This type of design will still have huge flexibility by having programmability at the network architecture level—it could, for example, simulate humanish and various types of mammalian brains, as well as a whole range of radically different mind architectures, all built out of the same building blocks.
We have pretty good maps of the low-level circuitry in the cortex at this point and it’s clearly built out of a highly repetitive base circuit pattern, similar to how everything is built out of cells at a lower level. I don’t have a single good introductory link, but it’s called the laminar cortical pattern.
Yes, there are slight variations, but slight is the keyword. The cortex is highly general—the ‘visual’ region develops very differently in deaf people, for example, creating entirely different audio processing networks much more powerful than what most people have.
The flexibility is remarkable—if you hook up electrodes to the tongue that send a rough visual signal from a camera, in time the cortical regions connected to the tongue start becoming rough visual regions and limited tongue based vision is the result.
I stand corrected on prime factorization—I saw the exp(....) part and assumed exponential before reading into it more.
This is a good point, but note the huge difference between the abilities or efficiency of an entire human mind vs the efficiency of the brain’s architecture or the efficiency of the lower level components from which it is built—such as the laminar cortical circuit.
I think this discussion started concerning your original point:
The cortical algorithm appears to be a pretty powerful and efficient low level building block. In evolutionary terms it has been around for much longer than human brains and naturally we can expect that it is much closer to optimality in the design configuration space in terms of the components it is built from.
As we go up a level to higher level brain architectures that are more recent in evolutionary terms we should expect there to be more room for improvement.
The mammalian cortex is not specialized for particular tasks—this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
I’ve read a great deal about the cortex, and my immediate reaction to your statement was “no, that’s just not how it works”. (strong priors)
About one minute later on the Prosopagnosia wikipedia article, I find the first reference to this idea (that of congenital Prosopagnosia):
The idea of congenital prosopagnosia appears to be a new theory supported by one researcher and one? study:
The last part about it being “commonly accompanied by other forms of visual agnosia” gives it away—this is not anything close to what you originally thought/claimed, even if this new research is actually correct.
Known cases of true prosopagnosia are caused by brain damage—what this research is describing is probably a disorder of the higher region (V4 I believe) which typically learns to recognize faces and other complex objects.
However, there is an easy way to cause prosopagnosia during development—prevent the creature from ever seeing faces.
I don’t have the link on hand, but there have been experiments in cats where you mess with their vision using grating patterns or carefully controlled visual environments, and you can create cats that literally can’t even see vertical lines.
So even the simplest, most basic thing which nature could hard-code—a vertical line feature detector—actually develops from the same extremely flexible general cortical circuit, the same circuit which can learn to represent everything from sounds to quantum mechanics.
Humans can represent a massive number of faces, and in general the brain’s vast information storage capacity relative to the genome (10^15-ish bits vs 10^9-ish) more or less requires a generalized learning circuit.
The cortical circuits do basically nothing but fire randomly when you are born—you really are a blank slate in that respect (although obviously the rest of the brain has plenty of genetically fixed functionality).
Of course the arrangement of the brain’s regions with respect to sensory organs and its overall wiring architecture do naturally lead to the familiar specializations of brain regions, but really one should consider this a developmental attractor—information is colonizing each cortex anew, but the similar architecture and similarity of information ensure that two brains end up having largely overlapping colonizations.
There are all sorts of aspects of humans that are normally somewhat—or nearly entirely—hard-wired. The cortex just doesn’t tend to be. Even the parts of the cortex that are similarly specialised in most humans seem to be so due to what they are connected to. (As can be seen by looking at how the atypical cases have adapted differently.) It would surprise me if the inability to recognise faces was caused by a dysfunction in the cortex specifically.
Disclaimer: I disagree with nearly everything else Jacob has said in this thread. This position specifically appears to be well researched.
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency. Humans have been selected by their ability to reproduce in a complicated environment. Efficient intelligence helps, but there’s selection for a lot of other things, such as good immune systems and decent muscle systems. A lot of the selection that was brain selection was probably simply around the fantastically complicated set of tasks involved in navigating human societies. Note that human brain size on average has decreased over the last 50,000 years. Humans are subject to a lot of different selection pressures.
(Tangent: This is related to how at a very vague level we should expect genetic algorithms to outperform evolution at optimizing tasks. Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.)
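A toy illustration of that tangent (my own sketch; the all-ones “task”, population size, and mutation rate are arbitrary): a genetic algorithm that selects for exactly one narrow objective, with none of the competing pressures (immune systems, social navigation, and so on) that natural selection has to trade off against.

```python
# Toy sketch (my own): a genetic algorithm selecting for one narrow objective
# (an all-ones bitstring), with no competing pressures diluting the selection.
import random

TARGET_LEN = 40

def fitness(genome):
    return sum(genome)                               # narrow goal: maximize ones

def evolve(pop_size=50, generations=100, mutation_rate=0.02):
    pop = [[random.randint(0, 1) for _ in range(TARGET_LEN)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]               # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(TARGET_LEN)       # one-point crossover
            child = [1 - g if random.random() < mutation_rate else g
                     for g in a[:cut] + b[cut:]]     # bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness)

print(fitness(evolve()), "out of", TARGET_LEN)       # typically at or near 40
```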
I largely agree with your point about human evolution, but my point was about the laminar cortical circuit which is shared in various forms across the entire mammalian lineage and has an analog in birds.
It’s a building block pattern that appears to have a long evolutionary history.
Yes, but there is a limit to this of course. We are, after all, talking about general intelligence.
It seems you’re arguing that our successors will develop a preference for simulating universes like ours over paradises. If that’s what you’re arguing, then what reason do we have to believe that this is probable?
If their preferences do not change significantly from ours, it seems highly unlikely that they will create simulations identical to our current existence. And out of the vast space of possible ways their preferences could change, selecting that direction in the absence of evidence is a serious case of privileging the hypothesis.