Probabilities were for example purposes only. I made them up because they were nice to calculate with and sounded halfway reasonable. I will not defend them. If you request that I come up with my real probability estimates, I will have to think harder.
Ah, well your more general point was well-made. I don’t think better numbers are really important. It’s all too fuzzy for me to be at all confident about.
I still retain my belief that it is implausible that we are in a universe simulation. If I am in a simulation, I expect that it is more likely that I am by myself (and that conscious or not, you are part of the simulation created in response to me), moderately more likely that there is a small group of humans being simulated with other humans and their environment dynamically generated, and overall very unlikely that the creators have bothered to simulate any part of physical reality that we aren’t directly observing (including other people). Ultimately, none of these seem likely enough for me to bother considering for very long.
The first part of your belief that “it is implausible that we are in a universe simulation” appears to be based on the argument:
If simulationism, then solipsism is likely.
Solipsism is unlikely, so . . .
Chain of logic aside, simulationism does not imply solipsism. Simulating N localized space-time patterns in one large simulation can be significantly cheaper than simulating N individual human simulations. So some simulated individuals may exist in small solipsist sims, but the great majority of conscious sims will find themselves in larger shared simulations.
Presumably a posthuman intelligence on earth would be interested in earth as a whole system, and would simulate this entire system. Simulating full human-mind equivalents is something of a sweet spot in the space of approximations.
There is a massive sweet spot, an extremely efficient method, of simulating a modern computer: simulate it at the level of its Turing-equivalent circuit. Simulating it at a level below this, say at the molecular level, is just a massive waste of resources, while any simulation above this loses accuracy completely.
It is postulated that a similar simulation scale separation exists for human minds, which naturally relates to uploads and AI.
Simulating full human-mind equivalents is something of a sweet spot in the space of approximations.
I don’t understand why human-mind equivalents are special in this regard. This seems very anthropocentric, but I could certainly be misinterpreting what you said.
Simulating N localized space-time patterns in one large simulation can be significantly cheaper than simulating N individual human simulations.
Cheaper, but not necessarily more efficient. It matters which answers one is looking for, or which goals one is after. It seems unlikely to me that my life is directed well enough to achieve interesting goals or answer interesting questions that a superintelligence might pose, but it seems even more unlikely that simulating 6 billion humans, in the particular way they appear (to me) to exist is an efficient way to answer most questions either.
I’d like to stay away from telling God what to be interested in, but out of the infinite space of possibilities, Earth seems too banal and languorous to be the one in N that have been chosen for the purpose of simulation, especially if the basement universe has a different physics.
If the basement universe matches our physics, I’m betting on the side that says simulating all the minds on Earth and enough other stuff to make the simulation consistent is an expensive enough proposition that it won’t be worthwhile to do it many times. Maybe I’m wrong; there’s no particular reason why simulating all of humanity in the year 2011 needs to take more than 10^18 J, so maybe there’s a “real” Milky Way that’s currently running 10^18 planet-scale sims. Even that doesn’t seem like a big enough number to convince me that we are likely to be one of those.
Simulating full human-mind equivalents is something of a sweet spot in the space of approximations.
I don’t understand why human-mind equivalents are special in this regard. This seems very anthropocentric, but I could certainly be misinterpreting what you said.
I meant there is probably some sweet spot in the space of [human-mind] approximations, because of scale separation, which I elaborated on a little later with the computer analogy.
Simulating N localized space-time patterns in one large simulation can be significantly cheaper than simulating N individual human simulations.
Cheaper, but not necessarily more efficient.
Cheaper implies more efficient, unless the individual human simulations somehow have a dramatically higher per capita utility.
A solipsist universe has extraneous patchwork complexity. Even assuming that all of the non-biological physical processes are grossly approximated (not unreasonable given current simulation theory in graphics), they still may add up to a cost exceeding that of one human mind.
But of course a world with just one mind is not an accurate simulation, so now you need to populate it with a huge number of pseudo-minds which are functionally indistinguishable from the perspective of our sole real observer but somehow use far fewer computational resources.
Now imagine a graph of simulation accuracy vs computational cost of a pseudo-mind. Rather than being linear, I believe it is sharply exponential, or J-shaped with a single large spike near the scale separation point.
The jumping point is where the pseudo-mind becomes a real actual conscious observer of its own.
The rationale for this cost model and the scale separation point can be derived from what we know about simulating computers.
It seems unlikely to me that my life is directed well enough to achieve interesting goals or answer interesting questions that a superintelligence might pose, but it seems even more unlikely that simulating 6 billion humans, in the particular way they appear (to me) to exist is an efficient way to answer most questions either.
Perhaps not your life in particular, but human life on earth today?
Simulating 6 billion humans will probably be the only way to truly understand what happened today from the perspective of our future posthuman descendants. The alternatives are . . . creating new physical planets? Simulation will be vastly more efficient than that.
Earth seems too banal and languorous to be the one in N that have been chosen for the purpose of simulation, especially if the basement universe has a different physics.
The basement reality is highly unlikely to have different physics. The vast majority of simulations we create today are based on approximations of currently understood physics, and I don’t expect this to ever change, since simulations have utility for simulators.
so maybe there’s a “real” milky way that’s currently running 10^18 planet-scale sims. Even that doesn’t seem like a big enough number to convince me that we are likely to be one of those.
I’m a little confused about the 10^18 number.
From what I recall, at the limits of computation one kg of matter can hold roughly 10^30 bits, and a human mind is in the vicinity of 10^15 bits or less. So at the molecular limits a kg of matter could hold around a quadrillion souls—an entire human galactic civilization. A skyscraper of such matter could give you 10^8 kg .. and so on. Long before reaching physical limits, posthumans would be able to simulate many billions of entire earth histories. At the physical molecular limits, they could turn each of the moon’s roughly 10^22 kg into an entire human civilization, for a total of 10^37 minds.
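The storage arithmetic in the paragraph above can be checked with a few lines. This is only a sketch that multiplies out the order-of-magnitude figures quoted there; the constant names are made up for illustration and the inputs are the comment's own rough estimates, not measured values.

```python
# Order-of-magnitude figures as quoted in the comment above (not measured values).
BITS_PER_KG_LIMIT = 10**30   # storage limit quoted for 1 kg of matter
BITS_PER_MIND = 10**15       # upper estimate quoted for one human mind
MOON_MASS_KG = 10**22        # rough moon mass used in the comment

# How many mind-sized memories fit in one kilogram at the quoted limit.
minds_per_kg = BITS_PER_KG_LIMIT // BITS_PER_MIND

# Scaling the same density up to the whole moon.
minds_per_moon = minds_per_kg * MOON_MASS_KG

print(f"minds per kg:   {minds_per_kg:.0e}")
print(f"minds per moon: {minds_per_moon:.0e}")
```

Integers are used instead of floats so the powers of ten stay exact.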
The potential time scale compression is nearly as vast: estimated speed limits are around 10^15 ops/bit/sec in ordinary matter at ordinary temperatures, versus at most 10^4 ops/bit/sec in human brains, although that limit is not as dramatically higher than the 10^9 ops/bit/sec of today’s circuits. The potential speedup of more than 10^10 over biological brains allows for about one hundred years per second of sidereal time.
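As a rough check of the speedup figures, here is a small sketch using the quoted rates. The year length is an assumption added only for unit conversion. With a speedup of exactly 10^10 the result comes out at a few hundred subjective years per second, the same order of magnitude as the figure quoted above.

```python
# Rates as quoted in the comment above.
LIMIT_OPS = 10**15   # ops/bit/sec, ordinary matter at ordinary temperatures
BRAIN_OPS = 10**4    # ops/bit/sec, upper figure for human brains

# Assumption for unit conversion only: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Largest speedup implied by the two quoted rates.
max_speedup = LIMIT_OPS // BRAIN_OPS

# The comment uses the more conservative figure of 10^10.
quoted_speedup = 10**10

# Subjective years experienced per second of outside time at that speedup.
years_per_second = quoted_speedup / SECONDS_PER_YEAR

print(f"max speedup: {max_speedup:.0e}")
print(f"subjective years per second at 1e10: {years_per_second:.0f}")
```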
I meant there is probably some sweet spot in the space of [human-mind] approximations, because of scale separation, which I elaborated on a little later with the computer analogy.
I understand that for any mind, there is probably an “ideal simulation level” which has the fidelity of a more expensive simulation at a much lower cost, but I still don’t understand why human-mind equivalents are important here.
Cheaper implies more efficient, unless the individual human simulations somehow have a dramatically higher per capita utility.
Which seems pretty reasonable to me. Why should the value of simulating minds be linear rather than logarithmic in the number of minds?
A solipsist universe has extraneous patchwork complexity. Even assuming that all of the non-biological physical processes are grossly approximated (not unreasonable given current simulation theory in graphics), they still may add up to a cost exceeding that of one human mind.
Agreed, but I also think that the cost of simulating the relevant stuff necessary to simulate N minds might be close to linear in N.
Now imagine a graph of simulation accuracy vs computational cost of a pseudo-mind. Rather than being linear, I believe it is sharply exponential, or J-shaped with a single large spike near the scale separation point.
I agree, though as a minor note if cost is the Y-axis the graph has to have a vertical asymptote, so it has to grow much faster than exponential at the end. Regardless, I don’t think we can be confident that consciousness occurs at an inflection point or a noticeable bend.
The jumping point is where the pseudo-mind becomes a real actual conscious observer of its own.
I suspect that some pseudo-minds must be conscious observers some of the time, but that they can be turned off most of the time and just be updated offline with experiences that their conscious mind will integrate and patch up without noticing. I’m not sure this would work with many mind-types, but I think it would work with human minds, which have a strong bias to maintaining coherence, even at the cost of ignoring reality. If I’m being simulated, I suspect that this is happening even to me on a regular basis, and possibly happening much more often the less I interact with someone.
Perhaps not your life in particular, but human life on earth today?
Simulating 6 billion humans will probably be the only way to truly understand what happened today from the perspective of our future posthuman descendants. The alternatives are . . . creating new physical planets? Simulation will be vastly more efficient than that.
Updating on the condition that we closely match the ancestors of our simulators, I think it’s pretty reasonable that we could be chosen to be simulated. This is really the only plausible reason I can think of to choose us in particular. I’m still dubious as to the value doing so will have to our descendants.
I’m a little confused about the 10^18 number.
Actually, I made a mistake, so it’s reasonable to be confused. 20 W seems to be a reasonable upper limit to the cost of simulating a human mind. I don’t know how much lower the lower bound should be, but it might not be more than an order of magnitude less. This gives 10^11 W for six billion, (4x) 10^18 J for one year.
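The corrected energy estimate can be reproduced as follows. This is a sketch of the arithmetic only: the 20 W and six billion figures come from the comment, and the year length is an assumption added for the conversion from watts to joules.

```python
# Figures from the comment above.
WATTS_PER_MIND = 20        # proposed upper limit for simulating one mind
POPULATION = 6 * 10**9     # six billion humans

# Assumption for unit conversion only: a Julian year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Continuous power needed to run everyone at once.
total_watts = WATTS_PER_MIND * POPULATION

# Energy for one simulated year at that power.
joules_per_year = total_watts * SECONDS_PER_YEAR

print(f"power:  {total_watts:.1e} W")
print(f"energy: {joules_per_year:.1e} J per year")
```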
I don’t think it’s reasonable to expect all the matter in the domain of a future civilization to be used to its computational capacity. I think it’s much more likely that the energy output of the Milky Way is a reasonably likely bound to how much computation will go on there. This certainly doesn’t have to be the case, but I don’t see superintelligences annihilating matter at a dramatically faster rate in order to provide massively more power to the remainder of the matter around. The universe is going to die soon enough as it is. (I could be very short sighted about this) Anyway, energy output of the Milky Way is around 5x10^36 W. I divided this by Joules instead of by Watts, so the second number I gave was 10^18, when it should have been (5x) 10^24.
I maintain that energy, not quantum limits of computation in matter, will bound computational cost on the large scale. Throwing our moon into the Sun in order to get energy out of it is probably a better use of it as raw materials than turning it into circuitry. Likewise for time compression, convince me that power isn’t a problem.
I understand that for any mind, there is probably an “ideal simulation level” which has the fidelity of a more expensive simulation at a much lower cost, but I still don’t understand why human-mind equivalents are important here.
Simply because we are discussing simulating the historical period in which we currently exist.
Why should the value of simulating minds be linear rather than logarithmic in the number of minds?
The premise of the SA is that the posthuman ‘gods’ will be interested in simulating their history. That history is not dependent on a smattering of single humans isolated in boxes, but the history of the civilization as a whole system.
Agreed, but I also think that the cost of simulating the relevant stuff necessary to simulate N minds might be close to linear in N.
If the N minds were separated by vast gulfs of space and time this would be true, but we are talking about highly connected systems.
Imagine the flow of information in your brain. Imagine the flow of causality extending back in time, the flow of information weighted by its probabilistic utility in determining my current state.
The stuff in the immediate vicinity of me is important, and the importance generally falls off according to an inverse square law with distance away from my brain. Moreover, even from the stuff near me at one time step, only a tiny portion of it is relevant. At this moment my brain is filtering out almost everything except the screen right in front of me, which can be causally determined by a program running on my computer, dependent on recent information in another computer in a server somewhere in the Midwest a little bit ago, which was dependent on information flowing out from your brain previously . . . and so on.
So simulating me would more or less require your simulation as well; it’s very hard to isolate a mind. You might as well try to simulate just my left prefrontal cortex. The entire distinction of where one mind begins and ends is something of a spatial illusion that disappears when you map out the full causal web.
Regardless, I don’t think we can be confident that consciousness occurs at an inflection point or a noticeable bend.
If you want to simulate some program running on one computer on a new machine, there is an exact vertical inflection wall in the space of approximations where you get a perfect simulation which is just the same program running on the new machine. This simulated program is in fact indistinguishable from the original.
I suspect that some pseudo-minds must be conscious observers some of the time, but that they can be turned off most of the time and just be updated offline
Yes, but because of the network effects mentioned earlier it would be difficult and costly to do this on a per mind basis. Really it’s best to think of the entire earth as a mind for simulation purposes.
Could you turn off part of cortex and replace it with a rough simulation some of the time without compromising the whole system? Perhaps sometimes, but I doubt that this can give a massive gain.
I’m still dubious as to the value doing so will have to our descendants.
Why do we currently simulate (think about) our history? To better understand ourselves and our future.
I believe there are several converging reasons to suspect that vaguely human-like minds will turn out to be a persistent pattern for a long time, perhaps as persistent as eukaryotic cells. Adaptive radiation will create many specializations and variations, but the basic pattern of a roughly 10^15 bit mind and its general architecture may turn out to be a fecund replicator and building block for higher level pattern entities.
It seems plausible some of these posthumans will actually descend from biological humans alive today. They will be very interested in their ancestors, and especially the ancestors they knew in their former life who died without being uploaded or preserved.
Humans have been thinking about this for a while. If you could upload and enter virtual heaven, you could have just about anything that you want. However, one thing you may very much desire would be reunification with former loved ones, dead ancestors, and so on.
So once you have enough computational power, I suspect there will be a desire to use it in an attempt to resurrect the dead.
20 W seems to be a reasonable upper limit to the cost of simulating a human mind. I don’t know how much lower the lower bound should be, but it might not be more than an order of magnitude less.
You are basically taking the current efficiency of human brains as the limit, which of course is ridiculous on several fronts. We may not reach the absolute limits of computation, but they are the starting point for the SA.
We already are within six orders of magnitude of the speed limit of ordinary matter (10^9 bit ops/sec vs 10^15), and there is every reason to suspect we will get roughly as close to the density limit.
I maintain that energy, not quantum limits of computation in matter, will bound computational cost on the large scale.
There are several measures. The number of bits storable per unit mass determines how many human souls you can store in memory per unit mass.
Energy relates to the bit operations per second and the speed of simulated time.
I was assuming computing at regular earth temperatures within the range of current brains and computers. At the limits of computation discussed earlier 1 kg of matter at normal temperatures implies an energy flow of around 1 to 20W and can simulate roughly 10^15 virtual humans 10^10 faster than current human rate of thought. This works out to about one hundred years per second.
So at the limits of computation, 1 kg of ordinary matter at room temperature should give about 10^25 human lifetimes per joule. One square meter of high efficiency solar panel could power several hundred kilograms of computational substrate.
So at the limits of computation, future posthuman civilizations could simulate a truly astronomical number of human lifetimes in one second using less power and mass than our current civilization.
No need to disassemble planets. Using the whole surface of a planet gives a multiplier of 10^14 over a single kilogram. Using the entire mass only gives a further 10^8 multiple over that or so, and is much much more complex and costly to engineer. (When you start thinking of energy in terms of human souls, this becomes morally relevant.)
If this posthuman civilization simulates human history for a billion years instead of a second, this gives another 10^16 multiplier.
Using much more reasonable middle of the road estimates:
Say tech may bottom out at a limit within half (in exponential terms) of the maximum—say 10^13 human lifetimes per kg per joule vs 10^25.
The posthuman civ stabilizes at around 10^10 1kg computers (not much more than we have today).
The posthuman civ engages in historical simulation for just one year. (10^7 seconds).
That is still 10^30 simulated human lifetimes, vs roughly 10^11 lifetimes in our current observational history.
Those are still astronomical odds for observing that we currently live in a sim.
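The middle-of-the-road estimate can be multiplied out explicitly. The sketch below uses the three figures listed above plus one added assumption, labeled in the code: each 1 kg computer draws about 1 W (the low end of the 1 to 20 W range mentioned earlier), so that joules and seconds are interchangeable per kilogram.

```python
# Figures from the middle-of-the-road estimate above.
LIFETIMES_PER_KG_PER_JOULE = 10**13  # assumed practical technology limit
COMPUTERS_KG = 10**10                # number of 1 kg computers
RUN_SECONDS = 10**7                  # roughly one year of simulation
OBSERVED_LIFETIMES = 10**11          # human lifetimes in our observed history

# Assumption (not stated in this list): each kilogram draws about 1 W,
# i.e. 1 joule per second, the low end of the 1 to 20 W range used earlier.
WATTS_PER_KG = 1

# Total energy spent across all computers over the run, in joules.
joules_spent = COMPUTERS_KG * WATTS_PER_KG * RUN_SECONDS

# Lifetimes scale with energy because the rate is given per kg per joule
# and every computer here is exactly 1 kg.
simulated_lifetimes = LIFETIMES_PER_KG_PER_JOULE * joules_spent

# How many simulated lifetimes exist for each observed one.
ratio = simulated_lifetimes // OBSERVED_LIFETIMES

print(f"simulated lifetimes: {simulated_lifetimes:.0e}")
print(f"simulated per observed lifetime: {ratio:.0e}")
```

Raising the assumed wattage toward 20 W would only increase the total.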
This is very upsetting: I don’t have anything like the time I need to keep participating in this thread, but it remains interesting. I would like to respond completely, which means that I would like to set it aside, but I’m confident that if I do so I will never get back to it. Therefore, please forgive me for only responding to a fraction of what you’re saying.
If the N minds were separated by vast gulfs of space and time this would be true, but we are talking about highly connected systems.
I thought context made it clear that I was only talking about the non-mind stuff being simulated as being an additional cost perhaps nearly linear in N. Very little of what we directly observe overlaps except our interaction with each other, and this was all I was talking about.
Regardless, I don’t think we can be confident that consciousness occurs at an inflection point or a noticeable bend.
Why can’t a poor model (low fidelity) be conscious? We just don’t know enough about consciousness to answer this question.
Yes, but because of the network effects mentioned earlier it would be difficult and costly to do this on a per mind basis. Really it’s best to think of the entire earth as a mind for simulation purposes.
I really disagree, but I don’t have time to exchange each other’s posteriors, so assume this dropped.
However, one thing you may very much desire would be reunification with former loved ones, dead ancestors, and so on [...] So once you have enough computational power, I suspect there will be a desire to use it in an attempt to resurrect the dead.
I think this is evil, but I’m not willing to say whether the future intelligences will agree or care.
You are basically taking the current efficiency of human brains as the limit, which of course is ridiculous on several fronts. We may not reach the absolute limits of computation, but they are the starting point for the SA.
I said it was a reasonable upper bound, not a reasonable lower bound. That seems trivial.
I was assuming computing at regular earth temperatures within the range of current brains and computers. At the limits of computation discussed earlier 1 kg of matter at normal temperatures implies an energy flow of around 1 to 20W and can simulate roughly 10^15 virtual humans 10^10 faster than current human rate of thought. This works out to about one hundred years per second.
Most importantly, you’re assuming that all circuitry performs computation, which is clearly impossible. That leaves us to debate about how much of it can, but personally I see no reason that the computational minimum cost will closely (even in an exponential sense) be approached. I am interested in your reasoning why this should be the case though, so please give me what you can in the way of references that led you to this belief.
Lastly, but most importantly (to me), how strongly do you personally believe that a) you are a simulation and that b) all entities on Earth are full-featured simulations as well?
Conditioning on (b) being true, how long ago (in subjective time) do you think our simulation started, and how many times do you believe it has been (or will be) replicated?
Very little of what we directly observe overlaps except our interaction with each other, and this was all I was talking about.
If I was to quantify your ‘very little’ I’d guess you mean say < 1% observational overlap.
Let’s look at the rough storage cost first. Ignoring variable data priority through selective attention for the moment, the data resolution needs for a simulated earth can be related to photons incident on the retina and decreases with an inverse square law from the observer.
We can make a 2D simplification and use Google Earth as an example. If there was just one ‘real’ observer, you’d need full data fidelity for the surface area that observer would experience up close during his/her lifetime, and this cost dominates. Let’s say that’s S, S ~ 100 km^2.
Simulating an entire planet, the data cost is roughly fixed or capped—at 5x10^8 km^2.
So in this model simulating an entire earth with 5 billion people will have a base cost of 5x10^8 km^2, and simulating 5 billion worlds separately will have a cost of 5x10^9 * S.
So unless S is pathetically small (actually less than human visual distance), this implies a large extra cost to the solipsist approach. From my rough estimate of S the solipsist approach is 1,000 times more expensive. This also assumes that humans are randomly distributed, which of course is unrealistic. In reality human populations are tightly clustered which further increases the relative gain of shared simulation.
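The comparison above can be sketched as a small calculation. It uses the same 2D simplification and the same figures; the function and variable names are made up for illustration, and the break-even value of S is derived here rather than quoted from the text.

```python
# Figures from the 2D storage model above.
EARTH_SURFACE_KM2 = 5 * 10**8   # capped cost of one shared planet
OBSERVERS = 5 * 10**9           # five billion simulated people
S_KM2 = 100                     # area one observer sees up close in a lifetime


def solipsist_cost_km2(observers: int, s_km2: float) -> float:
    """Total high-fidelity area if every observer gets a private world."""
    return observers * s_km2


shared_cost = EARTH_SURFACE_KM2
separate_cost = solipsist_cost_km2(OBSERVERS, S_KM2)

# How many times more expensive the solipsist approach is.
cost_ratio = separate_cost / shared_cost

# The value of S at which both approaches would cost the same.
break_even_s_km2 = shared_cost / OBSERVERS

print(f"solipsist / shared cost: {cost_ratio:.0f}x")
print(f"break-even S: {break_even_s_km2} km^2")
```

The break-even area of about 0.1 km^2 is roughly a 300 m by 300 m patch, which is why S would have to be smaller than ordinary visual range for private worlds to be cheaper in this model.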
However, one thing you may very much desire would be reunification with former loved ones, dead ancestors, and so on [...] So once you have enough computational power, I suspect there will be a desire to use it in an attempt to resurrect the dead.
I think this is evil, but I’m not willing to say whether the future intelligences will agree or care.
Evil?
Why?
Most importantly, you’re assuming that all circuitry performs computation, which is clearly impossible.
I’m not sure what you mean by this. Does all of the circuitry of the brain perform computation? Over time, yes. The most efficient brain simulations will of course be emulations—circuits that are very similar to the brain but built on much smaller scales on a new substrate.
That leaves us to debate about how much of it can, but personally I see no reason that the computational minimum cost will closely (even in an exponential sense) be approached
My main reference for the ultimate limits is Seth Lloyd’s “Ultimate Physical Limits to Computation”. Kurzweil’s The Singularity Is Near discusses much of this as well, of course (but he mainly uses the more misleading ops per second, which is much less well defined).
Biological circuits switch at 10^3 to 10^4 bit flips/second. Our computers went from around that speed in WWII to the current speed plateau of around 10^9 bit flips/second reached early this century. The theoretical limit for regular molecular matter is around 10^15 bit flips/second. (A black hole could reach a much much higher speed limit, as discussed in Lloyd’s paper.) There are experimental circuits that currently approach 10^12 bit flips/second.
In terms of density, we went from about 1 bit / kg around WWII to roughly 10^13 bits / kg today. The brain is about 10^15 bits / kg, so we will soon surpass it in circuit density. The juncture we are approaching (brain density) is about half-way to the maximum of 10^30 bits/kg. This has been analyzed extensively in the hardware community and it looks like we will approach these limits as well sometime this century. It is entirely practical to store 1 bit (or more) per molecule.
Lastly, but most importantly (to me), how strongly do you personally believe that a) you are a simulation and that b) all entities on Earth are full-featured simulations as well?
A and B are closely correlated. It’s difficult to quantify my belief in A, but it’s probably greater than 50%.
I’ve thought a little about your last question but I don’t yet even see a route to estimating it. Such questions will probably require a more advanced understanding of simulation.
If there was just one ‘real’ observer, you’d need full data fidelity for the surface area that observer would experience up close during his/her lifetime, and this cost dominates. Let’s say that’s S, S ~ 100 km^2.
I feel like this would make you a terrible video game designer :-P. Why should we bother simulating things in full fidelity, all the time, just because they will eventually be seen? The only full-fidelity simulation we should need is the stuff being directly examined. Much rougher algorithms should suffice for things not being directly observed.
Most importantly, you’re assuming that all circuitry performs computation, which is clearly impossible.
I’m not sure what you mean by this. Does all of the circuitry of the brain perform computation? Over time, yes. The most efficient brain simulations will of course be emulations—circuits that are very similar to the brain but built on much smaller scales on a new substrate.
Heh, my ability to argue is getting worse and worse. You sure you want to continue this thread? What I meant to say (and entirely failed) is that there is an infrastructure cost; we can’t expect to compute with every particle, because we need lots of particles to make sure the others stay confined, get instructions, etc. Basically, not all matter can be a bit at the same time.
It is entirely practical to store 1 bit (or more) per molecule.
Again, infrastructure costs. Can you source this (also Lloyd?)?
For the rest, I’m aware of and don’t dispute the speeds and densities you mention. What I’m skeptical of is that we have evidence that they are practicable; this was what I was looking for. I don’t count the previous success of Moore’s Law as strong evidence that we will continue getting better at computation until we hit physical limits. I’m particularly skeptical about how well we will ever do on power consumption (partially because it’s such a hard problem for us now).
I think this is evil, but I’m not willing to say whether the future intelligences will agree or care.
Evil? Why?
The idea that I did not have to live this life, that some entity or civilization has created the environment in which I’ve experienced so much misery, and that they will do it again and again makes me shake with impotent rage. I cannot express how much I would rather never have existed. The fact that they would do this and so much worse (because my life is an astoundingly far cry from the worst that people deal with), again, and again, to trillions upon trillions of living, feeling beings... I cannot express my sorrow. It literally brings me to tears.
This is not sadism; or it would be far worse. It is rather a total neglect of care, a relegation of my values in place of historical interest. However, I still consider this evil in the highest degree.
I do not reject the existence of evil, and therefore this provides no evidence against the hypothesis that I am simulated. However, if I believe that I have a high chance of being simulated, I should do all that I can to prevent such an entity from ever coming to exist with such power, on the off chance that I am one who is not simulated, and able to prevent such evil from unfolding.
Why should we bother simulating things in full fidelity, all the time, just because they will eventually be seen? The only full-fidelity simulation we should need is the stuff being directly examined. Much rougher algorithms should suffice for things not being directly observed.
Of course you’re on the right track here, and I discussed spatially variant fidelity simulation earlier. The rough surface area metric was a simplification of storage/data generation costs, which is a separate issue from computational cost.
If you want the most bare-bones efficient simulation, I imagine a reverse hierarchical induction approach that generates the reality directly from the belief network of the simulated observer, a technique modeled directly on human dreaming.
However, this is only most useful if the goal is to just generate an interesting reality. If the goal is to regenerate an entire historical period accurately, you can’t start with the simulated observers; they are greater unknowns than the environment itself.
The solipsist issue may not have discernible consequences, but overall the computational scaling is sublinear for emulating more humans in a world and probably significant because of the large causal overlap of human minds via language.
It is entirely practical to store 1 bit (or more) per molecule.
Again, infrastructure costs. Can you source this (also Lloyd?)?
What I’m skeptical of is that we have evidence that they are practicable; this was what I was looking for.
The intellectual work required to show an ultimate theoretical limit is tractable, but showing that achieving said limit is impossible in practice is very difficult.
I’m pretty sure we won’t actually hit the physical limits exactly, it’s just a question of how close. If you look at our historical progress in speed and density to date, it suggests that we will probably go most of the way.
Another simple assessment related to the doomsday argument: I don’t know how long this Moore’s Law progression will carry on, but it’s lasted for 50 years now, so I give reasonable odds that it will last another 50. Simple, but surprisingly better than nothing.
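One way to make this heuristic concrete is a Gott-style "delta t" estimate. This is an assumption about what the comment has in mind, not something it states. It supposes we are observing the trend at a uniformly random point of its total lifetime; under that assumption the chance that it lasts at least k times as long again is 1 / (1 + k).

```python
def prob_lasts_at_least(past_years: float, extra_years: float) -> float:
    """Chance a process survives at least `extra_years` more.

    Assumes (hypothetically) that the fraction f of the total lifetime
    already elapsed is uniform on (0, 1). The remaining time is then
    past * (1 - f) / f, which is >= k * past exactly when f <= 1 / (1 + k).
    """
    if past_years <= 0 or extra_years < 0:
        raise ValueError("past_years must be positive, extra_years non-negative")
    k = extra_years / past_years
    return 1.0 / (1.0 + k)


# Moore's Law has lasted about 50 years in the comment's framing.
print(prob_lasts_at_least(50, 50))    # chance of another 50 years
print(prob_lasts_at_least(50, 150))   # chance of another 150 years
```

Under this model "reasonable odds" for another 50 years means even odds, and the probability falls off slowly for longer horizons.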
A more powerful line of reasoning perhaps is this: as long as there is an economic incentive to continue Moore’s Law and room to push against the physical limits, ceteris paribus, we will make some progress and push towards those limits. Thus, eventually we will reach them.
I’m particularly skeptical about how well we will ever do on power consumption (partially because it’s such a hard problem for us now).
Power density depends on clock rate, which has plateaued. Power efficiency, in terms of ops/joule, increases directly with transistor density.
I think this is evil, but I’m not willing to say whether the future intelligences will agree or care.
Evil? Why?
I cannot express how much I would rather have never existed.
This is somewhat concerning, and I believe, atypical. Not existing is perhaps the worst thing I can possibly imagine, other than infinite torture.
It is rather a total neglect of care, a relegation of my values in place of historical interest.
I’m not sure if ‘historical interest’ is quite the right word. Historical recreation or resurrection might be more accurate.
A paradise designed to maximally suffice current human values and eliminate suffering is not a world which could possibly create or resurrect us.
You literally couldn’t have grown up in that world, the entire idea is a non sequitur. Your mind’s state is a causal chain rooted in the gritty reality of this world with all of its suffering.
Imagining that your creator could have assigned you to a different world is like imagining you could have grown up with different parents. You couldn’t have. That would be somebody else completely.
Of course, if said creator exists, and if said creator values what you value in the way you value it (dubious) it could whisk you away to paradise tomorrow.
But I wouldn’t count on that—perhaps said creator is still working on you or doesn’t think paradise is a useful place for you or couldn’t care less.
In the face of such uncertainty, we can only task ourselves with building paradise.
However, this is only most useful if the goal is to just generate an interesting reality. If the goal is to regenerate an entire historical period accurately, you can’t start with the simulated observers—they are greater unknowns than the environment itself.
I believe we’re arguing along two paths here, and it is getting muddled. Applying to both, I think one can maintain the world-per-person sim much more cheaply than you originally suggested long before one hits the spot where the sim is no longer accurate to the world except where it intersects with the observer’s attention.
Second, from my perspective you’re begging the question, since I was talking about a variety of reasons for simulation and arguing that simulating a single entity seems as reasonable as many—but you seem only to be concerned with historical recreation, in which case it seems obvious to me that a large group of minds is necessary. If we’re only talking about that case, the arguments along this line about the per-mind cost just aren’t very relevant.
I have a 404 on your link, I’ll try later.
Another simple assessment related to the doomsday argument: I don’t know how long this Moore’s Law progression will carry on, but it’s lasted for 50 years now, so I give reasonable odds that it will last another 50. Simple, but surprisingly better than nothing.
Interesting, I haven’t heard that argument applied to Moore’s Law. Question: you arrive at a train crossing (there are no other cars on the road), and just as you get there, a train begins to cross before you can. Something goes wrong, and the train stops, and backs up, and goes forward, and stops again, and keeps doing this. (This actually happened to me). 10 minutes later, should you expect that you have around 10 minutes left? After those are passed, should your new expectation be that you have around 20 minutes left?
The answer is possibly yes. I think better results would be obtained by using a Jeffreys Prior. However, I’ve talked to a few statisticians about this problem, and no one has given me a clear answer. I don’t think they’re used to working with so little data.
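The train-crossing question can be made concrete with a small sketch. It assumes the usual "delta-t" form of the argument, which is not spelled out in the thread: the moment of observation is taken to be a uniformly random point in the process's total lifetime. A 1/T prior on the total duration leads to the same formula. Under that assumption, the chance of lasting at least r more units after t units have elapsed is t / (t + r). The function names are illustrative only.

```python
def prob_lasts_at_least(elapsed, extra):
    """P(process runs at least `extra` longer | it has run `elapsed` so far).

    Assumes the observation time is uniform over the total lifetime.
    """
    return elapsed / (elapsed + extra)


def median_remaining(elapsed):
    """Remaining time that is exceeded with probability 1/2."""
    return elapsed


def remaining_interval(elapsed, confidence=0.5):
    """Central interval for the remaining time at the given confidence."""
    low = elapsed * (1 - confidence) / (1 + confidence)
    high = elapsed * (1 + confidence) / (1 - confidence)
    return low, high


if __name__ == "__main__":
    for waited in (10, 20):  # minutes spent at the crossing so far
        low, high = remaining_interval(waited, 0.5)
        print(waited, median_remaining(waited), round(low, 2), round(high, 2))
    # Moore's Law after 50 years: chance of at least 50 more under this model
    print(prob_lasts_at_least(50, 50))
```

Under this assumption the answer to the question is yes: the median estimate of the remaining wait equals the time already waited, so it grows from 10 to 20 minutes. The 50% interval runs from a third of the elapsed time to three times it, which shows how weak the estimate is.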
A more powerful line of reasoning perhaps is this: as long as there is an economic incentive to continue Moore’s Law and room to push against the physical limits, ceteris paribus, we will make some progress and push towards those limits. Thus, eventually we will reach them.
Revise to say “and room to push against the practicable limits” and you will see where my argument lies despite my general agreement with this statement.
Power efficiency, in terms of ops/joule, increases directly with transistor density.
To my knowledge, this is incorrect. Increases in transistor density have dramatically increased circuit leakage (because of bumping into quantum tunneling), requiring more power per transistor in order to accurately distinguish one path from another. I saw a roundtable about proposed techniques for increasing processor efficiency. None of the attendees objected to the introduction, which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density, and would render all modern circuit designs inoperable if they were to be logically extended without addressing the problem of quantum leakage.
I cannot express how much I would rather have never existed.
This is somewhat concerning, and I believe, atypical. Not existing is perhaps the worst thing I can possibly imagine, other than infinite torture.
If you didn’t exist in the first place, you wouldn’t care. Do you think you’ve done so much good for the world that your absence could be “the worst thing you can possibly imagine, other than infinite torture”?
Regardless, I’m quite atypical in this regard, but not unique.
You literally couldn’t have grown up in that world, the entire idea is a non sequitur. Your mind’s state is a causal chain rooted in the gritty reality of this world with all of its suffering.
Imagining that your creator could have assigned you to a different world is like imagining you could have grown up with different parents. You couldn’t have. That would be somebody else completely.
And wouldn’t that be so much better.
You propose that not existing would be a terrible evil. But how much better, for all the trillions upon trillions you’re proposing must suffer for the creator’s whims, would it be to have that computational substrate be used to host entities that have amazingly positive, productive, maximally Fun lives? I know I couldn’t have existed in a paradise, but if I’m a sim, there are cycles that could be used for paradise that have been abandoned to create misery and strife.
Again, I think that this may be the world we really are in. I just can’t call it a moral one.
I was talking about a variety of reasons for simulation and arguing that simulating a single entity seems as reasonable as many—but you seem only to be concerned with historical recreation.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
Power efficiency, in terms of ops/joule, increases directly with transistor density.
To my knowledge, this is incorrect. Increases in transistor density have dramatically increased circuit leakage (because of bumping into quantum tunneling), requiring more power per transistor in order to accurately distinguish one path from another.
If that was actually the case, then there would be no point to moving to a new technology node!
Yes, leakage is a problem at the new tech nodes, but of course power per transistor cannot possibly be increasing. I think you mean power per surface area has increased.
Shrinking a circuit by half in each dimension makes the wires thinner, shorter and less resistant, decreasing power use per transistor just as you’d think. Leakage makes this decrease somewhat less than the shrinkage rate, but it doesn’t reverse the entire trend.
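The distinction between power per transistor and power per surface area can be illustrated with a rough scaling sketch. It assumes the textbook dynamic-power relation P = C * V^2 * f and models the leakage problem only indirectly, as the supply voltage and clock no longer scaling down with feature size. The function and its parameters are illustrative and not taken from the thread.

```python
def scale_node(shrink, voltage_scales=True, clock_scales=True):
    """Relative figures after shrinking every linear dimension by `shrink`.

    Returns (power_per_transistor, transistors_per_area, power_per_area),
    each relative to the previous node (1.0 means unchanged).
    """
    capacitance = 1.0 / shrink                       # smaller devices, less charge
    voltage = 1.0 / shrink if voltage_scales else 1.0
    frequency = shrink if clock_scales else 1.0
    power_per_transistor = capacitance * voltage ** 2 * frequency
    density = shrink ** 2                            # transistors per unit area
    return power_per_transistor, density, power_per_transistor * density


if __name__ == "__main__":
    # Idealized scaling: voltage drops and clock rises with the shrink.
    print(scale_node(2))
    # Assumed leakage-limited regime: voltage and clock stay fixed.
    print(scale_node(2, voltage_scales=False, clock_scales=False))
```

In both cases power per transistor falls, but in the fixed-voltage case it falls more slowly than density rises, so power per unit area goes up. That is consistent with both the waste-heat complaint and the continued benefit of moving to a new node.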
There are also other design trends that can compensate and overpower this to an extent, which is why we have a plethora of power efficient circuits in the modern handheld market.
“which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density”
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
I recall seeing some slides from NVidia in which they claim their next GPU architecture will cut power use per transistor dramatically as well, at several times the rate of shrinkage.
You propose that not existing would be a terrible evil. But how much better, for all the trillions upon trillions you’re proposing must suffer for the creator’s whims, would it be to have that computational substrate be used to host entities that have amazingly positive, productive, maximally Fun lives?
Even if the goal is maximizing fun, creating some historical sims for the purpose of resurrecting the dead may serve that goal. But I really doubt that current-human-fun-maximization is an evolutionarily stable goal system.
I imagine that future posthuman morality and goals will evolve into something quite different.
Knowledge is a universal feature of intelligence. Even the purely mathematical hypothetical superintelligence AIXI would end up creating tons of historical simulations—and that might be hopelessly brute force, but nonetheless superintelligences with a wide variety of goal systems would find utility in various types of simulation.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
Much of the information from the past is probably irretrievably lost to us. If the information input into the simulation were not precisely the same as the actual information from that point in history, the differences would quickly propagate so that the simulation would bear little resemblance to the history. Supposing the individuals in question did have access to all the information they’d need to simulate the past, they’d have no need for the simulation, because they’d already have complete informational access to the past. It suffers similar problems to your sandboxed anthropomorphic AI proposal; provided you have all the resources necessary to actually do it, it ceases to be a good idea.
There are other possible motivations, but it’s not clear that there are any others that are as good or better, so we have little reason to suppose it will ever happen.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
This seems to be overly restrictive, but I don’t mind confining the discussion to this hypothesis.
I think you mean power per surface area has increased.
Yes, you are correct.
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
The roundtable was at SC’08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
I really doubt that current-human-fun-maximization is an evolutionarily stable goal system. I imagine that future posthuman morality and goals will evolve into something quite different.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge. Does this really not bother you in the slightest?
The roundtable was at SC’08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
While the leakage issue is important and I want to read a little more about this reference, I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon, the question is more of the increase in engineering difficulty as we progress vs the increase in our effective intelligence and simulation capacity.
In the specific case of leakage, even if it is a problem that persists far into the future, it just slightly lowers the growth exponent as we just somewhat lower the clock speeds. And even if leakage can never be fully prevented, eventually it itself can probably be exploited for computation.
I really doubt that current-human-fun-maximization is an evolutionarily stable goal system. I imagine that future posthuman morality and goals will evolve into something quite different.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge.
As a child I liked McDonald’s, bread, plain pizza and nothing more—all other foods were poisonous. I was convinced that my parents’ denial of my right to eat these wonderful foods, and their condemning me to terrible suffering as a result, was a sure sign of their utter lack of goodness.
Imagine if I could go back and fulfill that child’s wish to reduce its suffering. It would never then evolve into anything like my current self, and in fact may evolve into something that would suffer more or at the very least wish that it could be me.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
Our current existence is completely contingent on a great evolutionary epic of suffering on an astronomical scale. But suffering itself is just one little component of that vast mechanism, and forms no basis from which to judge the totality.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current time-line has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
At the end of the day morality is governed by evolution. There is an entire landscape of paradises that could exist; the question is what fitness advantage they provide their creator. The more they diverge from reality, the less utility they have in advancing knowledge of reality towards closure.
It looks like earth will evolve into a vast planetary hierarchical superintelligence, but ultimately it will probably be just one of many, and still subject to evolutionary pressure.
In the specific case of leakage, even if it is a problem that persists far into the future, it just slightly lowers the growth exponent as we just somewhat lower the clock speeds.
I disagree; I think that problems like this, unresolved, may or may not decrease the base of our exponent, but will cap its growth earlier.
I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon, the question is more of the increase in engineering difficulty as we progress vs the increase in our effective intelligence and simulation capacity.
On this point, we disagree, and I may be on the unpopular side of this disagreement. I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions. I actually think it quite likely that increase in computational power per Joule will bottom out in ten to twenty years. I wouldn’t be too surprised if exponential increase lasts thirty years, but forty seems unlikely, and fifty even less likely.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time. We’re talking about repeating history umpteen many times, creating suffering anew each time. It sounds to me like you are insisting that this suffering is worthwhile, even if the result of all of it will never be more than a data point in a historian’s database.
We live in a heartbreaking world. Under the assumption that we are not in a simulation, we can recognize facts like ‘suffering is decreasing over time’ and realize that it is our job to work to aid this progress. Under the assumption that we are in a simulation, we know that the capacity for this progress is already fully complete, and the agents who control it simply don’t care. If we are being simulated, it means that one or more entities have chosen to create unimaginable quantities of suffering for their own purposes—to your stated belief, for historical knowledge.
Your McDonald’s example doesn’t address this in the slightest. You were already a living, thinking being, and your parents took care of you in the right way in an attempt to make your future life better. They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former? Of course the former and latter are entirely different entities, but that serves as no argument whatsoever for choosing the former!
A person running such a simulation could create a simulated afterlife, without suffering, where each simulated intelligence would go after dying in the simulated universe. It’s like a nice version of Pascal’s Wager, since there’s no wagering involved. Such an afterlife wouldn’t last infinitely long, but it could easily be made long enough to outweigh any suffering in the simulated universe.
So far the only person who seems dedicated to making such a simulation is jacob cannell, and he already seems to be having enough trouble separating the idea from cached theistic assumptions.
The simulated afterlife wouldn’t need to outweigh the suffering in the first universe according to our value system, only according to the value system of the aliens who set up the simulation.
I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions.
Technology doesn’t really advance through ‘revolutions’, it evolves. Some aspects of that evolution appear to be rather remarkably predictable.
That aside, the current predictions do posit a slow-down around 2020 for the general lithography process, but there are plenty of labs researching alternatives. As the slow-down approaches, their funding and progress will accelerate.
But there is a much more fundamental and important point to consider, which is that circuit shrinkage is just one dimension of improvement amongst several. As that route of improvement slows down, other routes will become more profitable.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization. This route—neuromorphic hardware and its ilk—currently receives a tiny slice of the research budget, but this will accelerate as AGI advances and would accelerate even more if the primary route of improvement slowed.
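How many years of exponential gain a factor of 10^4 represents depends on the doubling time one assumes. The sketch below converts a speedup factor into doublings and then into years. The doubling times used are illustrative assumptions and not figures from the thread.

```python
import math


def doublings_needed(factor):
    """Number of doublings required to reach a given improvement factor."""
    return math.log2(factor)


def years_to_gain(factor, doubling_time_years):
    """Years of steady exponential progress equivalent to `factor`."""
    return doublings_needed(factor) * doubling_time_years


if __name__ == "__main__":
    factor = 10 ** 4
    print(round(doublings_needed(factor), 2))  # a little over 13 doublings
    for doubling_time in (0.75, 1.5, 2.0):     # assumed doubling times, in years
        print(doubling_time, round(years_to_gain(factor, doubling_time), 1))
```

A factor of 10^4 is a little over 13 doublings, so it matches "a decade" if performance doubles roughly every nine months and corresponds to about two decades at an 18-month doubling time.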
Another route of improvement is exponentially reducing manufacturing cost. The bulk of the price of high-end processors pays for the vast amortized R&D cost of developing the manufacturing node within the timeframe that the node is economical. Refined silicon is cheap and getting cheaper, research is expensive. The per transistor cost of new high-end circuitry on the latest nodes for a CPU or GPU is 100 times more expensive than the per transistor cost of bulk circuitry produced on slightly older nodes.
So if Moore’s Law stopped today, the cost of circuitry would still decay down to the bulk cost. This is particularly relevant to neuromorphic AGI designs as they can use a mass of cheap repetitive circuitry, just like the brain. So we have many other factors that will kick in even as Moore’s Law slows.
I suspect that we will hit a slow ramping wall around or by 2020, but these other factors will kick in and human-level AGI will ramp up, and then this new population and speed explosion will drive the next S-curve using a largely new and vastly more complex process (such as molecular nano-tech) that is well beyond our capability or understanding.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time.
It’s more or less equivalent from the perspective of a historical sim. A historical sim is a recreation of some branch of the multiverse near your own incomplete history that you then run forward to meet your present.
It sounds to me like you are insisting that this suffering is worthwhile
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
Likewise, I think that it is our noble duty to solve AI, morality, and control a Singularity in order to eliminate suffering and live in paradise.
I also understand that after doing that we will over time evolve into beings quite unlike what we are now and eventually look back at our prior suffering and view it from an unimaginably different perspective, just as my earlier McDonald’s-loving child-self evolved into a being with a completely different view of its prior suffering.
your parents took care of you in the right way in an attempt to make your future life better.
It was right from both their and my current perspective, it was absolutely wrong from my perspective at the time.
They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Of course! Just as we should create something better than ourselves. But ‘better’ is relative to a particular subjective utility function.
I understand that my current utility function works well now, that it is poorly tuned to evaluate the well-being of bacteria, just as poorly tuned to evaluate the well-being of future posthuman godlings, and most importantly—my utility function or morality will improve over time.
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former?
Imagine you are the creator. How do you define ‘positive’ or ‘productive’? From your perspective, or theirs?
There is an infinite variety of uninteresting paradises. In some, virtual humans do nothing but experience continuous rapturous bliss well outside the range of current drug-induced euphoria. There are complex agents that just set their reward functions to infinity and loop.
There is also a spectrum of very interesting paradises, all having the key differentiator that they evolve. I suspect that future godlings will devote most of their resources to creating these paradises.
I also suspect that evolution may operate again at an intergalactic or higher level, ensuring that paradises and all simulations somehow must pay for themselves.
At some point our descendants will either discover for certain they are in a sim and integrate up a level, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
If you could create a new big bang, an entire new singularity and new universe, would you?
You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
At some point our descendants will either discover for certain they are in a sim, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
If we had enough information to create an entire constructed reality of them in simulation, we’d have much more than we needed to just go ahead and intervene.
If you could create a new big bang, an entire new singularity and new universe, would you? You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
Some people would argue that it shouldn’t (this is an extreme of negative utilitarianism.) However, since we’re in no position to decide whether the universe gets to exist or not, the dispute is fairly irrelevant. If we’re in a position to decide between creating a universe like ours, creating one that’s much better, with more happiness and productivity and less suffering, and not creating one at all, though, I would have an extremely poor regard for the morality of someone who chose the first.
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
If my descendants think that all my suffering was worthwhile so that they could be born instead of someone else, then you know what? Fuck them. I certainly have a higher regard for my own ancestors. If they could have been happier, and given rise to a world as good as or better than this one, then who am I to argue that they should have been unhappy so I could be born instead? If, as you point out
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
then why not skip the historical recreation and go straight to simulating the paradises?
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization.
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization.
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
There is another significant speedup moving into analog computation.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines—because the brain evolved to solve AI problems in an energy efficient manner.
If you know you are working in the space of probabilistic/Bayesian-like networks, exact digital computations are extremely wasteful. Using tens or hundreds of thousands of transistors to do an exact digital multiply is useful for scientific or financial calculations, but it’s a pointless waste when the algorithm just needs to do a vast number of probabilistic weighted summations, for example.
Thanks. Hefty read, but this one paragraph is worth quoting:
Statistical inference algorithms involve parsing large quantities of noisy (often analog) data to extract digital meaning. Statistical inference algorithms are ubiquitous and of great importance. Most of the neurons in your brain and a growing number of CPU cycles on desktops are spent running statistical inference algorithms to perform compression, categorization, control, optimization, prediction, planning, and learning.
I had forgotten that term, “statistical inference algorithms”; I need to remember it.
Well, there’s also another quote worth quoting, and in fact the quote that is in my Mnemosyne database and which enabled me to look that thesis up so fast...
“In practice replacing digital computers with an alternative computing paradigm is a risky proposition.
Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable, because Moore’s Law has consistently enabled conventional von Neumann architectures to render alternatives unnecessary.
Besides Moore’s Law, digital computing also benefits from mature tools and expertise for optimizing performance at all levels of the system: process technology, fundamental circuits, layout and algorithms.
Many engineers are simultaneously working to improve every aspect of digital technology, while alternative technologies like analog computing do not have the same kind of industry juggernaut pushing them forward.”
This is true in general but this particular statement appears out of date:
“Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable”
That was true perhaps circa 2000, but we hit a speed/heat wall and since then everything has been going parallel.
You may see something similar happen eventually with analog computing once the market for statistical inference computation is large enough and/or we approach other constraints similar to the speed/heat wall.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
Ok. But this prevents you from directly improving your algorithms. And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity. I do agree that given a fixed algorithm this would plausibly lead to some speed-up.
There is another significant speedup moving into analog computation.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines—because the brain evolved to solve AI problems in an energy efficient manner.
This doesn’t follow. The brain evolved through a long path of natural selection. It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
Ok. But this prevents you from directly improving your algorithms.
Yes—which is part of the reason there is a big market for CPUs.
And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity.
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
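The claim that learning can live inside one fixed algorithm can be illustrated with a toy sketch. The perceptron below is an assumption chosen for brevity and is not a model of the cortical circuit; the names `predict` and `train` are hypothetical. The update rule never changes, only the weights do, and the same code ends up computing two different functions depending on what it is shown.

```python
def predict(weights, bias, x):
    """Fixed inference rule: threshold a weighted sum."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def train(samples, epochs=20, lr=1.0):
    """Fixed learning rule (perceptron update); only the weights change."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias


OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# The same code learns two different functions from two different inputs.
or_weights, or_bias = train(OR_DATA)
and_weights, and_bias = train(AND_DATA)
```

Hard-wiring `train` and `predict` into a circuit would fix the rule while leaving the learned weights free to change, which is the distinction being drawn here.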
A modern CPU is a jack-of-all-trades that is designed to do many things, most of which have little or nothing to do with the computational needs of AGI.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems,
The brain has roughly 10^15 noisy synapses that can switch around 10^3 times per second and store perhaps a bit each as well. (computation and memory integrated)
My computer has about 10^9 exact digital transistors in its CPU & GPU that can switch around 10^9 times per second. It has around the same amount of separate memory and around 10^13 bits of much slower disk storage.
These systems have similar peak throughputs of about 10^18 bits/second, but they are specialized for very different types of computational problems. The brain is very slow but massively wide, the computer is very narrow but massively fast.
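The arithmetic behind that comparison can be checked in a few lines. This is only a sketch using the order-of-magnitude figures quoted above; treating each switch as exactly one bit operation (`bits_per_switch=1`) is an assumption, and `peak_throughput` is a hypothetical helper name.

```python
# Figures quoted in the post; all are order-of-magnitude estimates.
BRAIN_SYNAPSES = 10**15          # noisy synapses
BRAIN_SWITCH_HZ = 10**3          # switches per second per synapse
COMPUTER_TRANSISTORS = 10**9     # CPU + GPU transistors
COMPUTER_SWITCH_HZ = 10**9       # switches per second per transistor


def peak_throughput(elements, switch_hz, bits_per_switch=1):
    """Peak bit operations per second if every element switches at full rate.

    bits_per_switch=1 is an assumption, not a measured value.
    """
    return elements * switch_hz * bits_per_switch


brain = peak_throughput(BRAIN_SYNAPSES, BRAIN_SWITCH_HZ)
computer = peak_throughput(COMPUTER_TRANSISTORS, COMPUTER_SWITCH_HZ)

print(f"brain:    {brain:.0e} bits/s")
print(f"computer: {computer:.0e} bits/s")
print(f"width ratio (brain/computer): {BRAIN_SYNAPSES // COMPUTER_TRANSISTORS:.0e}")
print(f"speed ratio (computer/brain): {COMPUTER_SWITCH_HZ // BRAIN_SWITCH_HZ:.0e}")
```

Both products come out the same, while the width and speed ratios are each a factor of a million in opposite directions.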
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
Our computers are specialized and extremely adept at doing the whole spectrum of computational problems brains suck at—problems that involve long complex chains of exact computations, problems that require massive speed and precision but less bulk processing and memory.
So to me, yes, it’s obvious that the brain is highly efficient at doing AGI-type stuff, almost because that’s how we define AGI-type stuff: it’s all the stuff that brains are currently much better than computers at!
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
This limits the amount of modification one can do. Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of. That’s not at all the same thing. The brain’s architecture is formed by a succession of modifications to much simpler entities. The successive, blind modification has been stuck with all sorts of holdovers from our early chordate ancestors and a lot from our more recent ancestors.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
Easy is a misleading term in this context. I certainly can’t factor a forty digit number but for a computer that’s trivial. Moreover, some operations are only difficult because we don’t know an efficient algorithm. In any event, if your speedup is only occurring for the narrow set of tasks which humans can do decently such as vision, then you aren’t going to get a very impressive AGI. The ability to do face recognition in only a tiny fraction of the time it would take a person is not an impressive ability.
The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
This limits the amount of modification one can do.
Limits it compared to what? Every circuit is equivalent to a program. The circuit of a general processor is equivalent to a program which simulates another circuit: the program which it keeps in memory.
Current von Neumann processors are not the only circuits which have this simulation-flexibility. The brain has similar flexibility using very different mechanisms.
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
Not at all true. Statistical inference algorithms such as Bayesian networks and the cortex’s algorithm are extremely flexible and still benefit greatly from ‘hard-wiring’.
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of.
This is like saying we don’t know that Usain Bolt is extremely adept at running, he’s just better than anything else that we know of. The latter sentence in each case of course is true, but it doesn’t impinge on the former.
But my larger point was that the brain and current computers occupy two very different regions in the space of possible circuit designs, and are rather clearly optimized for a different slice over the space of computational problems.
There are some routes that we can obviously improve on the brain at the hardware level. Electronic circuits are orders of magnitude faster, and eventually we can make them much denser and thus much more massive.
However, it is much more of an open question in computer science if we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely, or at least reach some nearly global maximum.
The brain’s architecture is formed by a succession of modifications to much simpler entities.
Yes—this is an excellent strategy for solving complex optimization problems.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
Easy is a misleading term in this context.
Yes, and on second thought—largely mistaken. To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms are running time exponential for the number of input bits. That makes them ‘hard’ in the scalability sense. But factoring small primes is still easy in the absolute cost sense.
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short. Physics is hard in the algorithmic sense, AGI seems to be quite hard, etc.
In any event, if your speedup is only occurring for the narrow set of tasks which humans can do decently such as vision, then you aren’t going to get a very impressive AGI
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of . . processing visual input data.
AGI hardware could take advantage of specialized statistical inference circuitry and still be highly general.
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently. Of course, that other section of the bookstore—filled with books about things computers can do decently, is growing at an exciting pace.
The ability to do face recognition in only a tiny fraction of the time it would take a person is not an impressive ability.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast . . it’s not impressive?
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication possibly with some non-linear stepping or a normalization. Neural nets, bayesian nets, whatever—if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
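As a rough illustration of that instruction mix, here is a minimal sketch of one inference pass: matrix-vector products, a non-linear step, and a final normalization. The layer sizes and weights are made up, and the choice of ReLU and softmax is an assumption standing in for “some non-linear stepping or a normalization”; no particular AGI design is implied.

```python
import math


def matvec(matrix, vector):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """A simple non-linear step: clamp negative values to zero."""
    return [max(0.0, v) for v in vector]


def softmax(vector):
    """Normalize scores into a probability distribution."""
    peak = max(vector)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in vector]
    total = sum(exps)
    return [e / total for e in exps]


def infer(layers, inputs):
    """Run the inputs through each weight matrix, then normalize the result."""
    activations = inputs
    for weights in layers[:-1]:
        activations = relu(matvec(weights, activations))
    return softmax(matvec(layers[-1], activations))


# Toy weights, made up for illustration only.
LAYERS = [
    [[0.5, -1.0, 0.25], [1.5, 0.5, -0.5]],   # 3 inputs -> 2 hidden units
    [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]],  # 2 hidden units -> 3 outputs
]
probabilities = infer(LAYERS, [1.0, 2.0, 3.0])
```

Nearly all of the work is the multiply-and-add inside `matvec`, repeated once per weight, which is the kind of uniform operation that dedicated hardware handles well.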
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Not at all true. Statistical inference algorithms such as Bayesian networks and the cortex’s algorithm are extremely flexible and still benefit greatly from ‘hard-wiring’.
The general trend should still occur this way. I’m also not sure that you can reach that conclusion about the cortex given that we don’t have a very good understanding of how the brain’s algorithms function.
The cortex doesn’t have a specialized vision circuit; there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of . . processing visual input data.
That seems plausibly correct but we don’t actually know that. Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms are running time exponential for the number of input bits.
Incorrect, the best factoring algorithms are subexponential. See for example the quadratic sieve and the number field sieve, both of which have subexponential running time. This has been true since at least the early 1980s (there are other now obsolete algorithms that were around before then that may have had slightly subexponential running time; I don’t know enough about them in detail to comment).
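To make “subexponential” concrete, the sketch below compares rough step counts, as powers of two, for trial division against the usual heuristic estimates for the two sieves, written in L-notation: L_n[a, c] = exp(c (ln n)^a (ln ln n)^(1-a)). The o(1) terms are dropped, so these are ballpark figures only, and the function names are hypothetical.

```python
import math


def log2_cost_trial_division(bits):
    """Trial division up to sqrt(n): about 2^(bits/2) steps, truly exponential."""
    return bits / 2


def log2_l_notation(bits, alpha, c):
    """log2 of L_n[alpha, c] for an n that is `bits` bits long, ignoring o(1)."""
    ln_n = bits * math.log(2)
    exponent = c * ln_n**alpha * math.log(ln_n) ** (1 - alpha)
    return exponent / math.log(2)


def log2_cost_quadratic_sieve(bits):
    """Heuristic quadratic sieve estimate: L_n[1/2, 1]."""
    return log2_l_notation(bits, 1 / 2, 1.0)


def log2_cost_number_field_sieve(bits):
    """Heuristic general number field sieve estimate: L_n[1/3, (64/9)^(1/3)]."""
    return log2_l_notation(bits, 1 / 3, (64 / 9) ** (1 / 3))


# Columns: input bits, then log2(steps) for each method.
for bits in (256, 512, 1024, 2048):
    print(
        bits,
        round(log2_cost_trial_division(bits)),
        round(log2_cost_quadratic_sieve(bits)),
        round(log2_cost_number_field_sieve(bits)),
    )
```

The exponent for trial division doubles whenever the input length doubles, while the sieve exponents grow more slowly than the input length but still faster than its logarithm.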
But factoring small primes is still easy in the absolute cost sense.
Factoring primes is always easy. For any prime p, it has no non-trivial factorizations. You seem to be confusing factorization with primality testing. The second is much easier than the first; we’ve had Agrawal’s algorithm which is provably polynomial time for about a decade. Prior to that we had a lot of efficient tests that were empirically faster than our best factorization procedures. We can determine the primality of numbers much larger than those we can factor.
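The gap between the two problems shows up even in toy code. The sketch below uses a deterministic Miller-Rabin test rather than the provably polynomial algorithm mentioned above (an assumption made for brevity; this fixed set of bases is known to be sufficient for all n below 2**64), next to brute-force trial division for factoring.

```python
def is_prime(n):
    """Deterministic Miller-Rabin; these bases suffice for all n below 2**64."""
    if n < 2:
        return False
    bases = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in bases:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in bases:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def trial_division(n):
    """Factor n by brute force; the loop count grows like sqrt(n)."""
    factors, p = [], 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors


# Primality of a 61-bit number is settled with a handful of modular powers.
mersenne_is_prime = is_prime(2**61 - 1)
# Factoring is only quick here because the example number has small factors.
small_factors = trial_division(600851475143)
```

Running `trial_division` on a product of two 30-bit primes would already take on the order of a billion loop iterations, while `is_prime` stays at a few dozen modular exponentiations.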
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short.
Really? The general number field sieve is simple and short? Have you tried to understand it or write an implementation? Simple and short compared to what exactly?
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently.
There are some tasks where we can argue that humans are doing a good job by comparison to others in the animal kingdom. Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense. Note also that most humans can’t do math very well (Apparently 10% or so of my calculus students right now can’t divide one fraction by another). And the vast majority of poetry is just awful. It isn’t even obvious to me that the “good” poetry isn’t labeled that way in part simply from social pressure.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast . . it’s not impressive?
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation. Improved facial recognition isn’t going to help much with most of the interesting stuff, like recursive self-improvement, constructing new algorithms, making molecular nanotech, finding a theory of everything, figuring out how Fred and George tricked Rita, etc.
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication possibly with some non-linear stepping or a normalization. Neural nets, bayesian nets, whatever—if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
Incorrect, the best factoring algorithms are subexponential.
To clarify, subexponential does not mean polynomial, but super-polynomial.
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
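A minimal sketch of the procedure from that paper, as best understood here: draw a non-increasing random sequence down to 1, multiply together the entries that are prime, and accept the product r with probability r/N. The function names, the fixed seed, and the trial-division `is_prime` helper are assumptions for illustration; a real implementation would use a fast primality test.

```python
import random


def is_prime(n):
    """Trial-division primality check; fine for the small N used here."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True


def random_factored_number(limit, rng):
    """Return (r, prime_factors) with r intended to be uniform in [1, limit]."""
    while True:
        # Build a non-increasing sequence limit >= s1 >= s2 >= ... >= 1.
        sequence, s = [], limit
        while s > 1:
            s = rng.randint(1, s)
            sequence.append(s)
        # Repeated primes in the sequence become repeated factors.
        primes = [s for s in sequence if is_prime(s)]
        r = 1
        for p in primes:
            r *= p
        # Accept r with probability r / limit, otherwise start over.
        if r <= limit and rng.random() < r / limit:
            return r, primes


rng = random.Random(0)  # fixed seed so the run is repeatable
number, factors = random_factored_number(10**6, rng)
```

The factorization comes for free because only primality tests are ever needed; nothing is factored at any point.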
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential. I know that when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
They like keeping things invariant under polynomial transformations of the input, since that has been observed to be a somewhat “natural” class. This is one of the areas where it seems to not quite work.
Hmm, interesting: in the notation that Scott says is standard in complexity theory, my earlier statement that factoring is “subexponential” is wrong, even though it is slower growing than exponential. But apparently Greg Kuperberg is perfectly happy labeling something like 2^(n^(1/2)) as subexponential.
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
Yes, and this tradeoff exists today with some rough mix between general processors and more specialized ASICs.
I think this will hold true for a while, but it is important to point out a few subpoints:
If Moore’s law slows down this will shift the balance further towards specialized processors.
Even most ‘general’ processors today are actually a mix of CISC and vector processing, with more and more performance coming from the less-general vector portion of the chip.
For most complex real world problems algorithms eventually tend to have much less room for improvement than hardware, even if algorithmic improvements initially dominate. After a while algorithmic improvements end within the best complexity class and then further improvements are just constants and are swamped by hardware improvement.
Modern GPUs for example have 16 or more vector processors for every general logic processor.
The brain is like a very slow processor with massively wide dedicated statistical inference circuitry.
As a result of all this (and the point at the end of my last post) I expect that future AGIs will be built out of a heterogeneous mix of processors but with the bulk being something like a wide-vector processor with a lot of very specialized statistical inference circuitry.
This type of design will still have huge flexibility by having programmability at the network architecture level: it could for example simulate humanish and various types of mammalian brains as well as a whole range of radically different mind architectures all built out of the same building blocks.
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of . . processing visual input data.
That seems plausibly correct but we don’t actually know that.
We have pretty good maps of the low-level circuitry in the cortex at this point and it’s clearly built out of a highly repetitive base circuit pattern, similar to how everything is built out of cells at a lower level. I don’t have a single good introductory link, but it’s called the laminar cortical pattern.
Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
Yes, there are slight variations, but slight is the keyword. The cortex is highly general: the ‘visual’ region develops very differently in blind people, for example, creating entirely different audio processing networks much more powerful than what most people have.
The flexibility is remarkable—if you hook up electrodes to the tongue that send a rough visual signal from a camera, in time the cortical regions connected to the tongue start becoming rough visual regions and limited tongue based vision is the result.
Incorrect, the best factoring algorithms are subexponential.
I stand corrected on prime factorization—I saw the exp(....) part and assumed exponential before reading into it more.
Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense.
This is a good point, but note the huge difference between the abilities or efficiency of an entire human mind vs the efficiency of the brain’s architecture or the efficiency of the lower level components from which it is built—such as the laminar cortical circuit.
I think this discussion started concerning your original point:
It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
The cortical algorithm appears to be a pretty powerful and efficient low level building block. In evolutionary terms it has been around for much longer than human brains and naturally we can expect that it is much closer to optimality in the design configuration space in terms of the components it is built from.
As we go up a level to higher level brain architectures that are more recent in evolutionary terms we should expect there to be more room for improvement.
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation.
The mammalian cortex is not specialized for particular tasks; this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
The mammalian cortex is not specialized for particular tasks; this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
I’ve read a great deal about the cortex, and my immediate reaction to your statement was “no, that’s just not how it works”. (strong priors)
About one minute later on the Prosopagnosia wikipedia article, I find the first reference to this idea (that of congenital Prosopagnosia):
The idea of congenital prosopagnosia appears to be a new theory supported by one researcher and one? study:
Dr Jane Whittaker, writing in 1999, described the case of a Mr. C. and referred to other similar cases (De Haan & Campbell, 1991, McConachie, 1976 and Temple, 1992).[7] The reported cases suggest that this form of the disorder may be heritable and much more common than previously thought (about 2.5% of the population may be affected), although this congenital disorder is commonly accompanied by other forms of visual agnosia, and may not be “pure” prosopagnosia
The last part about it being “commonly accompanied by other forms of visual agnosia” gives it away—this is not anything close to what you originally thought/claimed, even if this new research is actually correct.
Known cases of true prosopagnosia are caused by brain damage—what this research is describing is probably a disorder of the higher region (V4 I believe) which typically learns to recognize faces and other complex objects.
However, there is an easy way to cause prosopagnosia during development—prevent the creature from ever seeing faces.
I don’t have the link on hand, but there have been experiments in cats where you mess with their vision by using grating patterns or carefully controlled visual environments, and you can create cats that literally can’t even see vertical lines.
So even the simplest most basic thing which nature could hard-code—a vertical line feature detector, actually develops from the same extremely flexible general cortical circuit—the same circuit which can learn to represent everything from sounds to quantum mechanics.
Humans can represent a massive number of faces, and in general the brain’s vast information storage capacity over the genome (10^15 ish vs 10^9 ish) more or less requires a generalized learning circuit.
The cortical circuits do basically nothing but fire randomly when you are born—you really are a blank slate in that respect (although obviously the rest of the brain has plenty of genetically fixed functionality).
Of course the arrangement of the brain’s regions with respect to sensory organs and its overall wiring architecture do naturally lead to the familiar specializations of brain regions, but really one should consider this a developmental attractor: information is colonizing each cortex anew, but the similar architecture and similarity of information ensures that two brains end up having largely overlapping colonizations.
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
There are all sorts of aspects of humans that are normally somewhat—or nearly entirely—hard-wired. The cortex just doesn’t tend to be. Even the parts of the cortex that are similarly specialised in most humans seem to be so due to what they are connected to. (As can be seen by looking at how the atypical cases have adapted differently.) It would surprise me if the inability to recognise faces was caused by a dysfunction in the cortex specifically.
Disclaimer: I disagree with nearly everything else Jacob has said in this thread. This position specifically appears to be well researched.
However, it is much more of an open question in computer science if we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely, or at least reach some nearly global maximum.
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency. Humans have been selected by their ability to reproduce in a complicated environment. Efficient intelligence helps, but there’s selection for a lot of other things, such as good immune systems and decent muscle systems. A lot of the selection that was brain selection was probably simply around the fantastically complicated set of tasks involved in navigating human societies. Note that human brain size on average has decreased over the last 50,000 years. Humans are subject to a lot of different selection pressures.
(Tangent: This is related to how at a very vague level we should expect genetic algorithms to outperform evolution at optimizing tasks. Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.)
It is quite possible that evolution had enough time to solve that problem completely [statistical inference in the cortex], or at least reach some nearly global maximum
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency.
I largely agree with your point about human evolution, but my point was about the laminar cortical circuit which is shared in various forms across the entire mammalian lineage and has an analog in birds.
It’s a building block pattern that appears to have a long evolutionary history.
Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.
Yes, but there is a limit to this of course. We are, after all, talking about general intelligence.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current time-line has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
It seems you’re arguing that our successors will develop a preference for simulating universes like ours over paradises. If that’s what you’re arguing, then what reason do we have to believe that this is probable?
If their preferences do not change significantly from ours, it seems highly unlikely that they will create simulations identical to our current existence. And out of the vast space of possible ways their preferences could change, selecting that direction in the absence of evidence is a serious case of privileging the hypothesis.
To uploads, yes, but a faithful simulation of the universe, or even a small portion of it, would have to track a lot more variables than the processes of the human minds within it.
Optimal approximate simulation algorithms are all linear with respect to total observer sensory input. This relates to the philosophical issue of observer dependence in QM and whether or not the proverbial unobserved falling tree actually exists.
So the cost of simulating a matrix with N observers is not expected to be dramatically more than simulating the N observer minds alone: C*N. The phenomenon of dreams is something of a practical proof.
Variables that aren’t being observed still have to be tracked, since they affect the things that are being observed.
Dreams are not a very good proof of concept given that they are not coherent simulations of any sort of reality, and can be recognized as artificial not only after the fact, but during with a bit of introspection and training.
In dreams, large amounts of data can be omitted or spontaneously introduced without the dreamer noticing anything is wrong unless they’re lucid. In reality, everything we observe can be examined for signs of its interactions with things that we haven’t observed, and that data adds up to pictures that are coherent and consistent with each other.
Probabilities were for example purposes only. I made them up because they were nice to calculate with and sounded halfway reasonable. I will not defend them. If you request that I come up with my real probability estimates, I will have to think harder.
Ah, well your more general point was well-made. I don’t think better numbers are really important. It’s all too fuzzy for me to be at all confident about.
I still retain my belief that it is implausible that we are in a universe simulation. If I am in a simulation, I expect that it is more likely that I am by myself (and that conscious or not, you are part of the simulation created in response to me), moderately more likely that there are a small group of humans being simulated with other humans and their environment dynamically generated, and overall very unlikely that the creators have bothered to simulate any part of physical reality that we aren’t directly observing (including other people). Ultimately, none of these seem likely enough for me to bother considering for very long.
The first part of your belief that “it is implausible that we are in a universe simulation” appears to be based on the argument:
If simulationism, then solipsism is likely.
Solipsism is unlikely, so . . .
Chain of logic aside, simulationism does not imply solipsism. Simulating N localized space-time patterns in one large simulation can be significantly cheaper than simulating N individual human simulations. So some simulated individuals may exist in small solipsist sims, but the great majority of conscious sims will find themselves in larger shared simulations.
Presumably a posthuman intelligence on earth would be interested in earth as a whole system, and would simulate this entire system. Simulating full human-mind equivalents is something of a sweet spot in the space of approximations.
There is a massive sweet spot, an extremely efficient method, of simulating a modern computer, which is to simulate it at the level of its Turing-equivalent circuit. Simulating it at a level below this, say at the molecular level, is just a massive waste of resources, while any simulation above this loses accuracy completely.
It is postulated that a similar simulation scale separation exists for human minds, which naturally relates to uploads and AI.
I don’t understand why human-mind equivalents are special in this regard. This seems very anthropocentric, but I could certainly be misinterpreting what you said.
Cheaper, but not necessarily more efficient. It matters which answers one is looking for, or which goals one is after. It seems unlikely to me that my life is directed well enough to achieve interesting goals or answer interesting questions that a superintelligence might pose, but it seems even more unlikely that simulating 6 billion humans, in the particular way they appear (to me) to exist is an efficient way to answer most questions either.
I’d like to stay away from telling God what to be interested in, but out of the infinite space of possibilities, Earth seems too banal and languorous to be the one in N that have been chosen for the purpose of simulation, especially if the basement universe has a different physics.
If the basement universe matches our physics, I’m betting on the side that says simulating all the minds on Earth and enough other stuff to make the simulation consistent is an expensive enough proposition that it won’t be worthwhile to do it many times. Maybe I’m wrong; there’s no particular reason why simulating all of humanity in the year of 2011 needs to take more than 10^18 J, so maybe there’s a “real” Milky Way that’s currently running 10^18 planet-scale sims. Even that doesn’t seem like a big enough number to convince me that we are likely to be one of those.
I meant there is probably some sweet spot in the space of [human-mind] approximations, because of scale separation, which I elaborated on a little later with the computer analogy.
Cheaper implies more efficient, unless the individual human simulations somehow have a dramatically higher per capita utility.
A solipsist universe has extraneous patchwork complexity. Even assuming that all of the non-biological physical processes are grossly approximated (not unreasonable given current simulation theory in graphics), they still may add up to a cost exceeding that of one human mind.
But of course a world with just one mind is not an accurate simulation, so now you need to populate it with a huge number of pseudo-minds which are functionally indistinguishable from the perspective of our sole real observer but somehow use far fewer computational resources.
Now imagine a graph of simulation accuracy vs computational cost of a pseudo-mind. Rather than being linear, I believe it is sharply exponential, or J-shaped with a single large spike near the scale separation point.
The jumping point is where the pseudo-mind becomes an actual conscious observer in its own right.
The rationale for this cost model and the scale separation point can be derived from what we know about simulating computers.
Perhaps not your life in particular, but human life on earth today?
Simulating 6 billion humans will probably be the only way to truly understand what happened today from the perspective of our future posthuman descendants. The alternatives are . . . creating new physical planets? Simulation will be vastly more efficient than that.
The basement reality is highly unlikely to have different physics. The vast majority of simulations we create today are based on approximations of currently understood physics, and I don’t expect this to ever change: simulations have utility for simulators.
I’m a little confused about the 10^18 number.
From what I recall, at the limits of computation one kg of matter can hold roughly 10^30 bits, and a human mind is in the vicinity of 10^15 bits or less. So at the molecular limits a kg of matter could hold around a quadrillion souls—an entire human galactic civilization. A skyscraper of such matter could give you 10^8 kg .. and so on. Long before reaching physical limits, posthumans would be able to simulate many billions of entire earth histories. At the physical molecular limits, they could turn each of the moon’s roughly 10^22 kg into an entire human civilization, for a total of 10^37 minds.
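The storage arithmetic in this comment can be checked with a short sketch. The constants are the order-of-magnitude figures quoted above, not measurements, and the function names are made up for illustration.

```python
import math

# Order-of-magnitude figures quoted in the comment above (not measured values).
BITS_PER_KG_LIMIT = 1e30   # claimed storage limit for 1 kg of matter
BITS_PER_MIND = 1e15       # claimed upper estimate for one human mind
MOON_MASS_KG = 1e22        # the comment's rounded figure for the moon


def minds_per_kg(bits_per_kg: float = BITS_PER_KG_LIMIT,
                 bits_per_mind: float = BITS_PER_MIND) -> float:
    """How many mind-sized memory images fit in one kilogram."""
    return bits_per_kg / bits_per_mind


def minds_in_mass(mass_kg: float) -> float:
    """How many mind-sized memory images fit in a given mass."""
    return mass_kg * minds_per_kg()


if __name__ == "__main__":
    print(f"minds per kg:  10^{math.log10(minds_per_kg()):.0f}")
    print(f"minds in moon: 10^{math.log10(minds_in_mass(MOON_MASS_KG)):.0f}")
```

This only counts storage; it says nothing about the power needed to run those minds, which is a separate question.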
The potential time scale compression is nearly as vast: estimated speed limits are around 10^15 ops/bit/sec in ordinary matter at ordinary temperatures, vs at most 10^4 ops/bit/sec in human brains, although that limit is not dramatically higher than the 10^9 ops/bit/sec of today’s circuits. The potential speedup of more than 10^10 over biological brains allows for about one hundred years per second of sidereal time.
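As a rough check on the time compression claim, here is a minimal sketch that converts a speedup factor into subjective years per wall-clock second. The switching rates are the ones quoted above; the helper name is hypothetical. With these inputs a 10^10 speedup comes out at a few hundred subjective years per second, the same order of magnitude as the figure in the comment.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # roughly 3.16e7 seconds

# Switching rates quoted in the comment (ops per bit per second).
MATTER_LIMIT_RATE = 1e15
BRAIN_RATE = 1e4


def subjective_years_per_second(speedup: float) -> float:
    """Subjective years experienced per wall-clock second at a given speedup."""
    return speedup / SECONDS_PER_YEAR


# Largest speedup the quoted rates would allow.
max_speedup = MATTER_LIMIT_RATE / BRAIN_RATE

if __name__ == "__main__":
    for s in (1e10, max_speedup):
        print(f"speedup {s:.0e}: {subjective_years_per_second(s):,.0f} years/second")
```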
I understand that for any mind, there is probably an “ideal simulation level” which has the fidelity of a more expensive simulation at a much lower cost, but I still don’t understand why human-mind equivalents are important here.
Which seems pretty reasonable to me. Why should the value of simulating minds be linear rather than logarithmic in the number of minds?
Agreed, but I also think that the cost of simulating the relevant stuff necessary to simulate N minds might be close to linear in N.
I agree, though as a minor note if cost is the Y-axis the graph has to have a vertical asymptote, so it has to grow much faster than exponential at the end. Regardless, I don’t think we can be confident that consciousness occurs at an inflection point or a noticeable bend.
I suspect that some pseudo-minds must be conscious observers some of the time, but that they can be turned off most of the time and just be updated offline with experiences that their conscious mind will integrate and patch up without noticing. I’m not sure this would work with many mind-types, but I think it would work with human minds, which have a strong bias to maintaining coherence, even at the cost of ignoring reality. If I’m being simulated, I suspect that this is happening even to me on a regular basis, and possibly happening much more often the less I interact with someone.
Updating on the condition that we closely match the ancestors of our simulators, I think it’s pretty reasonable that we could be chosen to be simulated. This is really the only plausible reason I can think of to choose us in particular. I’m still dubious as to the value doing so will have to our descendants.
Actually, I made a mistake, so it’s reasonable to be confused. 20 W seems to be a reasonable upper limit to the cost of simulating a human mind. I don’t know how much lower the lower bound should be, but it might not be more than an order of magnitude less. This gives 10^11 W for six billion, (4x) 10^18 J for one year.
I don’t think it’s reasonable to expect all the matter in the domain of a future civilization to be used to its computational capacity. I think it’s much more likely that the energy output of the Milky Way is a reasonable bound on how much computation will go on there. This certainly doesn’t have to be the case, but I don’t see superintelligences annihilating matter at a dramatically faster rate in order to provide massively more power to the remainder of the matter around. The universe is going to die soon enough as it is. (I could be very short sighted about this) Anyway, energy output of the Milky Way is around 5x10^36 W. I divided this by Joules instead of by Watts, so the second number I gave was 10^18, when it should have been (5x) 10^25.
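The energy budget in the last two paragraphs can be written out as a small sketch. All inputs are the figures given in the comment (20 W per mind, six billion minds, 5x10^36 W for the galaxy); the variable names are illustrative. Dividing watts by watts gives the number of planet-scale sims that could run concurrently.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # roughly 3.16e7 seconds

# Figures taken from the comment; treat them as rough upper bounds.
WATTS_PER_MIND = 20.0     # assumed upper limit for simulating one human mind
POPULATION = 6e9
MILKY_WAY_WATTS = 5e36    # quoted energy output of the galaxy

# Power needed to run every mind on the planet in real time.
planet_power_w = WATTS_PER_MIND * POPULATION

# Energy for one simulated year of the whole planet.
planet_energy_per_year_j = planet_power_w * SECONDS_PER_YEAR

# Watts divided by watts: how many planet-scale sims the galaxy could power at once.
concurrent_planet_sims = MILKY_WAY_WATTS / planet_power_w

if __name__ == "__main__":
    print(f"planet power:      {planet_power_w:.1e} W")
    print(f"energy per year:   {planet_energy_per_year_j:.1e} J")
    print(f"concurrent sims:   {concurrent_planet_sims:.1e}")
```

The sketch ignores the cost of simulating anything other than minds, which the thread treats as a separate open question.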
I maintain that energy, not quantum limits of computation in matter, will bound computational cost on the large scale. Throwing our moon into the Sun in order to get energy out of it is probably a better use of it as raw materials than turning it into circuitry. Likewise for time compression, convince me that power isn’t a problem.
Simply because we are discussing simulating the historical period in which we currently exist.
The premise of the SA is that the posthuman ‘gods’ will be interested in simulating their history. That history is not dependent on a smattering of single humans isolated in boxes, but the history of the civilization as a whole system.
If the N minds were separated by vast gulfs of space and time this would be true, but we are talking about highly connected systems.
Imagine the flow of information in your brain. Imagine the flow of causality extending back in time, the flow of information weighted by its probabilistic utility in determining my current state.
The stuff in immediate vicinity to me is important, and the importance generally falls off according to an inverse square law with distance away from my brain. Moreover, even from the stuff near me at one time step, only a tiny portion of it is relevant. At this moment my brain is filtering out almost everything except the screen right in front of me, which can be causally determined by a program running on my computer, dependent on recent information in another computer in a server somewhere in the midwest a little bit ago, which was dependent on information flowing out from your brain previously . .. and so on.
So simulating me would more or less require your simulation as well; it’s very hard to isolate a mind. You might as well try to simulate just my left prefrontal cortex. The entire distinction of where one mind begins and ends is something of a spatial illusion that disappears when you map out the full causal web.
If you want to simulate some program running on one computer on a new machine, there is an exact vertical inflection wall in the space of approximations where you get a perfect simulation which is just the same program running on the new machine. This simulated program is in fact indistinguishable from the original.
Yes, but because of the network effects mentioned earlier it would be difficult and costly to do this on a per mind basis. Really it’s best to think of the entire earth as a mind for simulation purposes.
Could you turn off part of cortex and replace it with a rough simulation some of the time without compromising the whole system? Perhaps sometimes, but I doubt that this can give a massive gain.
Why do we currently simulate (think about) our history? To better understand ourselves and our future.
I believe there are several converging reasons to suspect that vaguely human-like minds will turn out to be a persistent pattern for a long time, perhaps as persistent as eukaryotic cells. Adaptive radiation will create many specializations and variations, but the basic pattern of a roughly 10^15 bit mind and its general architecture may turn out to be a fecund replicator and building block for higher level pattern entities.
It seems plausible some of these posthumans will actually descend from biological humans alive today. They will be very interested in their ancestors, and especially the ancestors they knew in their former life who died without being uploaded or preserved.
Humans have been thinking about this for a while. If you could upload and enter virtual heaven, you could have just about anything that you want. However, one thing you may very much desire would be reunification with former loved ones, dead ancestors, and so on.
So once you have enough computational power, I suspect there will be a desire to use it in an attempt to resurrect the dead.
You are basically taking the current efficiency of human brains as the limit, which of course is ridiculous on several fronts. We may not reach the absolute limits of computation, but they are the starting point for the SA.
We already are within six orders of magnitude of the speed limit of ordinary matter (10^9 bit ops/sec vs 10^15), and there is every reason to suspect we will get roughly as close to the density limit.
There are several measures. The number of bits storable per unit mass determines how many human souls you can store in memory per unit mass.
Energy relates to the bit operations per second and the speed of simulated time.
I was assuming computing at regular earth temperatures within the range of current brains and computers. At the limits of computation discussed earlier, 1 kg of matter at normal temperatures implies an energy flow of around 1 to 20 W and can simulate roughly 10^15 virtual humans 10^10 times faster than the current human rate of thought. This works out to about one hundred years per second.
So at the limits of computation, 1 kg of ordinary matter at room temperature should give about 10^25 human lifetimes per joule. One square meter of high efficiency solar panel could power several hundred kilograms of computational substrate.
So at the limits of computation, future posthuman civilizations could simulate a truly astronomical number of human lifetimes in one second using less power and mass than our current civilization.
No need to disassemble planets. Using the whole surface of a planet gives a multiplier of 10^14 over a single kilogram. Using the entire mass only gives a further 10^8 multiple over that or so, and is much much more complex and costly to engineer. (When you start thinking of energy in terms of human souls, this becomes morally relevant.)
If this posthuman civilization simulates human history for a billion years instead of a second, this gives another 10^16 multiplier.
Using much more reasonable middle of the road estimates:
Say tech may bottom out at a limit within half (in exponential terms) of the maximum—say 10^13 human lifetimes per kg per joule vs 10^25.
The posthuman civ stabilizes at around 10^10 1kg computers (not much more than we have today).
The posthuman civ engages in historical simulation for just one year. (10^7 seconds).
That is still 10^30 simulated human lifetimes, vs roughly 10^11 lifetimes in our current observational history.
Those are still astronomical odds for observing that we currently live in a sim.
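The middle-of-the-road estimate above is a product of three factors, which can be sketched directly. One assumption is needed to make the units work: the quoted "lifetimes per kg per joule" is read here as a rate per kilogram per second of wall-clock time at roughly one watt. That reading is an interpretation, not something the comment states, and the names below are illustrative.

```python
import math

# Inputs from the list above. Assumption: the per-joule figure is treated as
# a per-second rate at roughly one watt per kilogram of computer.
LIFETIMES_PER_KG_PER_SECOND = 1e13
COMPUTER_MASS_KG = 1e10        # 10^10 one-kilogram computers
RUN_TIME_SECONDS = 1e7         # the comment's rounded figure for one year
HISTORICAL_LIFETIMES = 1e11    # lifetimes in observed history, per the comment


def simulated_lifetimes(rate: float = LIFETIMES_PER_KG_PER_SECOND,
                        mass_kg: float = COMPUTER_MASS_KG,
                        seconds: float = RUN_TIME_SECONDS) -> float:
    """Total simulated lifetimes for a given rate, hardware mass and run time."""
    return rate * mass_kg * seconds


def simulated_to_historical_ratio() -> float:
    """How many simulated lifetimes there are per historical lifetime."""
    return simulated_lifetimes() / HISTORICAL_LIFETIMES


if __name__ == "__main__":
    print(f"simulated lifetimes: 10^{math.log10(simulated_lifetimes()):.0f}")
    print(f"ratio to history:    10^{math.log10(simulated_to_historical_ratio()):.0f}")
```

Changing any one input by a few orders of magnitude still leaves a very large ratio, which is the point the comment is making; whether the inputs are achievable is what the rest of the thread disputes.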
This is very upsetting, I don’t have anything like the time I need to keep participating in this thread, but it remains interesting. I would like to respond completely, which means that I would like to set it aside, but I’m confident that if I do so I will never get back to it. Therefore, please forgive me for only responding to a fraction of what you’re saying.
I thought context made it clear that I was only talking about the non-mind stuff being simulated as being an additional cost perhaps nearly linear in N. Very little of what we directly observe overlaps except our interaction with each other, and this was all I was talking about.
Why can’t a poor model (low fidelity) be conscious? We just don’t know enough about consciousness to answer this question.
I really disagree, but I don’t have time to exchange each other’s posteriors, so assume this dropped.
I think this is evil, but I’m not willing to say whether the future intelligences will agree or care.
I said it was a reasonable upper bound, not a reasonable lower bound. That seems trivial.
Most importantly, you’re assuming that all circuitry performs computation, which is clearly impossible. That leaves us to debate about how much of it can, but personally I see no reason that the computational minimum cost will closely (even in an exponential sense) be approached. I am interested in your reasoning why this should be the case though, so please give me what you can in the way of references that led you to this belief.
Lastly, but most importantly (to me), how strongly do you personally believe that a) you are a simulation and that b) all entities on Earth are full-featured simulations as well?
Conditioning on (b) being true, how long ago (in subjective time) do you think our simulation started, and how many times do you believe it has (or will be) replicated?
If I was to quantify your ‘very little’ I’d guess you mean say < 1% observational overlap.
Let’s look at the rough storage cost first. Ignoring variable data priority through selective attention for the moment, the data resolution needs for a simulated earth can be related to photons incident on the retina and decreases with an inverse square law from the observer.
We can make a 2D simplification and use Google Earth as an example. If there was just one ‘real’ observer, you’d need full data fidelity for the surface area that observer would experience up close during his/her lifetime, and this cost dominates. Let’s say that’s S, S ~ 100 km^2.
Simulating an entire planet, the data cost is roughly fixed or capped—at 5x10^8 km^2.
So in this model simulating an entire earth with 5 billion people will have a base cost of 5x10^8 km^2, and simulating 5 billion worlds separately will have a cost of 5x10^9 * S.
So unless S is pathetically small (actually less than human visual distance), this implies a large extra cost to the solipsist approach. From my rough estimate of S the solipsist approach is 1,000 times more expensive. This also assumes that humans are randomly distributed, which of course is unrealistic. In reality human populations are tightly clustered which further increases the relative gain of shared simulation.
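The storage comparison above reduces to one ratio, sketched below. The inputs are the comment's own figures (5 billion observers, S of about 100 km^2, a planetary surface of 5x10^8 km^2); the function names are made up for illustration, and the model ignores population clustering just as the simplified argument does.

```python
EARTH_SURFACE_KM2 = 5e8   # capped data cost of one shared planet, per the comment


def solipsist_to_shared_ratio(n_observers: float,
                              area_per_observer_km2: float,
                              shared_area_km2: float = EARTH_SURFACE_KM2) -> float:
    """Cost of n separate single-observer worlds relative to one shared world."""
    return n_observers * area_per_observer_km2 / shared_area_km2


def break_even_area_km2(n_observers: float,
                        shared_area_km2: float = EARTH_SURFACE_KM2) -> float:
    """Per-observer area below which separate worlds would be cheaper."""
    return shared_area_km2 / n_observers


if __name__ == "__main__":
    ratio = solipsist_to_shared_ratio(5e9, 100.0)
    print(f"solipsist / shared cost: {ratio:.0f}x")
    print(f"break-even S: {break_even_area_km2(5e9):.2f} km^2")
```

The break-even area is the value of S at which the two approaches cost the same under this model; anything larger favors the shared simulation.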
Evil?
Why?
I’m not sure what you mean by this. Does all of the circuitry of the brain perform computation? Over time, yes. The most efficient brain simulations will of course be emulations—circuits that are very similar to the brain but built on much smaller scales on a new substrate.
My main reference for the ultimate limits is Seth Lloyd’s “Ultimate Physical Limits to Computation”. The Singularity is Near discusses much of this as well of course (but Kurzweil mainly uses the more misleading ops per second, which is much less well defined).
Biological circuits switch at 10^3 to 10^4 bit flips/second. Our computers went from around that speed in WWII to the current speed plateau of around 10^9 bit flips/second reached early this century. The theoretical limit for regular molecular matter is around 10^15 bit flips/second. (A black hole could reach a much much higher speed limit, as discussed in Lloyd’s paper). There are experimental circuits that currently approach 10^12 bit flips/second.
In terms of density, we went from about 1 bit / kg around WWII to roughly 10^13 bits / kg today. The brain is about 10^15 bits / kg, so we will soon surpass it in circuit density. The juncture we are approaching (brain density) is about half-way to the maximum of 10^30 bits/kg. This has been analyzed extensively in the hardware community and it looks like we will approach these limits as well sometime this century. It is entirely practical to store 1 bit (or more) per molecule.
A and B are closely correlated. It’s difficult to quantify my belief in A, but it’s probably greater than 50%.
I’ve thought a little about your last question but I don’t yet even see a route to estimating it. Such questions will probably require a more advanced understanding of simulation.
I feel like this would make you a terrible video game designer :-P. Why should we bother simulating things in full fidelity, all the time, just because they will eventually be seen? The only full-fidelity simulation we should need is the stuff being directly examined. Much rougher algorithms should suffice for things not being directly observed.
Heh, my ability to argue is getting worse and worse. You sure you want to continue this thread? What I meant to say (and entirely failed) is that there is an infrastructure cost; we can’t expect to compute with every particle, because we need lots of particles to make sure the others stay confined, get instructions, etc. Basically, not all matter can be a bit at the same time.
Again, infrastructure costs. Can you source this (also Lloyd?)?
For the rest, I’m aware of and don’t dispute the speeds and densities you mention. What I’m skeptical of is that we have evidence that they are practicable; this was what I was looking for. I don’t count previous success of Moore’s Law as strong evidence that we will continue getting better at computation until we hit physical limits. I’m particularly skeptical about how well we will ever do on power consumption (partially because it’s such a hard problem for us now).
The idea that I did not have to live this life, that some entity or civilization has created the environment in which I’ve experienced so much misery, and that they will do it again and again makes me shake with impotent rage. I cannot express how much I would rather have never existed. The fact that they would do this and so much worse (because my life is an astoundingly far cry from the worst that people deal with), again, and again, to trillions upon trillions of living, feeling beings...I cannot express my sorrow. It literally brings me to tears.
This is not sadism; or it would be far worse. It is rather a total neglect of care, a relegation of my values in place of historical interest. However, I still consider this evil in the highest degree.
I do not reject the existence of evil, and therefore this provides no evidence against the hypothesis that I am simulated. However, if I believe that I have a high chance of being simulated, I should do all that I can to prevent such an entity from ever coming to exist with such power, on the off chance that I am one who is not simulated, and able to prevent such evil from unfolding.
Of course you’re on the right track here—and I discussed spatially variant fidelity simulation earlier. The rough surface area metric was a simplification of storage/data generation costs, which is a separate issue than computational cost.
If you want the most bare-bones efficient simulation, I imagine a reverse hierarchical induction approach that generates the reality directly from the belief network of the simulated observer, a technique modeled directly on human dreaming.
However, this is only most useful if the goal is to just generate an interesting reality. If the goal is to regenerate an entire historical period accurately, you can’t start with the simulated observers, since they are greater unknowns than the environment itself.
The solipsist issue may not have discernible consequences, but overall the computational scaling is sublinear for emulating more humans in a world and probably significant because of the large causal overlap of human minds via language.
Physical Limits of Computation
The intellectual work required to show an ultimate theoretical limit is tractable, but showing that achieving said limit is impossible in practice is very difficult.
I’m pretty sure we won’t actually hit the physical limits exactly, it’s just a question of how close. If you look at our historical progress in speed and density to date, it suggests that we will probably go most of the way.
Another simple assessment related to the doomsday argument: I don’t know how long this Moore’s Law progression will carry on, but it’s lasted for 50 years now, so I give reasonable odds that it will last another 50. Simple, but surprisingly better than nothing.
A more powerful line of reasoning perhaps is this: as long as there is an economic incentive to continue Moore’s Law and room to push against the physical limits, ceteris paribus, we will make some progress and push towards those limits. Thus, eventually we will reach them.
Power density depends on clock rate, which has plateaued. Power efficiency, in terms of ops/joule, increases directly with transistor density.
This is somewhat concerning, and I believe, atypical. Not existing is perhaps the worst thing I can possibly imagine, other than infinite torture.
I’m not sure if ‘historical interest’ is quite the right word. Historical recreation or resurrection might be more accurate.
A paradise designed to maximally satisfy current human values and eliminate suffering is not a world which could possibly create or resurrect us.
You literally couldn’t have grown up in that world; the entire idea is a non sequitur. Your mind’s state is a causal chain rooted in the gritty reality of this world with all of its suffering.
Imagining that your creator could have assigned you to a different world is like imagining you could have grown up with different parents. You couldn’t have. That would be somebody else completely.
Of course, if said creator exists, and if said creator values what you value in the way you value it (dubious) it could whisk you away to paradise tomorrow.
But I wouldn’t count on that; perhaps said creator is still working on you, or doesn’t think paradise is a useful place for you, or couldn’t care less.
In the face of such uncertainty, we can only task ourselves with building paradise.
I believe we’re arguing along two paths here, and it is getting muddled. Applying to both, I think one can maintain the world-per-person sim much more cheaply than you originally suggested long before one hits the spot where the sim is no longer accurate to the world except where it intersects with the observer’s attention.
Second, from my perspective you’re begging the question, since I was talking about a variety of reasons for simulation and arguing that simulating a single entity seems as reasonable as many—but you seem only to be concerned with historical recreation, in which case it seems obvious to me that a large group of minds is necessary. If we’re only talking about that case, the arguments along this line about the per-mind cost just aren’t very relevant.
I have a 404 on your link, I’ll try later.
Interesting, I haven’t heard that argument applied to Moore’s Law. Question: you arrive at a train crossing (there are no other cars on the road), and just as you get there, a train begins to cross before you can. Something goes wrong, and the train stops, and backs up, and goes forward, and stops again, and keeps doing this. (This actually happened to me). 10 minutes later, should you expect that you have around 10 minutes left? After those are passed, should your new expectation be that you have around 20 minutes left?
The answer is possibly yes. I think better results would be obtained by using a Jeffreys Prior. However, I’ve talked to a few statisticians about this problem, and no one has given me a clear answer. I don’t think they’re used to working with so little data.
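One way to make the train question concrete is the delta-t form of the argument. The sketch below assumes that the moment of observation is uniformly distributed over the total duration of the event, which is an assumption not stated in the thread. Under that assumption, the chance that the remaining time exceeds x after an elapsed time t is t / (t + x). A scale-invariant 1/T prior on the total duration, the usual Jeffreys prior for a scale parameter, gives the same expression. The function names are hypothetical.

```python
def prob_remaining_exceeds(elapsed: float, remaining: float) -> float:
    """P(remaining time > `remaining`) given `elapsed`, under the uniform assumption."""
    return elapsed / (elapsed + remaining)


def remaining_quantile(elapsed: float, q: float) -> float:
    """Remaining time x such that P(remaining <= x) = q, for 0 <= q < 1.

    Solves x / (elapsed + x) = q for x.
    """
    if not 0.0 <= q < 1.0:
        raise ValueError("q must be in [0, 1)")
    return elapsed * q / (1.0 - q)


if __name__ == "__main__":
    for waited in (10.0, 20.0):
        median = remaining_quantile(waited, 0.5)
        low = remaining_quantile(waited, 0.25)
        high = remaining_quantile(waited, 0.75)
        print(f"waited {waited:.0f} min: median remaining {median:.0f} min, "
              f"middle 50% from {low:.1f} to {high:.1f} min")
```

Under this model the median remaining wait equals the time already waited, so it does grow from 10 to 20 minutes. The distribution has a heavy tail, so the median is a more useful summary than the mean.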
Revise to say “and room to push against the practicable limits” and you will see where my argument lies despite my general agreement with this statement.
To my knowledge, this is incorrect. Increases in transistor density have dramatically increased circuit leakage (because of bumping into quantum tunneling), requiring more power per transistor in order to accurately distinguish one path from another. I saw a roundtable about proposed techniques for increasing processor efficiency. None of the attendees objected to the introduction, which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density, and would render all modern circuit designs inoperable if they were to be logically extended without addressing the problem of quantum leakage.
If you didn’t exist in the first place, you wouldn’t care. Do you think you’ve done so much good for the world that your absence could be “the worst thing you can possibly imagine, other than infinite torture”?
Regardless, I’m quite atypical in this regard, but not unique.
And wouldn’t that be so much better.
You propose that not existing would be a terrible evil. But how much better, for all the trillions upon trillions you’re proposing must suffer for the creator’s whims, would it be to have that computational substrate be used to host entities that have amazingly positive, productive, maximally Fun lives? I know I couldn’t have existed in a paradise, but if I’m a sim, there are cycles that could be used for paradise that have been abandoned to create misery and strife.
Again, I think that this may be the world we really are in. I just can’t call it a moral one.
Historical recreation currently seems to be the best rationale for a superintelligence to simulate this timeslice, although there are probably other motivations as well.
If that was actually the case, then there would be no point to moving to a new technology node!
Yes, leakage is a problem at the new tech nodes, but of course power per transistor cannot possibly be increasing. I think you mean power per surface area has increased.
Shrinking a circuit by half in each dimension makes the wires thinner, shorter and less resistant, decreasing power use per transistor just as you’d think. Leakage makes this decrease somewhat less than the shrinkage rate, but it doesn’t reverse the entire trend.
There are also other design trends that can compensate and overpower this to an extent, which is why we have a plethora of power efficient circuits in the modern handheld market.
“which mentioned that the increased waste heat from modern circuits was rising at a faster exponential than circuit density”
Do you remember when this was from or have a link? I could see that being true when speeds were also increasing, but that trend has stopped or reversed.
I recall seeing some slides from NVidia where they are claiming their next GPU architecture will cut power use per transistor dramatically as well, at several times the rate of shrinkage.
Even if the goal is maximizing fun, creating some historical sims for the purpose of resurrecting the dead may serve that goal. But I really doubt that current-human-fun-maximization is an evolutionarily stable goal system.
I imagine that future posthuman morality and goals will evolve into something quite different.
Knowledge is a universal feature of intelligence. Even the purely mathematical hypothetical superintelligence AIXI would end up creating tons of historical simulations—and that might be hopelessly brute force, but nonetheless superintelligences with a wide variety of goal systems would find utility in various types of simulation.
Much of the information from the past is probably irretrievably lost to us. If the information input into the simulation were not precisely the same as the actual information from that point in history, the differences would quickly propagate so that the simulation would bear little resemblance to the history. Supposing the individuals in question did have access to all the information they’d need to simulate the past, they’d have no need for the simulation, because they’d already have complete informational access to the past. It suffers similar problems to your sandboxed anthropomorphic AI proposal; provided you have all the resources necessary to actually do it, it ceases to be a good idea.
There are other possible motivations, but it’s not clear that there are any others that are as good or better, so we have little reason to suppose it will ever happen.
This seems to be overly restrictive, but I don’t mind confining the discussion to this hypothesis.
Yes, you are correct.
The roundtable was at SC′08, a while after speeds had stabilized, and since it is a supercomputing conference, the focus was on massively parallel systems. It was part of this.
Without needing to dispute this, I can remain exceptionally upset that whatever their future morality is, it is blind to suffering and willing to create innumerable beings that will suffer in order to gain historical knowledge. Does this really not bother you in the slightest?
ETA: still 404
While the leakage issue is important and I want to read a little more about this reference, I don’t think that any single such current technical issue is nearly sufficient to change the general analysis. There have always been major issues on the horizon, the question is more of the increase in engineering difficulty as we progress vs the increase in our effective intelligence and simulation capacity.
In the specific case of leakage, even if it is a problem that persists far into the future, it just slightly lowers the growth exponent as we just somewhat lower the clock speeds. And even if leakage can never be fully prevented, eventually it itself can probably be exploited for computation.
As a child I liked McDonald’s, bread, plain pizza and nothing more; all other foods were poisonous. I was convinced that my parents’ denial of my right to eat these wonderful foods, condemning me to terrible suffering as a result, was a sure sign of their utter lack of goodness.
Imagine if I could go back and fulfill that child’s wish to reduce its suffering. It would never then evolve into anything like my current self, and in fact may evolve into something that would suffer more or at the very least wish that it could be me.
Imagine if we could go back in time and alter our primate ancestors to reduce their suffering. The vast majority of such naive interventions would cripple their fitness and wipe out the lineage. There is probably a tiny set of sophisticated interventions that could simultaneously eliminate suffering and improve fitness, but these altered creatures would not develop into humans.
Our current existence is completely contingent on a great evolutionary epic of suffering on an astronomical scale. But suffering itself is just one little component of that vast mechanism, and forms no basis from which to judge the totality.
You made the general point earlier, which I very much agree with, about opportunity cost. Simulating humanity’s current time-line has an opportunity cost in the form of some paradise that could exist in its place. You seem to think that the paradise is clearly better, and I agree: from our current moral perspective.
At the end of the day morality is governed by evolution. There is an entire landscape of paradises that could exist; the question is what fitness advantage they provide their creator. The more they diverge from reality, the less utility they have in advancing knowledge of reality towards closure.
It looks like earth will evolve into a vast planetary hierarchical superintelligence, but ultimately it will probably be just one of many, and still subject to evolutionary pressure.
I disagree; I think that problems like this, unresolved, may or may not decrease the base of our exponent, but will cap its growth earlier.
On this point, we disagree, and I may be on the unpopular side of this disagreement. I don’t see how past increases that have required technological revolutions can be considered more than weak evidence for future technological revolutions. I actually think it quite likely that increase in computational power per Joule will bottom out in ten to twenty years. I wouldn’t be too surprised if exponential increase lasts thirty years, but forty seems unlikely, and fifty even less likely.
I don’t care. We aren’t talking about destroying the future of intelligence by going back in time. We’re talking about repeating history umpteen many times, creating suffering anew each time. It sounds to me like you are insisting that this suffering is worthwhile, even if the result of all of it will never be more than a data point in a historian’s database.
We live in a heartbreaking world. Under the assumption that we are not in a simulation, we can recognize facts like ‘suffering is decreasing over time’ and realize that it is our job to work to aid this progress. Under the assumption that we are in a simulation, we know that the capacity for this progress is already fully complete, and the agents who control it simply don’t care. If we are being simulated, it means that one or more entities have chosen to create unimaginable quantities of suffering for their own purposes—to your stated belief, for historical knowledge.
Your McDonald’s example doesn’t address this in the slightest. You were already a living, thinking being, and your parents took care of you in the right way in an attempt to make your future life better. They couldn’t have chosen before you were born to instead create someone who would be happier, smarter, wiser, and better in every way. If they could have, wouldn’t it be upsetting that they chose not to?
Given the choice between creating agents that have to endure suffering for generations upon generations, and creating agents that will have much more positive, productive lives, why are you arguing for the side that chooses the former? Of course the former and latter are entirely different entities, but that serves as no argument whatsoever for choosing the former!
A person running such a simulation could create a simulated afterlife, without suffering, where each simulated intelligence would go after dying in the simulated universe. It’s like a nice version of Pascal’s Wager, since there’s no wagering involved. Such an afterlife wouldn’t last infinitely long, but it could easily be made long enough to outweigh any suffering in the simulated universe.
Or you could skip the part with all the suffering. That would be a lot easier.
In general, I agree. I just wanted to offer a more creative alternative for someone truly dedicated to operating such a simulation.
So far the only person who seems dedicated to making such a simulation is jacob cannell, and he already seems to be having enough trouble separating the idea from cached theistic assumptions.
I don’t think that’s how it works.
How much future happiness would you need in order to choose to endure 50 years of torture?
That depends if happiness without torture is an option. The options are better/worse, not good/bad.
The simulated afterlife wouldn’t need to outweigh the suffering in the first universe according to our value system, only according to the value system of the aliens who set up the simulation.
Technology doesn’t really advance through ‘revolutions’, it evolves. Some aspects of that evolution appear to be rather remarkably predictable.
That aside, the current predictions do posit a slow-down around 2020 for the general lithography process, but there are plenty of labs researching alternatives. As the slow-down approaches, their funding and progress will accelerate.
But there is a much more fundamental and important point to consider, which is that circuit shrinkage is just one dimension of improvement amongst several. As that route of improvement slows down, other routes will become more profitable.
For example, for AGI algorithms, current general purpose CPUs are inefficient by a factor of perhaps around 10^4. That is a decade of exponential gain right there just from architectural optimization. This route (neuromorphic hardware and its ilk) currently receives a tiny slice of the research budget, but this will accelerate as AGI advances and would accelerate even more if the primary route of improvement slowed.
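As a rough check on the arithmetic, the number of doublings contained in a 10^4 gap can be worked out directly. The doubling periods below are assumptions picked for illustration, not figures from this thread; whether 10^4 amounts to about a decade depends on which period one believes.

```python
import math

def doublings(factor):
    """Number of successive doublings needed to reach a given improvement factor."""
    return math.log2(factor)

def years_to_reach(factor, doubling_period_years):
    """Years of steady exponential growth needed to gain `factor`,
    assuming one doubling every `doubling_period_years` (an assumption)."""
    return doublings(factor) * doubling_period_years

if __name__ == "__main__":
    gap = 10**4  # the estimated CPU inefficiency factor quoted above
    print(f"doublings in 10^4: {doublings(gap):.1f}")
    for period in (0.75, 1.0, 1.5, 2.0):  # hypothetical doubling periods, in years
        print(f"doubling every {period} yr -> about {years_to_reach(gap, period):.0f} years")
```

With a doubling period of around nine months to a year the gap comes out near a decade; with slower doubling it stretches well beyond that.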
Another route of improvement is exponentially reducing manufacturing cost. The bulk of the price of high-end processors pays for the vast amortized R&D cost of developing the manufacturing node within the timeframe that the node is economical. Refined silicon is cheap and getting cheaper, research is expensive. The per transistor cost of new high-end circuitry on the latest nodes for a CPU or GPU is 100 times more expensive than the per transistor cost of bulk circuitry produced on slightly older nodes.
So if Moore’s law stopped today, the cost of circuitry would still decay down to the bulk cost. This is particularly relevant to neuromorphic AGI designs as they can use a mass of cheap repetitive circuitry, just like the brain. So we have many other factors that will kick in even as Moore’s law slows.
I suspect that we will hit a slow ramping wall around or by 2020, but these other factors will kick in and human-level AGI will ramp up, and then this new population and speed explosion will drive the next S-curve using a largely new and vastly more complex process (such as molecular nano-tech) that is well beyond our capability or understanding.
It’s more or less equivalent from the perspective of a historical sim. A historical sim is a recreation of some branch of the multiverse near your own incomplete history that you then run forward to meet your present.
My existence is fully contingent on the existence of my ancestors in all of their suffering glory. So from my perspective, yes their suffering was absolutely worthwhile, even if it wasn’t from their perspective.
Likewise, I think that it is our noble duty to solve AI, morality, and control a Singularity in order to eliminate suffering and live in paradise.
I also understand that after doing that we will over time evolve into beings quite unlike what we are now and eventually look back at our prior suffering and view it from an unimaginably different perspective, just as my earlier McDonald’s-loving child-self evolved into a being with a completely different view of its prior suffering.
It was right from both their and my current perspective, it was absolutely wrong from my perspective at the time.
Of course! Just as we should create something better than ourselves. But ‘better’ is relative to a particular subjective utility function.
I understand that my current utility function works well now, that it is poorly tuned to evaluate the well-being of bacteria, just as poorly tuned to evaluate the well-being of future posthuman godlings, and most importantly—my utility function or morality will improve over time.
Imagine you are the creator. How do you define ‘positive’ or ‘productive’? From your perspective, or theirs?
There are an infinite variety of uninteresting paradises. In some, virtual humans do nothing but experience continuous rapturous bliss well outside the range of current drug-induced euphoria. There are complex agents that just set their reward functions to infinity and loop.
There is also a spectrum of very interesting paradises, all having the key differentiator that they evolve. I suspect that future godlings will devote most of their resources to creating these paradises.
I also suspect that evolution may operate again at an intergalactic or higher level, ensuring that paradises and all simulations somehow must pay for themselves.
At some point our descendants will either discover for certain they are in a sim and integrate up a level, or they will approach local closure and perhaps discover an intergalactic community. At that point we may have to compete with other singularity-civilizations, and we may have the opportunity to historically intervene on pre-singularity planets we encounter. We’d probably want to simulate any interventions before proceeding, don’t you think?
A historical recreation can develop into a new worldline with its own set of branching paradises that increase overall variation in a blossoming metaverse.
If you could create a new big bang, an entire new singularity and new universe, would you?
You seem to be arguing that you would not because it would include humans who suffer. I think this ends up being equivalent to arguing the universe should not exist.
If we had enough information to create an entire constructed reality of them in simulation, we’d have much more than we needed to just go ahead and intervene.
Some people would argue that it shouldn’t (this is an extreme of negative utilitarianism.) However, since we’re in no position to decide whether the universe gets to exist or not, the dispute is fairly irrelevant. If we’re in a position to decide between creating a universe like ours, creating one that’s much better, with more happiness and productivity and less suffering, and not creating one at all, though, I would have an extremely poor regard for the morality of someone who chose the first.
If my descendants think that all my suffering was worthwhile so that they could be born instead of someone else, then you know what? Fuck them. I certainly have a higher regard for my own ancestors. If they could have been happier, and given rise to a world as good as or better than this one, then who am I to argue that they should have been unhappy so I could be born instead? If, as you point out
then why not skip the historical recreation and go straight to simulating the paradises?
I’m curious how you’ve reached this conclusion given how little we know about what AGI algorithms would look like.
The particular type of algorithm is actually not that important. There is a general speedup in moving from a general CPU-like architecture to a specialized ASIC—once you are willing to settle on the algorithms involved.
There is another significant speedup moving into analog computation.
Also, we know enough about the entire space of AI sub-problems to get a general idea of what AGI algorithms look like and the types of computations they need. Naturally the ideal hardware ends up looking much more like the brain than current von Neumann machines, because the brain evolved to solve AI problems in an energy efficient manner.
If you know you are working in the space of probabilistic/Bayesian-like networks, exact digital computations are extremely wasteful. Using tens or hundreds of thousands of transistors to do an exact digital multiply is useful for scientific or financial calculations, but it’s a pointless waste when the algorithm just needs to do a vast number of probabilistic weighted summations, for example.
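To make the point about wasted precision concrete, here is a minimal sketch comparing an exact weighted sum with one where every multiply is perturbed by noise. The Gaussian noise model, the 5% noise level and the uniform random data are all assumptions for illustration; they are a crude stand-in for imprecise analog hardware, not a model of any real device.

```python
import random

def exact_weighted_sum(weights, inputs):
    """Reference result using ordinary floating-point multiplies."""
    return sum(w * x for w, x in zip(weights, inputs))

def noisy_weighted_sum(weights, inputs, rel_noise, rng):
    """Each product is perturbed by independent Gaussian noise,
    a stand-in for an imprecise multiplier (an assumption)."""
    return sum(w * x * (1.0 + rng.gauss(0.0, rel_noise))
               for w, x in zip(weights, inputs))

def relative_error(n_terms=10_000, rel_noise=0.05, seed=0):
    """Relative error of the noisy sum against the exact sum."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_terms)]
    inputs = [rng.random() for _ in range(n_terms)]
    exact = exact_weighted_sum(weights, inputs)
    noisy = noisy_weighted_sum(weights, inputs, rel_noise, rng)
    return abs(noisy - exact) / exact

if __name__ == "__main__":
    print(f"relative error of the noisy sum: {relative_error():.5f}")
```

Because the per-term errors are independent they largely cancel, so the error in the total is far smaller than the error in any single multiply. That only holds under the independence assumption made here.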
Cite for last paragraph about analog probability: http://phm.cba.mit.edu/theses/03.07.vigoda.pdf
Thanks. Hefty read, but this one paragraph is worth quoting:
I had forgotten that term, statistical inference algorithms; need to remember that.
Well, there’s also another quote worth quoting, and in fact the quote that is in my Mnemosyne database and which enabled me to look that thesis up so fast...
This is true in general but this particular statement appears out of date:
“Alternative computing architectures, such as parallel digital computers have not tended to be commercially viable”
That was true perhaps circa 2000, but we hit a speed/heat wall and since then everything has been going parallel.
You may see something similar happen eventually with analog computing once the market for statistical inference computation is large enough and/or we approach other constraints similar to the speed/heat wall.
Ok. But this prevents you from directly improving your algorithms. And if the learning mechanisms are to be highly flexible (like say those of a human brain) then the underlying algorithms may need to modify a lot even to just approximate being an intelligent entity. I do agree that given a fixed algorithm this would plausibly lead to some speed-up.
A lot of things can’t be put into analog. For example, what if you need to factor large numbers? And making analog and digital stuff interact is difficult.
This doesn’t follow. The brain evolved through a long path of natural selection. It isn’t at all obvious that the brain is even highly efficient at solving AI-type problems, especially given that humans have only needed to solve much of what we consider standard problems for a very short span of evolutionary history (and note that general mammal brain architecture looks very similar to ours).
EDIT: why the downvotes?
Yes—which is part of the reason there is a big market for CPUs.
Not necessarily. For example, the cortical circuit in the brain can be reduced to an algorithm which would include the learning mechanism built in. The learning can modify the network structure to a degree but largely adjusts synaptic weights. That can be described as (is equivalent to) a single fixed algorithm. That algorithm in turn can be encoded into an efficient circuit. The circuit would learn just as the brain does, no algorithmic changes ever needed past that point, as the self-modification is built into the algorithm.
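A toy illustration of "a single fixed algorithm with the learning built in": the code below never changes, only the weights it stores do. A perceptron is used purely as a stand-in and is not meant as a model of the cortical circuit; the AND task and the training settings are likewise assumptions.

```python
def predict(weights, bias, x):
    """Fixed decision rule: threshold a weighted sum."""
    s = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if s > 0 else 0

def train(samples, epochs=20, lr=1.0):
    """Fixed learning rule: only the stored weights and bias change."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Hypothetical training task: logical AND of two binary inputs.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    w, b = train(AND_SAMPLES)
    for x, target in AND_SAMPLES:
        print(x, "->", predict(w, b, x), "(target", target, ")")
```

The same two functions could be frozen into hardware; what the system has learned lives entirely in `weights` and `bias`.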
A modern CPU is a jack-of-all-trades that is designed to do many things, most of which have little or nothing to do with the computational needs of AGI.
If the AGI needs to factor large numbers, it can just use an attached CPU. Factoring large numbers is easy compared to reading this sentence about factoring large numbers and understanding what that actually means.
The brain has roughly 10^15 noisy synapses that can switch around 10^3 times per second and store perhaps a bit each as well. (computation and memory integrated)
My computer has about 10^9 exact digital transistors in its CPU & GPU that can switch around 10^9 times per second. It has around the same amount of separate memory and around 10^13 bits of much slower disk storage.
These systems have similar peak throughputs of about 10^18 bits/second, but they are specialized for very different types of computational problems. The brain is very slow but massively wide, the computer is very narrow but massively fast.
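The 10^18 figure is just element count times switching rate, using the order-of-magnitude estimates quoted above. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def peak_throughput(elements, switches_per_second):
    """Upper bound on bit operations per second: every element switching at full rate."""
    return elements * switches_per_second

# Order-of-magnitude estimates taken from the comment above.
brain = peak_throughput(elements=10**15, switches_per_second=10**3)
computer = peak_throughput(elements=10**9, switches_per_second=10**9)

if __name__ == "__main__":
    print(f"brain:    {brain:.0e} bit-ops/s (wide and slow)")
    print(f"computer: {computer:.0e} bit-ops/s (narrow and fast)")
```

The two products match only because width and speed trade off almost exactly in these estimates; the numbers themselves are rough.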
The brain is highly specialized and extremely adept at doing typical AGI stuff—vision, pattern recognition, inference, and so on—problems that are suited to massively wide but slow processing with huge memory demands.
Our computers are specialized and extremely adept at doing the whole spectrum of computational problems brains suck at—problems that involve long complex chains of exact computations, problems that require massive speed and precision but less bulk processing and memory.
So to me, yes it’s obvious that the brain is highly efficient at doing AGI-type stuff, almost because that’s how we define AGI-type stuff: it’s all the stuff that brains are currently much better than computers at!
This limits the amount of modification one can do. Moreover, the more flexible your algorithm the less you gain from hard-wiring it.
No, we don’t know that the brain is “extremely adept” at these things. We just know that it is better than anything else that we know of. That’s not at all the same thing. The brain’s architecture is formed by a succession of modifications to much simpler entities. The successive, blind modification has been stuck with all sorts of holdovers from our early chordate ancestors and a lot from our more recent ancestors.
Easy is a misleading term in this context. I certainly can’t factor a forty digit number but for a computer that’s trivial. Moreover, some operations are only difficult because we don’t know an efficient algorithm. In any event, if your speedup is only occurring for the narrow set of tasks which humans can do decently such as vision, then you aren’t going to get a very impressive AGI. The ability to do face recognition in only a tiny fraction of the time it would take a person is not an impressive ability.
Limits it compared to what? Every circuit is equivalent to a program. The circuit of a general processor is equivalent to a program which simulates another circuit: the program which it keeps in memory.
Current von Neumann processors are not the only circuits which have this simulation-flexibility. The brain has similar flexibility using very different mechanisms.
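The circuit/program equivalence can be sketched in a few lines: one function has XOR baked into its structure, while the other is a general simulator that reads a circuit description from memory. The netlist format and gate names are made up for this example.

```python
def hardwired_xor(a, b):
    """A fixed 'circuit': the function is baked into the structure."""
    return (a or b) and not (a and b)

# A circuit stored as data: (output_name, gate, first_input, second_input).
XOR_NETLIST = [
    ("or_ab", "OR", "a", "b"),
    ("and_ab", "AND", "a", "b"),
    ("nand_ab", "NOT", "and_ab", None),
    ("out", "AND", "or_ab", "nand_ab"),
]

GATES = {
    "AND": lambda x, y: x and y,
    "OR": lambda x, y: x or y,
    "NOT": lambda x, _unused: not x,
}

def simulate(netlist, inputs):
    """A general 'processor': evaluates whatever circuit it is handed."""
    signals = dict(inputs)
    for name, gate, in_a, in_b in netlist:
        a = signals[in_a]
        b = signals[in_b] if in_b is not None else None
        signals[name] = GATES[gate](a, b)
    return signals["out"]

if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, hardwired_xor(a, b), simulate(XOR_NETLIST, {"a": a, "b": b}))
```

The simulator pays for its flexibility with a dictionary lookup and a dispatch per gate, which is the overhead the hard-wired version avoids.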
Finally, even if we later find out that lo and behold, the inference algorithm we hard-coded into our AGI circuits was actually not so great, and somebody comes along with a much better one . . . that is still not an argument for simulating the algorithm in software.
Not at all true. Statistical inference algorithms, a class that includes Bayesian networks and the cortex, are both extremely flexible and benefit greatly from ‘hard-wiring’.
This is like saying we don’t know that Usain Bolt is extremely adept at running, he’s just better than anything else that we know of. The latter sentence in each case of course is true, but it doesn’t impinge on the former.
But my larger point was that the brain and current computers occupy two very different regions in the space of possible circuit designs, and are rather clearly optimized for a different slice over the space of computational problems.
There are some routes that we can obviously improve on the brain at the hardware level. Electronic circuits are orders of magnitude faster, and eventually we can make them much denser and thus much more massive.
However, it is much more of an open question in computer science if we will ever be able to greatly improve on the statistical inference algorithm used in the cortex. It is quite possible that evolution had enough time to solve that problem completely, or at least reach some nearly global maximum.
Yes—this is an excellent strategy for solving complex optimization problems.
Yes, and on second thought, largely mistaken. To be more precise we should speak of computational complexity and bitops. The best known factorization algorithms have running time exponential in the number of input bits. That makes them ‘hard’ in the scalability sense. But factoring small primes is still easy in the absolute cost sense.
Factoring is also easy in the algorithmic sense, as the best algorithms are very simple and short. Physics is hard in the algorithmic sense, AGI seems to be quite hard, etc.
The cortex doesn’t have a specialized vision circuit—there appears to be just one general purpose circuit it uses for everything. The visual regions become visual regions on account of . . processing visual input data.
AGI hardware could take advantage of specialized statistical inference circuitry and still be highly general.
I’m having a hard time understanding what you really mean by saying “the narrow set of tasks which humans can do decently such as vision”. What about quantum mechanics, computer science, mathematics, game design, poetry, economics, sports, art, or comedy? One could probably fill a book with the narrow set of tasks that humans can do decently. Of course, that other section of the bookstore—filled with books about things computers can do decently, is growing at an exciting pace.
I’m not sure what you mean by this or how it relates. If you could do face recognition that fast . . it’s not impressive?
The main computational cost of every main competing AGI route I’ve seen involves some sort of deep statistical inference, and this amounts to a large matrix multiplication possibly with some non-linear stepping or a normalization. Neural nets, Bayesian nets, whatever: if you look at the mix of required instructions, it amounts to a massive repetition of simple operations that are well suited to hardware optimization.
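A minimal sketch of the operation mix described above: a matrix-vector multiply, a non-linear step, then a normalization. The sigmoid and the sum-to-one normalization are arbitrary choices made for illustration, and the weights are made-up numbers.

```python
import math

def matvec(matrix, vector):
    """Weighted sums: one dot product per output unit."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(z):
    """A common squashing non-linearity (chosen here as an assumption)."""
    return 1.0 / (1.0 + math.exp(-z))

def normalize(values):
    """Rescale positive values so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def inference_step(weights, activations):
    """One layer: multiply, apply the non-linearity, normalize."""
    return normalize([sigmoid(z) for z in matvec(weights, activations)])

if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical 3x2 weight matrix
    print(inference_step(weights, [0.5, -0.5]))
```

Nearly all of the work is in `matvec`, the same multiply-accumulate repeated many times, which is why this kind of workload maps well onto wide, simple hardware.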
If we have many generations of rapid improvement of the algorithms this will be much easier if one doesn’t need to make new hardware each time.
The general trend should still occur this way. I’m also not sure that you can reach that conclusion about the cortex given that we don’t have a very good understanding of how the brain’s algorithms function.
That seems plausibly correct but we don’t actually know that. Given how much humans rely on vision it isn’t at all implausible that there have been subtle genetic tweaks that make our visual regions more effective in processing visual data (I don’t know the literature in this area at all).
Incorrect, the best factoring algorithms are subexponential. See for example the quadratic sieve and the number field sieve both of which have subexponential running time. This has been true since at least the early 1980s (there are other now obsolete algorithms that were around before then that may have had slightly subexponential running time. I don’t know enough about them in detail to comment.)
Factoring primes is always easy. For any prime p, it has no non-trivial factorizations. You seem to be confusing factorization with primality testing. The second is much easier than the first; we’ve had Agrawal’s algorithm which is provably polynomial time for about a decade. Prior to that we had a lot of efficient tests that were empirically faster than our best factorization procedures. We can determine the primality of numbers much larger than those we can factor.
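The gap between the two problems is easy to see in code. The sketch below uses Miller-Rabin with a fixed set of bases (deterministic for inputs below 2^64) rather than the Agrawal algorithm mentioned above, since it is far shorter; trial division stands in for factoring and is of course nowhere near the best known factoring method.

```python
def is_prime(n):
    """Miller-Rabin with a fixed base set; deterministic for n < 2**64."""
    if n < 2:
        return False
    small_primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)
    for p in small_primes:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for a in small_primes:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # base a witnesses that n is composite
    return True

def trial_division(n):
    """Prime factors of n (n >= 2); running time grows like sqrt(n)."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.append(p)
            n //= p
        p += 1
    if n > 1:
        factors.append(n)
    return factors

if __name__ == "__main__":
    print(is_prime(2**61 - 1))   # a 61-bit Mersenne number, tested almost instantly
    print(trial_division(91))
```

Testing a 61-bit number takes a handful of modular exponentiations, while trial division on a product of two primes of that size would need on the order of 2^60 steps.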
Really? The general number field sieve is simple and short? Have you tried to understand it or write an implementation? Simple and short compared to what exactly?
There are some tasks where we can argue that humans are doing a good job by comparison to others in the animal kingdom. Vision is a good example of this (we have some of the best vision of any mammal.) The rest are tasks which no other entities can do very well, and we don’t have any good reason to think humans are anywhere near good at them in an absolute sense. Note also that most humans can’t do math very well (Apparently 10% or so of my calculus students right now can’t divide one fraction by another). And the vast majority of poetry is just awful. It isn’t even obvious to me that the “good” poetry isn’t labeled that way in part simply from social pressure.
A lot of the tasks that humans have specialized in are not generally bottlenecks for useful computation. Improved facial recognition isn’t going to help much with most of the interesting stuff, like recursive self-improvement, constructing new algorithms, making molecular nanotech, finding a theory of everything, figuring out how Fred and George tricked Rita, etc.
This seems to be a good point.
To clarify, subexponential does not mean polynomial, but super-polynomial.
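To see where subexponential sits between the two, one can compare the base-2 logarithm of three cost functions at a few key sizes. The cubic polynomial and the 2^(bits/2) trial-division cost are arbitrary reference points chosen here; the middle function is the usual heuristic estimate for the general number field sieve with its o(1) term dropped, so treat the numbers as rough.

```python
import math

def log2_polynomial(bits, degree=3):
    """log2 of bits**degree, a sample polynomial cost."""
    return degree * math.log2(bits)

def log2_gnfs(bits):
    """log2 of exp(c * (ln N)^(1/3) * (ln ln N)^(2/3)) with c = (64/9)^(1/3),
    the heuristic GNFS cost for N = 2**bits with the o(1) term dropped."""
    ln_n = bits * math.log(2)
    c = (64 / 9) ** (1 / 3)
    return c * ln_n ** (1 / 3) * math.log(ln_n) ** (2 / 3) / math.log(2)

def log2_exponential(bits):
    """log2 of 2**(bits/2), roughly trial division up to sqrt(N)."""
    return bits / 2

if __name__ == "__main__":
    for bits in (512, 1024, 2048):
        print(bits,
              round(log2_polynomial(bits), 1),
              round(log2_gnfs(bits), 1),
              round(log2_exponential(bits), 1))
```

At every size shown the sieve estimate lands well above the polynomial and well below the exponential, which is what super-polynomial but subexponential means in practice.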
(Interestingly, while factoring a given integer is hard, there is a way to get a random integer within [1..N] and its factorization quickly. See Adam Kalai’s paper Generating Random Factored Numbers, Easily (PDF).)
Interesting. I had not seen that paper before. That’s very cute.
This is mostly irrelevant, but I think complexity theorists use a weird definition of exponential according to which GNFS might still be considered exponential; I know when they say “at most exponential” they mean O(e^(n^k)) rather than O(e^n), so it seems plausible that by “at least exponential” they might mean Omega(e^(n^k)) where now k can be less than 1.
EDIT: Nope, I’m wrong about this. That seems kind of inconsistent.
They like keeping things invariant under polynomial transformations of the input, since that has been observed to be a somewhat “natural” class. This is one of the areas where it seems not to quite work.
Hmm, interesting: in the notation that Scott says is standard in complexity theory, my earlier statement that factoring is “subexponential” is wrong even though it is slower growing than exponential. But apparently Greg Kuperberg is perfectly happy labeling something like 2^(n^(1/2)) as subexponential.
Yes, and this tradeoff exists today with some rough mix between general processors and more specialized ASICs.
I think this will hold true for a while, but it is important to point out a few subpoints:
If Moore’s law slows down this will shift the balance farther towards specialized processors.
Even most ‘general’ processors today are actually a mix of CISC and vector processing, with more and more performance coming from the less-general vector portion of the chip.
For most complex real world problems algorithms eventually tend to have much less room for improvement than hardware, even if algorithmic improvements initially dominate. After a while algorithmic improvements end within the best complexity class and then further improvements are just constants and are swamped by hardware improvement.
Modern GPUs for example have 16 or more vector processors for every general logic processor.
The brain is like a very slow processor with massively wide dedicated statistical inference circuitry.
As a result of all this (and the point at the end of my last post) I expect that future AGIs will be built out of a heterogeneous mix of processors but with the bulk being something like a wide-vector processor with a lot of very specialized statistical inference circuitry.
This type of design will still have huge flexibility by having program-ability at the network architecture level—it could for example simulate humanish and various types of mammalian brains as well as a whole range of radically different mind architectures all built out of the same building blocks.
We have pretty good maps of the low-level circuitry in the cortex at this point and it’s clearly built out of a highly repetitive base circuit pattern, similar to how everything is built out of cells at a lower level. I don’t have a single good introductory link, but it’s called the laminar cortical pattern.
Yes, there are slight variations, but slight is the keyword. The cortex is highly general: the ‘visual’ region develops very differently in blind people, for example, creating entirely different audio processing networks much more powerful than what most people have.
The flexibility is remarkable—if you hook up electrodes to the tongue that send a rough visual signal from a camera, in time the cortical regions connected to the tongue start becoming rough visual regions and limited tongue based vision is the result.
I stand corrected on prime factorization—I saw the exp(....) part and assumed exponential before reading into it more.
This is a good point, but note the huge difference between the abilities or efficiency of an entire human mind vs the efficiency of the brain’s architecture or the efficiency of the lower level components from which it is built—such as the laminar cortical circuit.
I think this discussion started concerning your original point:
The cortical algorithm appears to be a pretty powerful and efficient low level building block. In evolutionary terms it has been around for much longer than human brains and naturally we can expect that it is much closer to optimality in the design configuration space in terms of the components it is built from.
As we go up a level to higher level brain architectures that are more recent in evolutionary terms we should expect there to be more room for improvement.
The mammalian cortex is not specialized for particular tasks; this is the primary advantage of its architecture over its predecessors (at the cost of a much larger size than more specialized circuitry).
How do you reconcile this claim with the fact that some people are faceblind from an early age and never develop the ability to recognize faces? This would suggest that there’s at least one aspect of humans that is normally somewhat hard-wired.
I’ve read a great deal about the cortex, and my immediate reaction to your statement was “no, that’s just not how it works”. (strong priors)
About one minute later, on the Wikipedia article on prosopagnosia, I find the first reference to this idea (that of congenital prosopagnosia):
The idea of congenital prosopagnosia appears to be a new theory supported by one researcher and one? study:
The last part about it being “commonly accompanied by other forms of visual agnosia” gives it away—this is not anything close to what you originally thought/claimed, even if this new research is actually correct.
Known cases of true prosopagnosia are caused by brain damage—what this research is describing is probably a disorder of the higher region (V4 I believe) which typically learns to recognize faces and other complex objects.
However, there is an easy way to cause prosopagnosia during development—prevent the creature from ever seeing faces.
I don’t have the link on hand, but there have been experiments in cats where you mess with their vision, by using grating patterns or carefully controlled visual environments, and create cats that literally can’t even see vertical lines.
So even the simplest most basic thing which nature could hard-code—a vertical line feature detector, actually develops from the same extremely flexible general cortical circuit—the same circuit which can learn to represent everything from sounds to quantum mechanics.
Humans can represent a massive number of faces, and in general the brain’s vast information storage capacity over the genome (10^15 ish vs 10^9 ish) more or less requires a generalized learning circuit.
The cortical circuits do basically nothing but fire randomly when you are born—you really are a blank slate in that respect (although obviously the rest of the brain has plenty of genetically fixed functionality).
Of course the arrangement of the brain’s regions with respect to sensory organs and its overall wiring architecture do naturally lead to the familiar specializations of brain regions, but really one should consider this a developmental attractor: information is colonizing each cortex anew, but the similar architecture and similarity of information ensures that two brains end up having largely overlapping colonizations.
There are all sorts of aspects of humans that are normally somewhat—or nearly entirely—hard-wired. The cortex just doesn’t tend to be. Even the parts of the cortex that are similarly specialised in most humans seem to be so due to what they are connected to. (As can be seen by looking at how the atypical cases have adapted differently.) It would surprise me if the inability to recognise faces was caused by a dysfunction in the cortex specifically.
Disclaimer: I disagree with nearly everything else Jacob has said in this thread. This position specifically appears to be well researched.
This is unlikely. We haven’t been selected based on sheer brain power or brain efficiency. Humans have been selected by their ability to reproduce in a complicated environment. Efficient intelligence helps, but there’s selection for a lot of other things, such as good immune systems and decent muscle systems. A lot of the selection that was brain selection was probably simply around the fantastically complicated set of tasks involved in navigating human societies. Note that human brain size on average has decreased over the last 50,000 years. Humans are subject to a lot of different selection pressures.
(Tangent: This is related to how at a very vague level we should expect genetic algorithms to outperform evolution at optimizing tasks. Genetic algorithms can select for narrow task completion goals, rather than select in a constantly changing environment with competition and interaction between the various entities being bred.)
I largely agree with your point about human evolution, but my point was about the laminar cortical circuit which is shared in various forms across the entire mammalian lineage and has an analog in birds.
It’s a building block pattern that appears to have a long evolutionary history.
Yes, but there is a limit to this of course. We are, after all, talking about general intelligence.
It seems you’re arguing that our successors will develop a preference for simulating universes like ours over paradises. If that’s what you’re arguing, then what reason do we have to believe that this is probable?
If their preferences do not change significantly from ours, it seems highly unlikely that they will create simulations identical to our current existence. And out of the vast space of possible ways their preferences could change, selecting that direction in the absence of evidence is a serious case of privileging the hypothesis.
To uploads, yes, but a faithful simulation of the universe, or even a small portion of it, would have to track a lot more variables than the processes of the human minds within it.
Optimal approximate simulation algorithms are all linear with respect to total observer sensory input. This relates to the philosophical issue of observer dependence in QM and whether or not the proverbial unobserved falling tree actually exists.
So the cost of simulating a matrix with N observers is not expected to be dramatically more than simulating the N observer minds alone: C*N for some constant C. The phenomenon of dreams is something of a practical proof.
Variables that aren’t being observed still have to be tracked, since they affect the things that are being observed.
Dreams are not a very good proof of concept, given that they are not coherent simulations of any sort of reality and can be recognized as artificial not only after the fact but also during the dream, with a bit of introspection and training.
In dreams, large amounts of data can be omitted or spontaneously introduced without the dreamer noticing anything is wrong unless they’re lucid. In reality, everything we observe can be examined for signs of its interactions with things that we haven’t observed, and that data adds up to pictures that are coherent and consistent with each other.