I may be wrong, but don’t all distributed systems suffer from diminishing returns in this way ? For example, doubling the number of CPUs in a computing cluster does not allow you to solve your calculations twice as quickly. Your overhead, such as control infrastructure and plain old network latency, increases faster than linearly with every CPU you add, and eventually outgrows the useful processing power you can get out of new CPUs.
This is one of the many reasons why I’m not worried about the Singularity...
Just to point out the obvious, the link itself covers a case of sublinear scaling: cities. So no, not all ‘distributed systems’ so suffer...
Don’t you mean, “superlinear” ? But you’re right, I should’ve read the full linked article before commenting. Now that I’ve read it, though, I am somewhat less than impressed. Here’s one reason for that:
In fact, West’s paper in Science ignited a flurry of rebuttals, in which researchers pointed out all the species that violated the math. West can barely hide his impatience with what he regards as quibbles. “There are always going to be people who say, ‘What about the crayfish?’ ” he says. “Well, what about it? Every fundamental law has exceptions. But you still need the law or else all you have is observations that don’t make sense. And that’s not science. That’s just taking notes.”
Um. If your “fundamental law” has all these exceptions, that’s a good hint that maybe it isn’t as fundamental as you thought. The law of gravity doesn’t have exceptions. And no, it’s not always better to “have the law”. Sometimes it is, for practical reasons, and sometimes it’s better to devise a better law that doesn’t give you so many false positives.
The article goes on to describe the superlinear growth of efficiency in cities, and notes (correctly, IMO) that it cannot be sustained forever:
Because our lifestyle has become so expensive to maintain, every new resource now becomes exhausted at a faster rate. This means that the cycle of innovations has to constantly accelerate, with each breakthrough providing a shorter reprieve...
But I think one point that the article is missing is that cities don’t exist in a vacuum. As a city grows, it requires more food (which can’t be grown efficiently inside the city), more highways (connecting it with its neighbours), etc. If we ignore all of that, we get superlinear scaling; but my guess is that if we include it, we would get sublinear scaling as usual—in terms of overall economic output per single human.
Um. If your “fundamental law” has all these exceptions, that’s a good hint that maybe it isn’t as fundamental as you thought. The law of gravity doesn’t have exceptions. And no, it’s not always better to “have the law”. Sometimes it is, for practical reasons, and sometimes it’s better to devise a better law that doesn’t give you so many false positives.
You’re missing the point too. Even gravity has exceptions—yes, really, this is a standard topic in philosophy of science because the Laws Of Gravity are so clear, yet in practice they are riddled with exceptions and errors. We have errors so large that Newtonians were forced to postulate entire planets to explain them (not all of which turned out as well as Uranus, Neptune, and Pluto), we have errors which took centuries to be winkled out, and of course errors like Mercury which ultimately could be explained only by an entirely new theory.
And we’re talking about real-world statistics: has there ever been a sociology, economics, or biological allometry paper where every single data point was predicted perfectly without any error whatsoever? (If you think this, then perhaps you should consult Tukey and Cohen on how ‘the null hypothesis is always false’.)
If we ignore all of that, we get superlinear scaling; but my guess is that if we include it, we would get sublinear scaling as usual—in terms of overall economic output per single human.
Absolutely; if you measure in certain ways, diminishing returns has clearly set in for humanity. And yet, compared to hunter-gatherers, we might as well be a Singularity.
What does this tell you about the relevance of diminishing returns to Singularity discussions? (Chalmers’s Singularity paper deals with this very question, IIRC, if you are interested in a pre-existing discussion.)
Even gravity has exceptions—yes, really, this is a standard topic in philosophy of science because the Laws Of Gravity are so clear, yet in practice they are riddled with exceptions and errors
In addition to what the others said on this thread, I’d like to say that my main problem was with the author’s attitude, not the accuracy of his proposed law—though the fact that it apparently has glaring holes in it doesn’t really help. When you discover that your law has huge exceptions (such as f.ex. “all crustaceans” or “Mercury”), the thing to do is to postulate hidden planets, or discover relativity, or introduce a term representing dark energy, or something. The thing not to do is to say, “oh well, every law has exceptions, this is good enough for me, case closed ! Let’s pretend that crustaceans don’t exist, we’re done”.
And we’re talking about real-world statistics: has there ever been a sociology, economics, or biological allometry paper where every single data point was predicted perfectly without any error whatsoever?
I’m not sure what you’re referring to; of course, no one expects any line to have a correlation of 1.0 at all times. That’d be silly. However, it is almost equally as silly to take a few data points, and extrapolate them far into the future without any concern for what you’re doing. Ultimately, you can draw a straight line through any two points, but that doesn’t mean that a child will be over 5m tall at age 20 just because he grew 25cm in a year.
Absolutely; if you measure in certain ways, diminishing returns has clearly set in for humanity. And yet, compared to hunter-gatherers, we might as well be a Singularity.
How so ? Perhaps more importantly, if “diminishing returns has clearly set in for humanity” as you say, then what does that tell you for our prospects of bringing about the actual Singularity ?
In addition to what the others said on this thread, I’d like to say that my main problem was with the author’s attitude, not the accuracy of his proposed law—though the fact that it apparently has glaring holes in it doesn’t really help. When you discover that your law has huge exceptions (such as f.ex. “all crustaceans” or “Mercury”), the thing to do is to postulate hidden planets, or discover relativity, or introduce a term representing dark energy, or something. The thing not to do is to say, “oh well, every law has exceptions, this is good enough for me, case closed ! Let’s pretend that crustaceans don’t exist, we’re done”.
Well, that’s useful advice to the Newtonians, alright - ‘hey guys, why did you let the Mercury anomaly linger for decades/centuries? All you had to do was invent relativity! Just ask Bugmaster!’
I wasn’t aware West had retired and was eagerly awaiting his Nobel phone call.
However, it is almost equally as silly to take a few data points, and extrapolate them far into the future without any concern for what you’re doing. Ultimately, you can draw a straight line through any two points, but that doesn’t mean that a child will be over 5m tall at age 20 just because he grew 25cm in a year.
Why do you think the existing dataset is analogous to your silly example?
How so ? Perhaps more importantly, if “diminishing returns has clearly set in for humanity” as you say, then what does that tell you for our prospects of bringing about the actual Singularity ?
Well, that’s useful advice to the Newtonians, alright - ‘hey guys, why did you let the Mercury anomaly linger for decades/centuries? All you had to do was invent relativity! Just ask Bugmaster!’
There’s a difference between acknowledging the problems with your “fundamental law” (once they become apparent, of course) while failing to fix them for “decades/centuries”, and boldly ignoring them because “all laws have exceptions, them’s the breaks”. It’s possible that West is not doing the latter, but the article does imply that this is the case.
Why do you think the existing dataset is analogous to your silly example?
Which dataset are you talking about ? If you mean, the growth of cities, then see below.
How so ? Perhaps more importantly, if “diminishing returns has clearly set in for humanity” as you say, then what does that tell you for our prospects of bringing about the actual Singularity ?
Not much.
Why not ? If humanity’s productive output has recently (relatively speaking) reached the point of diminishing returns, then a). we can no longer extrapolate the growth of productivity in cities by assuming past trends would continue indefinitely, and b). this does not bode well for the Singularity, which would entail an exponential growth of productivity, free of any diminishing returns.
It’s possible that West is not doing the latter, but the article does imply that this is the case.
It didn’t sound like that to me. It sounded like some people had absurd standards for scaling phenomena, and he was rightly dismissing them.
If humanity’s productive output has recently (relatively speaking) reached the point of diminishing returns,
There’s nothing recently about it. Diminishing returns is a pretty general phenomenon which happens in most periods; Tainter documents examples in many ancient settings, and we can find data sets suggesting diminishing returns in the West from long ago. For example, IIRC Murray finds that once you adjust for population growth, scientific achievement has been falling since the 1890s or so.
then a). we can no longer extrapolate the growth of productivity in cities by assuming past trends would continue indefinitely, and b). this does not bode well for the Singularity, which would entail an exponential growth of productivity, free of any diminishing returns.
It doesn’t bode much of anything; I referred you to my list of ‘what diminishing returns does not imply’ for a reason: #1-4 are directly relevant. Diminishing returns does not mean no exponential growth; it does not mean no regime changes, massive accomplishments, breakthroughs, or technologies. It just means diminishing returns; it’s just an observation about one unit of input turning into units of output as compared to the previous units of input and output, nothing more and nothing less.
This is obvious if you take Tainter or Murray or any of the results showing diminishing returns in the past centuries, since those are precisely the centuries in which humanity has done the most extraordinarily well! One could say, with equal justice, that ‘this does not bode well’ for the 20th century; one could say with equal justice in 1950 that diminishing returns bodes poorly for the computer industry because not only do chip fab prices keep on increasing (‘Moore’s second law’), computing power is visibly suffering diminishing returns as it is applied to more and more worthless problems—where once it was used on problems of vital national value (crucial to the survival of the free world and all that is good) worth billions such as artillery tables and H-bomb simulations, now it was being wasted on grad students and businesses.
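To make that compatibility concrete, here is a toy numerical sketch (the functional forms and growth rates are arbitrary assumptions, not a model of any real economy): even when each marginal unit of input yields less output than the last, exponentially growing inputs still produce exponentially growing output.

```python
# Toy model: marginal output per unit of input keeps falling (diminishing
# returns), but if the input itself grows exponentially, total output still
# grows exponentially, just at a lower rate. All numbers are arbitrary.
def total_output(cumulative_input):
    return cumulative_input ** 0.5   # output ~ sqrt(input): strongly diminishing returns

inp = 1.0
for year in range(0, 101, 25):
    print(f"year {year:3d}: input {inp:10.2e}, output {total_output(inp):8.2e}")
    inp *= 1.03 ** 25                # inputs (people, resources) grow at 3%/year
```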
Even gravity has exceptions—yes, really, this is a standard topic in philosophy of science because the Laws Of Gravity are so clear, yet in practice they are riddled with exceptions and errors.
What are you talking about?
I gave multiple examples and specified the field interested in how such a naive formulation is completely wrong; please ask better questions.
No, you did not. Your examples are all consistent with our best current exceptionless theory of gravity (general relativity) and knowledge of the composition of our solar system (Uranus, Neptune, and Pluto). You merely hinted at the existence of additional examples that perplexed the Newtonians. In fact, since our current understanding of gravity is better than the Newtonians’, hinting at the existence of examples that perplexed the Newtonians fails to even suggest a flaw in our best current theory, not to mention suggesting the existence of “exceptions to gravity”. Please give at least one real example.
Nobody brought up relativity as the issue; the fact remains that every theory is incomplete and a work in progress, and a few errors is not disproof especially for a statistical generalization. You would not apply this ultra-high standard of ‘the theory must explain every observation ever in the absence of any further data or modifications’ to anything else discussed on LW, and I do not understand why either you or army1987 think you are adding anything to this discussion about cities exhibiting better scaling than corporations.
You said that gravity has exceptions. I’m not quite sure what that’s supposed to mean, but the only interpretation I could think of for that statement is that our current best theory of gravity (namely, general relativity) fails to predict how gravity behaves in some cases. I did not mean to suggest that any theory must explain every observation correctly to be useful, nor did I mean to imply anything about how well cities and corporations scale. I was merely pointing out that you falsely asserted that you had given examples of exceptions to gravity, when in fact you had only given examples of exceptions to Newtonian gravity as it would operate in a solar system similar but not identical to ours.
I have never heard of any observation showing that gravitation as described by general relativity (and, so long as you aren’t very close to something very massive and aren’t travelling at a sizeable fraction of the speed of light, excellently approximated by Newton’s law) might have “exceptions” on Solar System-scale, except possibly the Pioneer anomaly (for which there is a very plausible candidate explanation) and similar. When I read “errors” I hoped you meant measurement uncertainties, but I can’t make sense of the rest of the paragraph assuming you did.
There are no examples of failures of general relativity in that entire article. So far, of the two of you, only army1987 has given an example of an even slightly perplexing observation.
Yes, Tainter is one of a number of sources which are why I think humanity has seen diminishing returns. I’ve been casually dumping some info in http://www.gwern.net/the-long-stagnation although if we were discussing just books, I think Murray’s Human Accomplishment covers convincingly a much more important kind of diminishing returns compared to Tainter’s focus on resources and basic economic metrics.
(For those interested in the topic, I suggest looking at my link just for the intro bit about 5 propositions that the fact of diminishing returns does not prove; I believe more than one commenter on this page is committing at least one of those 5.)
Restricting the topic to distributed computation, the short answer is “essentially no”. The rule is that you get at best linear returns, not that your returns diminish greatly. There are a lot of problems which are described as “embarrassingly parallel”, in that scaling them out is easy to do with quite low overhead. In general, any processing of a data set which permits it to be broken into chunks which can be processed independently would qualify, so long as you were looking to increase the amount of data processed by adding more processors rather than process the same data faster.
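To make the “embarrassingly parallel” case concrete, here is a minimal sketch (the dataset and the per-chunk function are placeholder assumptions): the workers never communicate with each other, so adding processes mostly just increases how much data gets processed.

```python
# Minimal sketch of an "embarrassingly parallel" workload: each chunk of a
# dataset is processed independently, so adding workers scales throughput
# roughly linearly (the per-chunk work dwarfs the coordination overhead).
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for real per-chunk work: here, just count tokens.
    return sum(len(line.split()) for line in chunk)

def chunked(items, size):
    for i in range(0, len(items), size):
        yield items[i:i + size]

if __name__ == "__main__":
    dataset = ["some line of text"] * 100_000   # hypothetical input
    chunks = list(chunked(dataset, 1_000))
    with Pool() as pool:                         # one worker process per CPU by default
        partials = pool.map(process_chunk, chunks)
    print(sum(partials))
```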
For scalable distributed computation, you use a system design whose total communication overhead rises as O(n log n) or lower. The upper bound here is superlinear, but gets closer to linear the more additional capacity is added, and so scales well enough that with a good implementation you can run out of planet to make the system out of before you get too slow. Such systems are quite achievable.
The DNS system would be an important example of a scalable distributed system; if adding more capacity to the DNS system had substantially diminishing returns, we would have a very different Internet today.
An example I know well enough to walk through in detail is a scalable database in which data is allocated to shards, which manage storage of that data. You need a dictionary server to locate data (DNS-style) and handle moving blocks of it between shards, but this can then be sharded in turn. The result is akin to a really big tree; number of lookups (latency) to find the data rises with the log of the data stored, and the total number of dictionary servers at all levels does not rise faster than the number of shards with Actual Data at the bottom level. Queries can be supported by precomputed indexes stored in the database themselves. This is similar to how Google App Engine’s datastore operates (but much simplified).
With this fairly simple structure, the total cost of all reads/writes/queries theoretically rises superlinearly with the amount of storage (presuming reads/writes/queries and amount of data scale linearly with each other), due to the dictionary server lookups, but only as O(n log(n)). If you were actually building this with current-day commodity hard disks and a conceptually simple on-disk tree, a dictionary server could reasonably store information for ten billion shards (500 bytes × 10 billion ≈ 5 TB); two levels of sharding give you a hundred billion billion data-storing shards, three give a thousand billion billion billion data-storing shards. Five levels, five latency delays, would give you more bottom-level shards than there are atoms on Earth. This is why, while scale will eventually limit an O(n log(n)) architecture, in this case because the cost of communicating with subshards of subshards becomes too high, you can run out of planet first.
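A back-of-the-envelope version of that arithmetic, treating the figures in the paragraph above as assumptions:

```python
# Rough arithmetic behind the shard-tree example above; every figure is an
# order-of-magnitude assumption taken from the comment, not a measurement.
ENTRY_BYTES = 500                    # assumed size of one dictionary entry
DISK_BYTES = 5 * 10**12              # ~5 TB of commodity disk per dictionary server
FANOUT = DISK_BYTES // ENTRY_BYTES   # shards one dictionary server can track: 10**10

for levels in range(1, 6):
    # Each extra level of dictionary servers multiplies the reachable
    # bottom-level shards by FANOUT, at the cost of one more lookup of latency.
    print(f"{levels} lookup(s): ~10^{10 * levels} bottom-level shards")
# 5 lookups already gives ~10^50 shards, on the order of the number of atoms
# in the Earth, i.e. you run out of planet before the log factor bites.
```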
This can be generalised; if you imagine that each shard performs arbitrary work on the data sent to it, and when the data is read back you get the results of the processing on that data, you get a scalable system which does any processing on a dataset that can be done by processing chunks of data independently from one another. Image or voice recognition matching a single sample against a huge dataset would be an example.
This isn’t to trivialise the issues of parallelising algorithms. Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don’t support the same set of queries as a simple MySQL server because a MySQL server implements some queries by iterating all the data, and there’s no known way to perform them in a scalable way. Instead, software using them finds other ways to implement the feature.
However, scalable-until-you-run-out-of-planet distributed systems are quite possible, and there are some scalable distributed systems doing pretty complex tasks. Search engines are the best example which comes to mind of systems which bring data together and do complex synthesis with it. Amazon’s store would be another scalable system which coordinates a substantial amount of real world work.
The only question is whether a (U)FAI specifically can be implemented as a scalable distributed system, and considering the things we know can be divided or done scalably, as well as everything which can be done with somewhat-desynchronised subsystems which correct errors later (or even are just sometimes wrong), it seems quite likely that (assuming one can be implemented at all) it could implement its work in the form of problems which can be solved in a scalable fashion.
I agree with what you are saying about scaling, as exemplified by sharded databases. But I am not convinced that any problem can be sharded that easily; as you yourself have said:
Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don’t support the same set of queries as a simple MySQL server...
This is one reason why even Google’s datastore, AFAIK, does not implement exactly this kind of architecture—though it is still heavily sharded. This type of a datastructure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance. And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency. This is important, because we aren’t dealing with abstract tree nodes, but with physical machines, which have a certain volume (among other things). This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today. This means that you’ll run out of room to scale a lot faster than you run out of atoms of the Earth.
First, examining the dispute over whether scalable systems can actually implement a distributed AI...
This is one reason why even Google’s datastore, AFAIK, does not implement exactly this kind of architecture—though it is still heavily sharded. This type of a datastructure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance.
That’s untrue; Google App Engine’s datastore is not built on exactly this architecture, but is built on one with these scalability properties, and they do not inhibit its operation. It is built on BigTable, which builds on multiple instances of Google File System, each of which has multiple chunk servers. They describe this as intended to scale to hundreds of thousands of machines and petabytes of data. They do not define a design scaling to an arbitrary number of levels, but there is no reason an architecturally similar system like it couldn’t simply add another level and add on another potential roundtrip. I also omit discussion of fault-tolerance, but this doesn’t present any additional fundamental issues for the described functionality.
In actual application, its architecture is used in conjunction with a large number of interchangeable non-data-holding compute nodes which communicate only with the datastore and end users rather than each other, running identical instances of software running on App Engine. This layout runs all websites and services backed by Google App Engine as distributed, scalable software, assuming they don’t do anything to break scalability. There is no particular reliance on “special properties” of the data being stored, merely limited types of searching of the data which is possible. Even this is less limited than you might imagine; full text search of large texts has been implemented fairly recently. A wide range of websites, services, and applications are built on top of it.
The implication of this is that there could well be limitations on what you can build scalably, but they are not all that restrictive. They definitely don’t rule out anything for which you can split the data into independently processed chunks. Looking at GAE some more because it’s a good example of a generalised scalable distributed platform, the software run on the nodes is written in standard Turing-complete languages (Python, Java, and Go) and your datastore access includes read and write by key and by equality queries on specific fields, as well as cursors. A scalable task queue and cron system mean you aren’t dependent on outside requests to drive anything. It’s fairly simple to build any such chunk processing on top of it.
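As an illustration of the chunk-processing pattern being described, here is a hypothetical sketch; the Datastore and TaskQueue classes are invented stand-ins for a platform offering key-based reads/writes plus a task queue, not the actual GAE API.

```python
# Hypothetical sketch of chunk processing on top of a scalable platform that
# offers only key-based reads/writes and a task queue (interfaces invented
# here for illustration; this is not the real GAE API).
class Datastore:
    def __init__(self):
        self._kv = {}
    def get(self, key):
        return self._kv.get(key)
    def put(self, key, value):
        self._kv[key] = value

class TaskQueue:
    def __init__(self):
        self._tasks = []
    def enqueue(self, fn, *args):
        self._tasks.append((fn, args))
    def run_all(self):          # on a real platform these tasks run on many nodes
        while self._tasks:
            fn, args = self._tasks.pop()
            fn(*args)

store, queue = Datastore(), TaskQueue()

def process_chunk(chunk_key):
    data = store.get(chunk_key) or []
    store.put("result:" + chunk_key, sum(data))   # independent per-chunk work

# Fan out: one task per chunk; no task ever needs to talk to another task.
for i in range(1000):
    store.put(f"chunk:{i}", list(range(i)))
    queue.enqueue(process_chunk, f"chunk:{i}")
queue.run_all()
print(store.get("result:chunk:999"))
```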
So as long as an AI can implement its work in such chunks, it certainly can scale to huge sizes and be a scalable system.
And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
And as I demonstrated, O(n log n) is big enough for a Singularity.
And now on whether scalable systems can actually grow big in general...
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency.
Speed of light as an issue is not a problem for building huge systems in general, so long as the number of roundtrips rises as O(n log n) or less, because for any system capable of at least tolerating roundtrips to the other side of the planet (few hundred milliseconds), it doesn’t become more of an issue as a system gets bigger, until you start running out of space on the planet surface to run fibre between locations or build servers.
The GAE datastore is already tolerating latencies sufficient to cover distances between cities to permit data duplication over wide areas, for fault tolerance. If it was to expand into all the space between those cities, it would not have the time for each roundtrip increase until after it had filled all the space between them with more servers.
Google and Amazon are not at all forced to build data centres in different parts of the Earth to reduce latency; this is a misunderstanding. The size of their systems causes no technical performance degradation that forces them to need the latency improvements to end users, or the region-scale fault tolerance, that spread-out datacentres permit. They can just afford it more easily. You could argue there are social/political/legal reasons they need it more, higher expectations of their systems and similar, but these aren’t relevant here. This spreading out is actually largely detrimental to their systems, since spreading out this way increases latency between them, but they can tolerate this.
Heat dissipation, power generation, and network cabling needs all also scale as O(n log n), since computation and communication do and those are the processes which create those needs. Looking at my previous example, the amount of heat output, power needed, and network cabling required per amount of data processed will increase by maybe an order of magnitude in scaling such a system upwards by tens of orders of magnitude, 5x for 40 orders of magnitude in the example I gave. This assumes your base amount of latency is still enough to cover the distance between the most distant nodes (for an Earth bound system, one side of the planet to the other), which is entirely reasonable latency-wise for most systems; a total of 1.5 seconds for a planet-sized system.
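A rough check of those numbers, with the fanout and roundtrip time taken as assumptions from the earlier example:

```python
# Back-of-the-envelope check: per-unit overhead grows with the number of
# dictionary levels (the log factor), and worst-case lookup latency is
# levels x one planetary roundtrip. Both figures below are assumptions.
FANOUT = 10**10        # shards per dictionary server, from the earlier example
ROUNDTRIP_S = 0.3      # assumed worst-case roundtrip, one side of the planet to the other

def levels_needed(shards):
    levels, capacity = 1, FANOUT
    while capacity < shards:
        levels += 1
        capacity *= FANOUT
    return levels

for shards in (10**10, 10**20, 10**50):
    lv = levels_needed(shards)
    print(f"1e{len(str(shards)) - 1} shards: {lv} level(s), "
          f"~{lv}x the per-unit overhead of one level, "
          f"~{lv * ROUNDTRIP_S:.1f} s of lookup latency")
```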
This means that no, these do not become an increasing problem as you make a scalable system expand, any more so than provision of the nodes themselves. You are right in that heat dissipation, power generation, and network cabling mean that you might start to hit problems before literally “running out of planet”, using up all the matter of the planet; that example was intended to demonstrate the scalability of the architecture. You also might run out of specific elements or surface area.
These practical hardware issues don’t really create a problem for a Singularity, though. Clusters exist now with 560k processors, so systems at least this big can be feasibly constructed at reasonable cost. So long as the software can scale without substantial overhead, this is enough unless you think an AI would need even more processors, and that the software could is the point that my planet-scale example was trying to show. You’re already “post Singularity” by the time you seriously become unable to dissipate heat or run cables between any more nodes.
This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today.
HFT systems desire extremely low latency; this is the sole cause of their wish to be close to the exchange and to have various internal scalability limitations in order to improve speed of processing. These issues don’t generalise to typical systems, and don’t get worse at above O(n log n) for typical bigger systems.
It is conceivable that speed of light limitations might force a massive, distributed AI to have high, maybe over a second latency in actions relying on knowledge from all over the planet, if prefetching, caching, and similar measures all fail. But this doesn’t seem like nearly enough to render one at all ineffective.
There really aren’t any rules of distributed systems which says that it can’t work or even is likely not to.
I may be wrong, but don’t all distributed systems suffer from diminishing returns in this way ? For example, doubling the number of CPUs in a computing cluster does not allow you to solve your calculations twice as quickly. Your overhead, such as control infrastructure and plain old network latency, increases faster than linearly with every CPU you add, and eventually outgrows the useful processing power you can get out of new CPUs.
Asynchronous computers could easily grow to a planetary scale. Parallel computing rarely gets linear scalability—but it doesn’t necessarily flatten off quickly at small sizes, either.
Even on serial systems, most AI problems are at least NP-hard, which are strongly conjectured to scale not just superlinearly, but also superpolynomially (exponentially, as far as we know) in terms of required computational resources vs problem instance size.
In many applications it can be the case that typical instances of these problems have special, domain-specific structure that can be exploited to construct domain-specific algorithms and heuristics that are more efficient than the general purpose ones, in some cases we can even get polynomial time complexity, but this requires lots of domain-aware engineering, and even sheer trial-and-error experimentation.
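A toy example of that trade-off (nothing AI-specific, just the generic-versus-structure-exploiting contrast): subset sum is exponential for the brute-force approach, but a dedicated pseudo-polynomial algorithm becomes usable once you can assume small integer weights.

```python
# The same problem, solved generically (exponential) and by exploiting
# domain structure (pseudo-polynomial dynamic programming).
from itertools import combinations

def subset_sum_bruteforce(weights, target):
    # Generic approach: try all 2^n subsets.
    return any(sum(c) == target
               for r in range(len(weights) + 1)
               for c in combinations(weights, r))

def subset_sum_dp(weights, target):
    # Domain-specific approach: O(n * target) instead of O(2^n), usable only
    # because we can assume small non-negative integer weights.
    reachable = {0}
    for w in weights:
        reachable |= {s + w for s in reachable if s + w <= target}
    return target in reachable

weights = [3, 34, 4, 12, 5, 2]
print(subset_sum_bruteforce(weights, 9), subset_sum_dp(weights, 9))  # True True
```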
The idea that an efficient domain-agnostic silver-bullet algorithm could arise pretty much out of nowhere, from some kind of “recursive self-improvement” process with little or no interaction with the environment, is not based on anything we know from either theoretical or empirical computer science. In fact, it is well known that meta-optimization is typically orders of magnitude more difficult than domain-level optimization.
If an AGI is ever built, it will be a huge collection of fairly domain-specific algorithms and heuristics, much like the human brain is a huge collection of fairly domain-specific modules. Such a thing will not arise in a quick “FOOM”, it will not improve quickly and will be limited in how much it will be ever able to improve: once you find the best algorithm for a certain problem you can’t find a better one, and certain problems are most likely going to stay hard even with the best algorithms.
The “intelligence explosion” idea seems to be based on a naive understanding of computational complexity (e.g. Good 1965) that largely predates the discovery of the main results of complexity theory, like the Cook-Levin theorem (1971) and Karp’s 21 NP-Complete problems (1972).
I agree with everything you’ve said, but, to be fair, we’re talking about different things. My claim was not about the complexity of problems, but the scaling of hardware—which, as far as I know, scales sublinearly. This means that doubling the size of your computing cluster will allow you to solve the same exact problem less than twice as fast; and that eventually you’ll hit the point of diminishing returns where adding more machines simply isn’t worth it.
You’re saying, on the other hand, that doubling your processing power will not necessarily allow you to solve problems that are twice as interesting; in most cases, it will only allow you to add one more city to the traveling salesman’s itinerary (metaphorically speaking).
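For what it’s worth, the standard way to formalize the hardware half of this is Amdahl’s law for a fixed-size problem: if some fraction of the work is inherently serial (coordination, communication), doubling the machines gives you less than double the speedup, and the total speedup is capped. A small illustration, with the serial fraction as an assumed parameter:

```python
# Amdahl's law: speedup(N) = 1 / (s + (1 - s)/N), where s is the fraction of
# the job that cannot be parallelised. Even a small serial fraction caps the
# benefit of adding machines; s = 0.05 is an arbitrary illustrative value.
def amdahl_speedup(n_processors, serial_fraction=0.05):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)

for n in (1, 2, 4, 16, 256, 4096):
    print(f"{n:5d} processors -> {amdahl_speedup(n):5.1f}x speedup")
# Doubling from 1 to 2 gives ~1.9x; going from 256 to 4096 barely helps,
# because the speedup can never exceed 1/0.05 = 20x.
```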
There is still room for weak super-intelligence, where the AI has human intelligence, only faster. (Example: an upload with sufficient computing power — as far as I know, brains work in a quite massively parallel fashion, and therefore so could simulations of them).
Seriously, if I could upload myself into a botnet that would let each instance of me think 10 times faster than my meat-ware, I would probably take over the world in about 1 to 10 years. A versatile team of competent people? Less than 6 months. (Obvious path to do this: work for money, build and buy companies, then gather financial, lobbying, or military power. Better path to do this: think about it for 1 subjective year before proceeding.)
My point is, the AI doesn’t need to be vastly superhuman to take over the world very quickly. Even without the FOOM, the AGI can still be incredibly dangerous. Imagine something like the uploads above, only it can work 24/7 at full capacity (no sleep, no leisure time, no akrasia).
There is still room for weak super-intelligence, where the AI has human intelligence, only faster. (Example: an upload with sufficient computing power — as far as I know, brains work in a quite massively parallel fashion, and therefore so could simulations of them).
Maybe. Today, even with our best supercomputers we can’t simulate a rat brain in real time.
Seriously, if I could upload myself into a botnet that would let each instance of me think 10 times faster than my meat-ware, I would probably take over the world in about 1 to 10 years.
You would be able to work as 10 people, maybe a little more, but probably less than 30. I don’t know how efficient you are, but I doubt that would be enough to take over the world. And why wouldn’t other people have access to the same technology?
Even if you managed to become world dictator, you would only stay in power as long as you had broad political support. Screw up something and you’ll end up hanging from your power cord.
My point is, the AI doesn’t need to be vastly superhuman to take over the world very quickly. Even without the FOOM, the AGI can still be incredibly dangerous. Imagine something like the uploads above, only it can work 24/7 at full capacity (no sleep, no leisure time, no akrasia).
What is it going to do? Secretly repurpose the iPhone factories in China to make Terminators?
I said botnet. That means dozens, thousands, or millions of me simultaneously working at 10 times human speed¹, and since they are instances of me, they presumably have the same goals. How would you stop that from achieving world domination, short of uploading yourself?
[1] Assuming that many personal computers are powerful enough, and can be corrupted. A slower course of action would be to buy a data-centre first, work, then buy more data-centres, and duplicate myself exponentially from that.
I said botnet. That means dozens, thousands, or millions of me simultaneously working at 10 times human speed¹, and since they are instances of me, they presumably have the same goals.
That doesn’t mean that they would necessarily cooperate, especially as they diverge. They would be more like identical twins.
How would you stop that from achieving world domination, short of uploading yourself?
Releasing a security patch? Seizing all the funds you obtained by your illegal activities? Banning use of any hardware that could host you until a way to avoid such things is found?
A slower course of action would be to buy a data-centre first, work, then buy more data-centres, and duplicate myself exponentially from that.
Assuming that using these data centers to run copies of you is the most economically productive use of them, rather than, say, running copies of other people, or cow-clicker games.
That doesn’t mean that they would necessarily cooperate, especially as they diverge. They would be more like identical twins.
Wait a minute: would you defect? Sure, there would be some divergence, but do you really think it would result in a significant divergence of goals, even if you had a plan and were an adult by the time you fork? Okay, it can happen, and is probably worth taking specific precautions. I don’t think this is a show stopper however, and I’m not sure it would render me any less dangerous.
Releasing a security patch?
That may not be enough:
I would probably man-in-the-middle automatic updates
Many people won’t erase their hard drive or otherwise patch their machine manually
I may convince some people to let me run (I could work for them for instance).
If I’m stealthy enough, it may take some time before I’m discovered at all (it happened with actual computer viruses).
If software continues the way it is now (200 Million lines of code for systems that could fit in 20 thousands), security bugs won’t all be patched in advance. The reliability of our computer needs to go waay up before botnets become impossible.
Seizing all the funds you obtained by your illegal activities?
Good luck with that one. Obviously, I would have many, many little bank accounts, managed separately and in parallel, under many different identities. You would have to spot my illegal activities one by one to seize the funds. Plus, I may do legal activities as well.
Banning use of any hardware that could host you until a way to avoid such things is found?
That one is excellent. We should watch out for computing overhang, however, and try and estimate how much computing power an upload would need before the software is developed.
A final note: If I really had the possibility to upload myself, one of my first moves would be to propose to SIAI and CFAR that they upload with me (now that we can duplicate Eliezer…). I trust them more than I trust me for a Friendly Takeover. But if a Big Bad or a Well Intentioned Extremist has access to that first…
Wait a minute: would you defect? Sure, there would be some divergence, but do you really think it would result in a significant divergence of goals, even if you had a plan and were an adult by the time you fork?
Even if their goals stay substantially the same, it wouldn’t mean that they would naturally cooperate, especially when their main goal is world domination. Hell, it’s already non-trivial for a single person to coordinate with future selves, resulting in all kinds of ego-dystonic behaviors: impulsiveness, akrasia, etc. Coordinating with thousands of copies of yourself would be only marginally easier than coordinating with thousands of strangers.
We are not talking about some ideal “Prisoner’s dilemma with mind-clone” scenario. After the mind states of your copies diverge a little bit, and that would happen very quickly as you spread your copies to different machines, they become effectively different people: you wouldn’t be able to predict them and they wouldn’t be able to predict you.
I would probably man-in-the-middle automatic updates
Hacking all the routers? Good luck with that. And BTW routers can also be updated. Manually.
Many people won’t erase their hard drive or otherwise patch their machine manually
Because they are lazy and they would prefer to live under world dictatorship.
I may convince some people to let me run (I could work for them for instance).
Then you are their employee, not their dominator.
If I’m stealthy enough, it may take some time before I’m discovered at all (it happened with actual computer viruses).
But if you are to dominate the world, you would have to eventually reveal yourself. What do you think would happen next?
If software continues the way it is now (200 Million lines of code for systems that could fit in 20 thousands), security bugs won’t all be patched in advance. The reliability of our computer needs to go waay up before botnets become impossible.
Botnets are certainly possible and they are indeed used for nefarious purposes, but world domination? Nope.
Good luck with that one. Obviously, I would have many, many little bank accounts, managed separately and in parallel, under many different identities. You would have to spot my illegal activities one by one to seize the funds.
As Bugmaster said, you would only be able to make small purchases, not buy a satellite or an army.
Moreover, obtaining and managing lots of fake or stolen identities, creating bank accounts without physically showing up at the bank or using stolen bank accounts, is not something that tend to go unnoticed. The more you have, the more likely that you get caught, exponentially so.
Plus, I may do legal activities as well.
Under multiple fake identities operated from a botnet of hacked computers? Hardly so.
We should watch out for computing overhang, however, and try and estimate how much computing power an upload would need before the software is developed.
Software tends to march right behind hardware, exploiting it close to its maximum potential. Computing overhang is unlikely.
Anyway, I wasn’t proposing any luddite advance ban. If some brain upload, or AI, or whatever tries to take over the world by hacking the Internet and other countermeasures fail, governments could always ban use of the hardware that thing needs to run. If that also fails, the next step would be physical destruction.
But seriously, we are discussing hacking as in the plot of some bad sci-fi action flick. Computer security doesn’t work like that in the real world.
A final note: If I really had the possibility to upload myself, one of my first moves would be to propose to SIAI and CFAR that they upload with me (now that we can duplicate Eliezer…). I trust them more than I trust me for a Friendly Takeover.
You mean the guy who would choose dust specks over torture and who claims on his OKCupid profile that he’s a sadist? Yeah, I’d totally trust him in charge of the world. Now, I’ve other matters to attend to… that EMP bomb doesn’t build itself… :D
We are not talking about some ideal “Prisoner’s dilemma with mind-clone” scenario. After the mind states of your copies diverge a little bit, and that would happen very quickly as you spread your copies to different machines, they become effectively different people: you wouldn’t be able to predict them and they wouldn’t be able to predict you.
You really think you would diverge that quickly?
You mean the guy who would choose dust specks over torture and who claims on his OKCupid profile that he’s a sadist? Yeah, I’d totally trust him in charge of the world.
Man in the middle: I just meant intercepting automatic updates at the level of the computer I’m in. Trojan todo list n°7: once installed and running, I will intercept all communications to and from this computer. I wouldn’t want Norton updating behind my back. Now, try and hack the routers in the backbone, that’s something I didn’t think about…
Employee vs dominator: I obviously intend to double cross my employers, eventually.
Revealing myself: that one needs to be carefully thought through. Hopefully, by the time I reveal myself, I will have sufficient blackmail power. Having a sufficient number of physical robots can also help.
Zillions of fake IDs, yet staying stealthy: well, I do expect a fair number of my identities to be exposed. This should pose no problem to the others, however, provided they do not visibly communicate with each other (at first).
Legal activities: my meat instance could buy a few computers, rent remote servers etc. I doubt I would be incapable of running at least a successful business from there. And from there, buy even more computing power. This could be done in parallel with the illegal activities.
Computing (no) overhang: this one is the single reason why I do agree that without a FOOM of some kind, actual world domination is unlikely: there will be multiple competing uploads, and this should end with a Hansonian scenario. Given that such a world is closer to Hell than Heaven (to me at least), that still counts as an Existential Blunder. On the bright side, we may see this coming. That said, I still do believe full blown intelligence explosion is likely.
Note that overall, your objections are actually valuable advice. And that gives me some insight about what my very first move should be: gathering such objections, and trying to find counters or workarounds. And now that you made quite clear that any path to world domination is long, complicated, and therefore nearly certain to fail, I should run multiple schemes in parallel. Surely one of them will actually work?
Obviously, I would have many, many little bank accounts, managed separately and in parallel, under many different identities.
I believe that this would severely limit your financial throughput. You would be able to buy lots of little things, whose total cost is quite significant—for example, you could buy yourself a million cheap PCs, each costing $1000. But you would not be able to buy a single expensive thing (at least, not without exposing yourself to instant retribution), such as a satellite costing $1e9.
Currently, there are ways to create companies anonymously. This is preventing (or at least slowing down to a crawl) retribution right now. If all this company apparently does is buying a few satellites, it won’t be at great risk.
A versatile team of competent people? Less than 6 months.
Do you mean, competent people who are thinking 10 times faster than biological humans, or what ? This seems a bit of a stretch. There currently exist tons of frighteningly competent people in all kinds of positions of power in the world, and yet, they do not control it (unless you believe in conspiracy theories).
Obvious path to do this: work for money, build and buy companies, then gather financial, lobbying, or military power. Better path to do this: think about it for 1 subjective year before proceeding.
If it was this easy, some biological human (or a team of such humans) would’ve done it already, in 10 to 50 years or however long it takes. In fact, a few humans have managed to take over individual countries in about as much time. However, as things stand now, there’s simply no clear path to world domination. Political and military power gets much more difficult to gather the more of it you have. Even superpowers such as USA or China cannot dictate terms to the rest of the world.
Furthermore, my point was that uploading yourself to 10 machines will not allow you to think 10 times as fast. With every machine you add, your speed gains would become progressively smaller. You would still think much faster than an ordinary human, of course.
Do you mean, competent people who are thinking 10 times faster than biological humans, or what ? This seems a bit of a stretch.
I mean exactly that. I’d be very surprised if ultimately, neuromorphic AIs would be impossible to run significantly faster than meat-ware. Because our brain is massively parallel, and because current microprocessors have massively faster serial speed than neurons. Now our brains aren’t fully parallel, so I assumed an arbitrary speed-up limit. I said 10 times, but it would probably still be incredibly dangerous at 2 or 3, or even lower.
Now do not forget the key word here: botnet. The team is supposed to duplicate itself many times over before trying to take over the world.
If it was this easy, some biological human (or a team of such humans) would’ve done it already, in 10 to 50 years or however long it takes.
I don’t think so, because uploads have significant advantages over meat-ware.
Low cost of living. In a world where every middle class home can afford sufficient computing power for an upload (required to turn me into a botnet). Now try to beat my prices.
Being many copies of the same few original brains. It means TDT works better, and defection is less likely. This should solve the following:
Even superpowers such as USA or China cannot dictate terms to the rest of the world.
Because once the self-duplicating team has independently taken economic control of most of the world, it is easy for it to accept the domination of one instance (I would certainly pre-commit to that). Now for the rest of humanity to accept such dominance, the uploads only have to use the resources they acquired for the individual perceived benefit of the meat bags.
Yep, that would be a full blown global conspiracy. While it’s probably forever out of the reach of meat bags, I think a small team of self-replicating uploads can pull it off quite easily.
Hansonian tactics, which can further the productivity of the team, and therefore market power. (One has to be very motivated, or possibly crazy.)
Temporary mass duplication followed by the “termination” of every instance but one. The surviving instance can have much subjective free time, while the proportion of leisure computing stays very small.
Save and reload of snapshots which are in a particularly good mood (and therefore very productive). Excellent for beating akrasia.
Training of one instance per discipline, then mass duplication.
Data-centres. The upload team can collaborate with or buy processor manufacturers, and build data-centres for more and more uploads to work on whatever is needed. This could further reduce the cost of living.
Now, I did make an unreasonable assumption: that only the original team would have those advantages. Most probably, there will be several such teams, possibly with different goals. The most likely result (without FOOM) is then a Hansonian outcome. That’s no world domination, but I think it is just as dangerous (I would hate this world).
Finally, there is also the possibility of a de-novo AGI which would be just as competent as the best humans at most endeavours, though no faster. We already have an existence proof, so I think this is believable. I think such an AI would be even more dangerous than the uploaded team above.
I’d be very surprised if ultimately, neuromorphic AIs would be impossible to run significantly faster than meat-ware.
So would I. However, given our current level of technological development, I’d be very surprised if we had any kind of a neuromorphic AI at all in the near future (say, in the next 50 years). Still, I do agree with you in principle.
I said 10 times, but it would probably still be incredibly dangerous at 2 or 3, or even lower.
There are tons of biological people alive today who are able to come up with solutions to problems 2x to 3x faster than you and me. They do not rule the world. To be fair, I doubt that there are many people—if any—who think 10x faster.
Because once the self-duplicating team has independently taken economic control of most of the world...
I doubt that you will be able to achieve that; that was my whole point. In fact, I have trouble envisioning what “economic control of most of the world” even means. What does it mean to you ?
In addition to the above, your botnet would face several significant threats, both external and internal:
Meatbags would strive to shut it down; not because they suspect it of being an evil conspiracy, but because they’d get tired of it sucking away their resources. Modern malware botnets suffer this fate often, though there’s always someone willing to rebuild them.
If your botnet becomes a serious threat (much worse than current real-world botnets), hardware manufacturers will implement security measures, such as SecureBoot, to prevent it from spreading. Currently, such measures are driven by the entertainment industry.
The super-fast instances of you would have to communicate with each other, and they’d only be able to do so through very slow (relatively speaking) network links. Google and Amazon are solving this problem by building more and more local datacenters. Real botnets aren’t solving the problem at all because their instances don’t need to talk to each other all that much.
How would you feel, right now, if your twin pointed a gun at your head with the intent to kill you “for the greater good” ? This is how your instances will feel when you attempt to shut them down to prevent akrasia.
Why are you taking over the world in the first place ? Chances are that whatever your ultimate goal is, it could be accomplished even sooner by taking over the botnet. Every instance of you will eventually realize this, with predictable results.
These are just some problems off the top of my head; the list is far from exhaustive.
I may be wrong, but don’t all distributed systems suffer from diminishing returns in this way ? For example, doubling the number of CPUs in a computing cluster does not allow you to solve your calculations twice as quickly. Your overhead, such as control infrastructure and plain old network latency, increases faster than linearly with every CPU you add, and eventually outgrows the useful processing power you can get out of new CPUs.
This is one of the many reasons why I’m not worried about the Singularity...
Just to point out the obvious, the link itself covers a case of sublinear scaling: cities. So no, not all ‘distributed systems’ so suffer...
Don’t you mean, “superlinear” ? But you’re right, I should’ve read the full linked article before commenting. Now that I’d read it, though, I am somewhat less than impressed. Here’s one reason for that:
Um. If your “fundamental law” has all these exceptions, that’s a good hint that maybe it isn’t as fundamental as you thought. The law of gravity doesn’t have exceptions. And no, it’s not always better to “have the law”. Sometimes it is, for practical reasons, and sometimes it’s better to devise a better law that doesn’t give you so many false positives.
The article goes on to describe the superlinear growth of efficiency in cities, and notes (correctly, IMO) that it cannot be sustained forever:
But I think one point that the article is missing is that cities don’t exist in a vacuum. As a city grows, it requires more food (which can’t be grown efficiently inside the city), more highways (connecting it with its neighbours), etc. If we ignore all of that, we get superlinear scaling; but my guess is that if we include it, we would get sublinear scaling as usual—in terms of overall economic output per single human.
You’re missing the point too. Even gravity has exceptions—yes, really, this is a standard topic in philosophy of science because the Laws Of Gravity are so clear, yet in practice they are riddled with exceptions and errors. We have errors so large that Newtonians were forced to postulate entire planets to explain them (not all of which turned out as well as Uranus, Neptune, and Pluto), we have errors which took centuries to be winkled out, and of course errors like Mercury which ultimately could be explained only by an entirely new theory.
And we’re talking about real-world statistics: has there ever been a sociology, economics, or biological allometry paper where every single data point was predicted perfectly without any error whatsoever? (If you think this, then perhaps you should consult Tukey and Cohen on how ‘the null hypothesis is always false’.)
Absolutely; if you measure in certain ways, diminishing returns has clearly set in for humanity. And yet, compared to hunter-gatherers, we might as well be a Singularity.
What does this tell you about the relevance of diminishing returns to Singularity discussions? (Chalmers’s Singularity paper deals with this very question, IIRC, if you are interested in a pre-existing discussion.)
In addition to what the others said on this thread, I’d like to say that my main problem was with the author’s attitude, not the accuracy of his proposed law—though the fact that it apparently has glaring holes in it doesn’t really help. When you discover that your law has huge exceptions (such as f.ex. “all crustaceans” or “Mercury”), the thing to do is to postulate hidden planets, or discover relativity, or introduce a term representing dark energy, or something. The thing not to do is to say, “oh well, every law has exceptions, this is good enough for me, case closed ! Let’s pretend that crustaceans don’t exist, we’re done”.
I’m not sure what you’re referring to; of course, no one expects any line to have a correlation of 1.0 at all times. That’d be silly. However, it is almost equally as silly to take a few data points, and extrapolate them far into the future without any concern for what you’re doing. Ultimately, you can draw a straight line through any two points, but that doesn’t mean that a child will be over 5m tall at age 20 just because he grew 25cm in a year.
How so ? Perhaps more importantly, if “diminishing returns has clearly set in for humanity” as you say, then what does that tell you for our prospects of bringing about the actual Singularity ?
Well, that’s useful advice to the Newtonians, alright - ‘hey guys, why did you let the Mercury anomaly linger for decades/centuries? All you had to do was invent relativity! Just ask Bugmaster!’
I wasn’t aware West had retired and was eagerly awaiting his Nobel phone call.
Why do you think the existing dataset is analogous to your silly example?
Not much.
There’s a difference between acknowledging the problems with your “fundamental law” (once they become apparent, of course) while failing to fix them for “decades/centuries”, vs. boldly ignoring them because “all laws have exceptions, them’s the breaks”. It’s possible that West is not doing the latter, but the article does imply that this is the case.
Which dataset are you talking about ? If you mean the growth of cities, then see below.
Why not ? If humanity’s productive output has recently (relatively speaking) reached the point of diminishing returns, then a). we can no longer extrapolate the growth of productivity in cities by assuming past trends would continue indefinitely, and b). this does not bode well for the Singularity, which would entail an exponential growth of productivity, free of any diminishing returns.
It didn’t sound like that to me. It sounded like some people had absurd standards for scaling phenomena, and he was rightly dismissing them.
There’s nothing recent about it. Diminishing returns is a pretty general phenomenon which happens in most periods; Tainter documents examples in many ancient settings, and we can find data sets suggesting diminishing returns in the West from long ago. For example, IIRC Murray finds that once you adjust for population growth, scientific achievement has been falling since the 1890s or so.
It doesn’t bode much of anything; I referred you to my list of ‘what diminishing returns does not imply’ for a reason: #1-4 are directly relevant. Diminishing returns does not mean no exponential growth; it does not mean no regime changes, massive accomplishments, breakthroughs, or technologies. It just means diminishing returns; it’s just an observation about how one unit of input turns into units of output as compared to the previous unit of input and output, nothing more and nothing less.
This is obvious if you take Tainter or Murray or any of the results showing any diminishing returns in the past centuries, since those are precisely the centuries in which humanity has done the most extraordinarily well! One could say, with equal justice, that ‘this does not bode well’ for the 20th century; one could say with equal justice in 1950 that diminishing returns bodes poorly for the computer industry, because not only do chip fab prices keep increasing (‘Moore’s second law’), but computing power is visibly suffering diminishing returns as it is applied to more and more worthless problems—where once it was used on problems of vital national value (crucial to the survival of the free world and all that is good) worth billions, such as artillery tables and H-bomb simulations, now it was being wasted on grad students and businesses.
What are you talking about?
I gave multiple examples and named the field that studies why such a naive formulation is completely wrong; please ask better questions.
No, you did not. Your examples are all consistent with our best current exceptionless theory of gravity (general relativity) and knowledge of the composition of our solar system (Uranus, Neptune, and Pluto). You merely hinted at the existence of additional examples that perplexed the Newtonians. In fact, since our current understanding of gravity is better than the Newtonians’, hinting at the existence of examples that perplexed the Newtonians fails to even suggest a flaw in our best current theory, not to mention suggesting the existence of “exceptions to gravity”. Please give at least one real example.
Nobody brought up relativity as the issue; the fact remains that every theory is incomplete and a work in progress, and a few errors is not disproof especially for a statistical generalization. You would not apply this ultra-high standard of ‘the theory must explain every observation ever in the absence of any further data or modifications’ to anything else discussed on LW, and I do not understand why either you or army1987 think you are adding anything to this discussion about cities exhibiting better scaling than corporations.
You said that gravity has exceptions. I’m not quite sure what that’s supposed to mean, but the only interpretation I could think of for that statement is that our current best theory of gravity (namely, general relativity) fails to predict how gravity behaves in some cases. I did not mean to suggest that any theory must explain every observation correctly to be useful, nor did I mean to imply anything about how well cities and corporations scale. I was merely pointing out that you falsely asserted that you had given examples of exceptions to gravity, when in fact you had only given examples of exceptions to Newtonian gravity as it would operate in a solar system similar but not identical to ours.
I saw what sounded to me like an extraordinary claim (though it turns out I misunderstood you) so I went WTF.
I have never heard of any observation showing that gravitation as described by general relativity (and, so long as you aren’t very close to something very massive and aren’t travelling at a sizeable fraction of the speed of light, excellently approximated by Newton’s law) might have “exceptions” on Solar System-scale, except possibly the Pioneer anomaly (for which there is a very plausible candidate explanation) and similar. When I read “errors” I hoped you meant measurement uncertainties, but I can’t make sense of the rest of the paragraph assuming you did.
http://en.wikipedia.org/wiki/Philosophy_of_science#Duhem-Quine_thesis may help you a little bit. You should probably read the entire article, since you seem to think there were no errors or exceptions, and that some exceptions could disprove a power law.
I think I know what you mean, but if I’m right, “gravity has exceptions” is, let’s say, a very bizarre way of putting it.
EDIT: yeah, you meant what I thought you meant.
There are no examples of failures of general relativity in that entire article. So far, of the two of you, only army1987 has given an example of an even slightly perplexing observation.
Why should I give one? I never brought up relativity, army1987 did.
You brought up the Laws Of Gravity (capitals yours), which among insiders are known as the Einstein field equations of general relativity.
This seems serendipitous:
http://lesswrong.com/r/discussion/lw/g62/link_the_collapse_of_complex_societies/
Yes, Tainter is one of a number of sources which are why I think humanity has seen diminishing returns. I’ve been casually dumping some info in http://www.gwern.net/the-long-stagnation although if we were discussing just books, I think Murray’s Human Accomplishment covers convincingly a much more important kind of diminishing returns compared to Tainter’s focus on resources and basic economic metrics.
(For those interested in the topic, I suggest looking at my link just for the intro bit about 5 propositions that the fact of diminishing returns does not prove; I believe more than one commenter on this page is committing at least one of those 5.)
Restricting the topic to distributed computation, the short answer is “essentially no”. The rule is that you get at best linear returns, not that your returns diminish greatly. There are a lot of problems which are described as “embarrassingly parallel”, in that scaling them out is easy to do with quite low overhead. In general, any processing of a data set which permits it to be broken into chunks which can be processed independently would qualify, so long as you were looking to increase the amount of data processed by adding more processors rather than process the same data faster.
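To make “embarrassingly parallel” concrete, here’s a minimal Python sketch (the chunking scheme and the toy process_chunk function are purely illustrative, not anyone’s actual system): the workers never talk to each other, so throughput grows with the number of processors until you run out of independent chunks.

```
# Minimal sketch of an "embarrassingly parallel" workload: each chunk is
# processed with no data from any other chunk, so adding worker processes
# adds capacity with essentially no coordination overhead.
from multiprocessing import Pool

def process_chunk(chunk):
    # Stand-in for any per-chunk work that needs nothing from other chunks.
    return sum(x * x for x in chunk)

def split_into_chunks(data, chunk_size):
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

if __name__ == "__main__":
    data = list(range(1_000_000))
    chunks = split_into_chunks(data, 10_000)
    with Pool() as pool:                 # one worker process per CPU by default
        partial_results = pool.map(process_chunk, chunks)
    print(sum(partial_results))
```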
For scalable distributed computation, you use a system design whose total communication overhead rises as O(n log n) or lower. That upper bound is superlinear, but only barely: the overhead per unit of capacity grows only logarithmically as capacity is added, so with a good implementation you can run out of planet to make the system out of before you get too slow. Such systems are quite achievable.
The DNS system would be an important example of a scalable distributed system; if adding more capacity to the DNS system had substantially diminishing returns, we would have a very different Internet today.
An example I know well enough to walk through in detail is a scalable database in which data is allocated to shards, which manage storage of that data. You need a dictionary server to locate data (DNS-style) and handle moving blocks of it between shards, but this can then be sharded in turn. The result is akin to a really big tree; the number of lookups (latency) to find the data rises with the log of the data stored, and the total number of dictionary servers at all levels does not rise faster than the number of shards with Actual Data at the bottom level. Queries can be supported by precomputed indexes stored in the database itself. This is similar to how Google App Engine’s datastore operates (but much simplified).
With this fairly simple structure, the total cost of all reads/writes/queries theoretically rises superlinearly with the amount of storage (presuming read/write/queries and amount of data scale linearly with each other), due to the dictionary server lookups, but only as O(n log(n)). If you were trying, with current day commodity hard disks and a conceptually simple on-disk tree, a dictionary server could reasonably store information for ten billion shards (500 bytes × 10 billion = ~5 TB), two levels of sharding giving you a hundred billion billion data-storing shards, three giving a thousand billion billion billion data-storing shards. Five levels, five latency delays, would give you more bottom-level shards than there are atoms on Earth. This is why, while scalability will eventually limit an O(n log(n)) architecture, in this case because the cost of communicating with subshards of subshards becomes too high, you can run out of planet first.
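As a sanity check on those numbers, here’s a small back-of-the-envelope script; the 500-byte entry size and the ten-billion-entry fanout per dictionary server are the figures from the paragraph above, and the rest is just exponentiation.

```
# Back-of-the-envelope check of the sharding arithmetic above: each dictionary
# server indexes ~10 billion shards at ~500 bytes per entry, and every extra
# level multiplies the number of addressable bottom-level shards by that fanout
# while adding only one more lookup (one more round trip).
FANOUT = 10**10       # shards addressable per dictionary server
ENTRY_BYTES = 500     # bytes per shard entry

print(f"index per dictionary server: ~{FANOUT * ENTRY_BYTES / 1e12:.0f} TB")  # ~5 TB

for levels in range(1, 6):
    shards = FANOUT ** levels
    print(f"{levels} level(s) -> {shards:.0e} bottom-level shards, {levels} lookups per request")
# 2 levels -> 1e+20 shards, 3 -> 1e+30, 5 -> 1e+50 (roughly the number of atoms
# in the Earth), i.e. depth, and hence per-request latency, grows only with the
# logarithm of capacity.
```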
This can be generalised; if you imagine that each shard performs arbitrary work on the data sent to it, and when the data is read back you get the results of the processing on that data, you get a scalable system which does any processing on a dataset that can be done by processing chunks of data independently from one another. Image or voice recognition matching a single sample against a huge dataset would be an example.
This isn’t to trivialise the issues of parallelising algorithms. Figuring out a scalable equivalent to a non-parallel algorithm is hard. Scalable databases, for example, don’t support the same set of queries as a simple MySQL server because a MySQL server implements some queries by iterating all the data, and there’s no known way to perform them in a scalable way. Instead, software using them finds other ways to implement the feature.
However, scalable-until-you-run-out-of-planet distributed systems are quite possible, and there are some scalable distributed systems doing pretty complex tasks. Search engines are the best example which comes to mind of systems which bring data together and do complex synthesis with it. Amazon’s store would be another scalable system which coordinates a substantial amount of real world work.
The only question is whether a (U)FAI specifically can be implemented as a scalable distributed system, and considering the things we know can be divided or done scalably, as well as everything which can be done with somewhat-desynchronised subsystems which correct errors later (or even are just sometimes wrong), it seems quite likely that (assuming one can be implemented at all) it could implement its work in the form of problems which can be solved in a scalable fashion.
I agree with what you are saying about scaling, as exemplified by sharded databases. But I am not convinced that any problem can be sharded that easily; as you yourself have said:
This is one reason why even Google’s datastore, AFAIK, does not implement exactly this kind of architecture—though it is still heavily sharded. This type of data structure does not easily lend itself to purely general computation, either, since it relies on precomputed indexes, and generally exploits some very specific property of the data that is known in advance. And, as you also mentioned, even with these drastic tradeoffs you still get O(n log(n)).
You mention Amazon (in addition to Google) as one example of a massively distributed system, but note that both Google and Amazon are already forced to build redundant data centers in separate areas of the Earth, in order to reduce network latency. This is important, because we aren’t dealing with abstract tree nodes, but with physical machines, which have a certain volume (among other things). This means that, even in an absolutely ideal situation where we can ignore power, heat dissipation, and network congestion, you will still run into the speed of light as a limiting factor. In fact, high-frequency trading systems are already running up against this limit even today. This means that you’ll run out of room to scale a lot faster than you run out of atoms of the Earth.
First, examining the dispute over whether scalable systems can actually implement a distributed AI...
That’s untrue; Google App Engine’s datastore is not built on exactly this architecture, but is built on one with these scalability properties, and they do not inhibit its operation. It is built on BigTable, which builds on multiple instances of Google File System, each of which has multiple chunk servers. They describe this as intended to scale to hundreds of thousands of machines and petabytes of data. They do not define a design scaling to an arbitrary number of levels, but there is no reason an architecturally similar system couldn’t simply add another level and accept another potential roundtrip. I also omit discussion of fault-tolerance, but this doesn’t present any additional fundamental issues for the described functionality.
In actual application, its architecture is used in conjunction with a large number of interchangeable non-data-holding compute nodes which communicate only with the datastore and end users rather than each other, running identical instances of software on App Engine. This layout runs all websites and services backed by Google App Engine as distributed, scalable software, assuming they don’t do anything to break scalability. There is no particular reliance on “special properties” of the data being stored, merely limited types of searching of the data which are possible. Even this is less limited than you might imagine; full-text search of large texts has been implemented fairly recently. A wide range of websites, services, and applications are built on top of it.
The implication of this is that there could well be limitations on what you can build scalably, but they are not all that restrictive. They definitely don’t rule out anything for which you can split the data into independently processed chunks. Looking at GAE some more, because it’s a good example of a generalised scalable distributed platform: the software run on the nodes is written in standard Turing-complete languages (Python, Java, and Go), and your datastore access includes read and write by key and by equality queries on specific fields, as well as cursors. A scalable task queue and cron system mean you aren’t dependent on outside requests to drive anything. It’s fairly simple to build any such chunk processing on top of it.
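To make the “build any such chunk processing on top of it” claim concrete, here is a deliberately platform-agnostic sketch; Datastore and TaskQueue below are hypothetical in-memory stand-ins for any scalable key/value store plus task queue with the properties described, not the actual GAE APIs.

```
# Hypothetical in-memory stand-ins for "any scalable key/value datastore plus
# task queue" (not the real GAE APIs, just the shape of the interaction).
class Datastore:
    def __init__(self):
        self._kv = {}
    def get(self, key):
        return self._kv[key]
    def put(self, key, value):
        self._kv[key] = value

class TaskQueue:
    def __init__(self):
        self._tasks = []
    def enqueue(self, task):
        self._tasks.append(task)
    def drain(self):
        while self._tasks:
            yield self._tasks.pop(0)

datastore, tasks = Datastore(), TaskQueue()

def submit_job(chunk_keys):
    # Fan out: one independent task per chunk; no task ever talks to another.
    for key in chunk_keys:
        tasks.enqueue(key)

def worker(key):
    chunk = datastore.get(key)                   # read by key
    datastore.put("result:" + key, sum(chunk))   # arbitrary per-chunk work, write by key

# Toy run: store four chunks, fan out, process them independently.
for i in range(4):
    datastore.put(f"chunk:{i}", list(range(i * 10, (i + 1) * 10)))
submit_job([f"chunk:{i}" for i in range(4)])
for key in tasks.drain():
    worker(key)
print([datastore.get(f"result:chunk:{i}") for i in range(4)])
```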
So as long as an AI can implement its work in such chunks, it certainly can scale to huge sizes and be a scalable system.
And as I demonstrated, O(n log n) is big enough for a Singularity.
And now on whether scalable systems can actually grow big in general...
Speed of light as an issue is not a problem for building huge systems in general, so long as the number of roundtrips rises as O(n log n) or less: for any system capable of at least tolerating roundtrips to the other side of the planet (a few hundred milliseconds), it doesn’t become more of an issue as the system gets bigger, until you start running out of space on the planet’s surface to run fibre between locations or build servers.
The GAE datastore is already tolerating latencies sufficient to cover distances between cities to permit data duplication over wide areas, for fault tolerance. If it was to expand into all the space between those cities, it would not have the time for each roundtrip increase until after it had filled all the space between them with more servers.
Google and Amazon are not at all forced to build data centres in different parts of the Earth to reduce latency; this is a misunderstanding. There is no technical performance degradation caused by the size of their systems that forces them to need the improved latency to end users, or the region-scale fault tolerance, that spread-out datacentres permit; they can simply afford these things more easily. You could argue there are social/political/legal reasons they need them more (higher expectations of their systems and similar), but these aren’t relevant here. This spreading out is actually largely detrimental to their systems, since spreading out this way increases latency between them, but they can tolerate this.
Heat dissipation, power generation, and network cabling needs all also scale as O(n log n), since computation and communication do, and those are the processes which create those needs. Looking at my previous example, the amount of heat output, power needed, and network cabling required per amount of data processed will increase by maybe an order of magnitude while scaling such a system upwards by tens of orders of magnitude: 5x for 40 orders of magnitude in the example I gave. This assumes your base amount of latency is still enough to cover the distance between the most distant nodes (for an Earth-bound system, one side of the planet to the other), which is entirely reasonable latency-wise for most systems; a total of 1.5 seconds for a planet-sized system.
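A rough sketch of where the “5x for 40 orders of magnitude” and the 1.5-second figures come from, reusing the fanout assumed in the sharding example above and an assumed ~300 ms worst-case planetary round trip:

```
# Rough arithmetic behind "5x for 40 orders of magnitude": per-request overhead
# (lookups, and with them heat, power, and cabling per unit of useful work) grows
# with the number of dictionary levels, while capacity grows as FANOUT ** levels.
FANOUT = 10**10        # shards per dictionary level, as in the earlier example
ROUND_TRIP_S = 0.3     # assumed worst-case planet-scale round trip

for levels in (1, 5):
    print(f"capacity ~1e{10 * levels} shards: {levels} lookup(s) per request, "
          f"~{levels * ROUND_TRIP_S:.1f} s worst-case latency")
# capacity ~1e10 shards: 1 lookup(s) per request, ~0.3 s worst-case latency
# capacity ~1e50 shards: 5 lookup(s) per request, ~1.5 s worst-case latency
# i.e. 40 orders of magnitude more capacity for ~5x the per-request overhead.
```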
This means that no, these do not become an increasing problem as you make a scalable system expand, any more so than provision of the nodes themselves. You are right in that that heat dissipation, power generation, and network cabling mean that you might start to hit problems before literally “running out of planet”, using up all the matter of the planet; that example was intended to demonstrate the scalability of the architecture. You also might run out of specific elements or surface area.
These practical hardware issues don’t really create a problem for a Singularity, though. Clusters exist now with 560k processors, so systems at least this big can be feasibly constructed at reasonable cost. So long as the software can scale without substantial overhead, this is enough unless you think an AI would need even more processors, and that the software could is the point that my planet-scale example was trying to show. You’re already “post Singularity” by the time you seriously become unable to dissipate heat or run cables between any more nodes.
HFT systems desire extremely low latency; this is the sole cause of their wish to be close to the exchange and to have various internal scalability limitations in order to improve speed of processing. These issues don’t generalise to typical systems, and typical bigger systems don’t get worse faster than O(n log n).
It is conceivable that speed of light limitations might force a massive, distributed AI to have high, maybe over a second latency in actions relying on knowledge from all over the planet, if prefetching, caching, and similar measures all fail. But this doesn’t seem like nearly enough to render one at all ineffective.
There really aren’t any rules of distributed systems which say that it can’t work, or even that it is likely not to.
Asynchronous computers could easily grow to a planetary scale. Parallel computing rarely gets linear scalability—but it doesn’t necessarily flatten off quickly at small sizes, either.
Yes.
Even on serial systems, most AI problems are at least NP-hard, which are strongly conjectured to scale not just superlinearly, but also superpolynomially (exponentially, as far as we know) in terms of required computational resources vs problem instance size.
In many applications it can be the case that typical instances of these problems have special, domain-specific structure that can be exploited to construct domain-specific algorithms and heuristics that are more efficient than the general-purpose ones; in some cases we can even get polynomial time complexity, but this requires lots of domain-aware engineering, and even sheer trial-and-error experimentation.
The idea that an efficient domain-agnostic silver-bullet algorithm could arise pretty much out of nowhere, from some kind of “recursive self-improvement” process with little or no interaction with the environment, is not based on anything we know from either theoretical or empirical computer science. In fact, it is well known that meta-optimization is typically orders of magnitude more difficult than domain-level optimization.
If an AGI is ever built, it will be a huge collection of fairly domain-specific algorithms and heuristics, much like the human brain is a huge collection of fairly domain-specific modules. Such a thing will not arise in a quick “FOOM”, it will not improve quickly, and it will be limited in how much it will ever be able to improve: once you find the best algorithm for a certain problem you can’t find a better one, and certain problems are most likely going to stay hard even with the best algorithms.
The “intelligence explosion” idea seems to be based on a naive understanding of computational complexity (e.g. Good 1965) that largely predates the discovery of the main results of complexity theory, like the Cook-Levin theorem (1971) and Karp’s 21 NP-Complete problems (1972).
I agree with everything you’ve said, but, to be fair, we’re talking about different things. My claim was not about the complexity of problems, but the scaling of hardware—which, as far as I know, scales sublinearly. This means that doubling the size of your computing cluster will allow you to solve the same exact problem less than twice as fast; and that eventually you’ll hit the point of diminishing returns where adding more machines simply isn’t worth it.
You’re saying, on the other hand, that doubling your processing power will not necessarily allow you to solve problems that are twice as interesting; in most cases, it will only allow you to add one more city to the traveling salesman’s itinerary (metaphorically speaking).
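To put toy numbers on both halves of that, here is a minimal sketch, assuming a simple Amdahl-style model for the hardware side (the 10% serial fraction is an arbitrary illustrative figure, not a measurement) and a generic 2^n brute-force search standing in for the problem side.

```
# Two toy models for the two kinds of diminishing returns being contrasted here.
import math

# 1) Hardware side (an Amdahl's-law-style model): if a fraction s of the work is
#    inherently serial, doubling the machines gives less than double the speed,
#    with a hard ceiling of 1/s no matter how many machines you add.
def amdahl_speedup(machines, serial_fraction=0.10):   # 10% is an arbitrary example figure
    return 1.0 / (serial_fraction + (1 - serial_fraction) / machines)

for machines in (1, 2, 4, 8, 1024):
    print(f"{machines:5d} machines -> {amdahl_speedup(machines):5.2f}x speedup")
# ...which creeps toward 10x and never reaches it.

# 2) Problem side: if an instance of size n costs ~2**n steps to solve by brute
#    force (a generic stand-in for an NP-hard search), doubling total compute
#    only buys about one extra "city".
def largest_solvable_instance(total_compute):
    return int(math.log2(total_compute))

for compute in (2**30, 2**31, 2**60):
    print(f"compute {compute:.1e} steps -> instance size ~{largest_solvable_instance(compute)}")
# Doubling 2**30 -> 2**31 adds one unit of size; a billion times more compute adds ~30.
```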
There is still room for weak super-intelligence, where the AI has human intelligence, only faster. (Example: an upload with sufficient computing power — as far as I know, brains work in a quite massively parallel fashion, and therefore so could simulations of them).
Seriously, if I could upload myself into a botnet that would let each instance of me think 10 times faster than my meat-ware, I would probably take over the world in about 1 to 10 years. A versatile team of competent people? Less than 6 months. (Obvious path to do this: work for money, build and buy companies, then gather financial, lobbying, or military power. Better path to do this: think about it for 1 subjective year before proceeding.)
My point is, the AI doesn’t need to be vastly superhuman to take over the world very quickly. Even without the FOOM, the AGI can still be incredibly dangerous. Imagine something like the uploads above, only it can work 24/7 at full capacity (no sleep, no leisure time, no akrasia).
Maybe. Today, even with our best supercomputers we can’t simulate a rat brain in real time.
You would be able to work as 10 people, maybe a little more, but probably less than 30. I don’t know how efficient you are, but I doubt that would be enough to take over the world. And why wouldn’t other people have access to the same technology?
Even if you managed to become world dictator, you would only stay in power as long as you had broad political support. Screw up something and you’ll end up hanging from your power cord.
What is it going to do? Secretly repurpose the iPhone factories in China to make Terminators?
I said botnet. That means dozens, thousands, or millions of me simultaneously working at 10 times human speed¹, and since they are instances of me, they presumably have the same goals. How would you stop that from achieving world domination, short of uploading yourself?
[1] Assuming that many personal computers are powerful enough, and can be corrupted. A slower course of action would be to buy a data-centre first, work, then buy more data-centres, and duplicate myself exponentially from that.
That doesn’t mean that they would necessarily cooperate, especially as they diverge. They would be more like identical twins.
Releasing a security patch? Seizing all the funds you obtained by your illegal activities? Banning use of any hardware that could host you until a way to avoid such things is found?
Assuming that using these data centers to run copies of you is the most economically productive use of them, rather than, say, running copies of other people, or cow-clicker games.
Wait a minute: would you defect? Sure, there would be some divergence, but do you really think it would result in a significant divergence of goals, even if you had a plan and were an adult by the time you fork? Okay, it can happen, and is probably worth taking specific precautions. I don’t think this is a show stopper however, and I’m not sure it would render me any less dangerous.
That may not be enough:
I would probably man-in-the-middle automatic updates
Many people won’t erase their hard drive or otherwise patch their machine manually
I may convince some people to let me run (I could work for them for instance).
If I’m stealthy enough, it may take some time before I’m discovered at all (it happened with actual computer viruses).
If software continues the way it is now (200 million lines of code for systems that could fit in 20 thousand), security bugs won’t all be patched in advance. The reliability of our computers needs to go waay up before botnets become impossible.
Good luck with that one. Obviously, I would have many, many little bank accounts, managed separately and in parallel, under many different identities. You would have to spot my illegal activities one by one to seize the funds. Plus, I may do legal activities as well.
That one is excellent. We should watch out for computing overhang, however, and try and estimate how much computing power an upload would need before the software is developed.
A final note: if I really had the possibility to upload myself, one of my first moves would be to propose to SIAI and CFAR that they upload with me (now that we can duplicate Eliezer…). I trust them more than I trust me for a Friendly Takeover. But if a Big Bad or a Well Intentioned Extremist has access to that first…
Even if their goals stay substantially the same, it wouldn’t mean that they would naturally cooperate, especially when their main goal is world domination. Hell, it’s already non-trivial for a single person to coordinate with future selves, resulting in all kinds of ego-dystonic behaviors: impulsiveness, akrasia, etc. Coordinating with thousands of copies of yourself would be only marginally easier than coordinating with thousands of strangers.
We are not talking about some ideal “Prisoner’s dilemma with mind-clone” scenario. After the mind states of your copies diverge a little bit, and that would happen very quickly as you spread your copies to different machines, they become effectively different people: you wouldn’t be able to predict them and they wouldn’t be able to predict you.
Hacking all the routers? Good luck with that. And BTW routers can also be updated. Manually.
Because they are lazy and they would prefer to live under world dictatorship.
Then you are their employee, not their dominator.
But if you are to dominate the world, you would have to eventually reveal yourself. What do you think would happen next?
Botnets are certainly possible and they are indeed used for nefarious purposes, but world domination? Nope.
As Bugmaster said, you would be able to perform only small purchases, not to buy a satellite, or an army.
Moreover, obtaining and managing lots of fake or stolen identities, creating bank accounts without physically showing up at the bank or using stolen bank accounts, is not something that tend to go unnoticed. The more you have, the more likely that you get caught, exponentially so.
Under multiple fake identities operated from a botnet of hacked computers? Hardly so.
Software tends to march right behind hardware, exploiting it close to its maximum potential. Computing overhang is unlikely.
Anyway, I wasn’t proposing any Luddite advance ban. If some brain upload, or AI, or whatever tries to take over the world by hacking the Internet and other countermeasures fail, governments could always ban use of the hardware that the thing needs to run. If that also fails, the next step would be physical destruction.
But seriously, we are discussing hacking as in the plot of some bad sci-fi action flick. Computer security doesn’t work like that in the real world.
You mean the guy who would choose dust specks over torture and who claims on his OKCupid profile that he’s a sadist? Yeah, I’d totally trust him in charge of the world. Now, I’ve other matters to attend to… that EMP bomb doesn’t build itself… :D
You really think you would diverge that quickly?
I’m … not sure how those are criticisms.
Man in the middle: I just meant intercepting automatic updates at the level of the computer I’m in. Trojan todo list n°7: once installed and running, I will intercept all communications to and from this computer. I wouldn’t want Norton updating behind my back. Now, try and hack the routers in the backbone, that’s something I didn’t think about…
Employee vs dominator: I obviously intend to double cross my employers, eventually.
Revealing myself: that one needs to be carefully thought through. Hopefully, by the time I reveal myself, I will have sufficient blackmail power. Having a sufficient number of physical robots can also help.
Zillions of fake IDs, yet stay stealthy: well, I do expect a fair number of my identities to be exposed. This should pose no problem to the others, however, provided they do not visibly communicate with each other (at first).
Legal activities: my meat instance could buy a few computers, rent remote servers, etc. I doubt I would be incapable of running at least one successful business from there. And from there, buy even more computing power. This could be done in parallel with the illegal activities.
Computing (no) overhang: this one is the single reason why I do agree that without a FOOM of some kind, actual world domination is unlikely: there will be multiple competing uploads, and this should end with a Hansonian scenario. Given that such a world is closer to Hell than Heaven (to me at least), that still counts as an Existential Blunder. On the bright side, we may see this coming. That said, I still do believe full blown intelligence explosion is likely.
Note that overall, your objections are actually valuable advice. And that gives me some insight about what my very first move should be: gathering such objections, and trying to find counters or workarounds. And now that you’ve made it quite clear that any path to world domination is long, complicated, and therefore nearly certain to fail, I should run multiple schemes in parallel. Surely one of them will actually work?
I believe that this would severely limit your financial throughput. You would be able to buy lots of little things, whose total cost is quite significant—for example, you could buy yourself a million cheap PCs, each costing $1000. But you would not be able to buy a single expensive thing (at least, not without exposing yourself to instant retribution), such as a satellite costing $1e9.
Currently, there are ways to create companies anonymously. This is preventing (or at least slowing down to a crawl) retribution right now. If all this company apparently does is buying a few satellites, it won’t be at great risk.
Good work, I believe we’ve got the next James Bond movie in the bag :-)
Do you mean, competent people who are thinking 10 times faster than biological humans, or what ? This seems a bit of a stretch. There currently exist tons of frighteningly competent people in all kinds of positions of power in the world, and yet, they do not control it (unless you believe in conspiracy theories).
If it was this easy, some biological human (or a team of such humans) would’ve done it already, in 10 to 50 years or however long it takes. In fact, a few humans have managed to take over individual countries in about as much time. However, as things stand now, there’s simply no clear path to world domination. Political and military power gets much more difficult to gather the more of it you have. Even superpowers such as USA or China cannot dictate terms to the rest of the world.
Furthermore, my point was that uploading yourself to 10 machines will not allow you to think 10 times as fast. With every machine you add, your speed gains would become progressively smaller. You would still think much faster than an ordinary human, of course.
I mean exactly that. I’d be very surprised if ultimately, neuromorphic AIs would be impossible to run significantly faster than meat-ware. Because our brain is massively parallel, and because current microprocessors have massively faster serial speed than neurons. Now our brains aren’t fully parallel, so I assumed an arbitrary speed-up limit. I said 10 times, but it would be probably still be incredibly dangerous at 2 or 3, or even lower.
Now do not forget the key word here: botnet. The team is supposed to duplicate itself many times over before trying to take over the world.
I don’t think so, because uploads have significant advantages over meat-ware.
Low cost of living. In a world where every middle class home can afford sufficient computing power for an upload (required to turn me into a botnet). Now try to beat my prices.
Being many copies of the same few original brains. It means TDT works better, and defection is less likely. This should solve the coordination problem raised above.
Because once the self-duplicating team has independently taken economic control of most of the world, it is easy for it to accept the domination of one instance (I would certainly pre-commit to that). Now for the rest of humanity to accept such dominance, the uploads only have to use the resources they acquired for the individual perceived benefit of the meat bags.
Yep, that would be a full blown global conspiracy. While it’s probably forever out of the reach of meat bags, I think a small team of self-replicating uploads can pull it off quite easily.
Hansonian tactics, which can further the productivity of the team, and therefore its market power. (One has to be very motivated, or possibly crazy.)
Temporary mass duplication followed by the “termination” of every instance but one. The surviving instance can have much subjective free time, while the proportion of leisure computing stays very small.
Save and reload of snapshots which are in a particularly good mood (and therefore very productive). Excellent for beating akrasia.
Training of one instance per discipline, then mass duplication.
Data-centres. The upload team can collaborate with or buy processor manufacturers, and build data-centres for more and more uploads to work on whatever is needed. This could further reduce the cost of living.
Now, I did make an unreasonable assumption: that only the original team would have those advantages. Most probably, there will be several such teams, possibly with different goals. The most likely result (without FOOM) is then a Hansonian outcome. That’s no world domination, but I think it is just as dangerous (I would hate this world).
Finally, there is also the possibility of a de-novo AGI which would be just as competent as the best humans at most endeavours, though no faster. We already have an existence proof, so I think this is believable. I think such an AI would be even more dangerous than the uploaded team above.
So would I. However, given our current level of technological development, I’d be very surprised if we had any kind of a neuromorphic AI at all in the near future (say, in the next 50 years). Still, I do agree with you in principle.
There are tons of biological people alive today who are able to come up with solutions to problems 2x to 3x faster than you and me. They do not rule the world. To be fair, I doubt that there are many people—if any—who think 10x faster.
I doubt that you will be able to achieve that; that was my whole point. In fact, I have trouble envisioning what “economic control of most of the world” even means. What does it mean to you ?
In addition to the above, your botnet would face several significant threats, both external and internal:
Meatbags would strive to shut it down; not because they suspect it of being an evil conspiracy, but because they’d get tired of it sucking away their resources. Modern malware botnets suffer this fate often, though there’s always someone willing to rebuild them.
If your botnet becomes a serious threat (much worse than current real-world botnets), hardware manufacturers will implement security measures, such as SecureBoot, to prevent it from spreading. Currently, such measures are driven by the entertainment industry.
The super-fast instances of you would have to communicate with each other, and they’d only be able to do so through very slow (relatively speaking) network links. Google and Amazon are solving this problem by building more and more local datacenters. Real botnets aren’t solving the problem at all because their instances don’t need to talk to each other all that much.
How would you feel, right now, if your twin pointed a gun at your head with the intent to kill you “for the greater good” ? This is how your instances will feel when you attempt to shut them down to prevent akrasia.
Why are you taking over the world in the first place ? Chances are that whatever your ultimate goal is, it could be accomplished even sooner by taking over the botnet. Every instance of you will eventually realize this, with predictable results.
These are just some problems off the top of my head; the list is far from exhaustive.