If the calendar recorded every event of comparable significance to “formation of the galaxy” and “formation of the solar system,” there would be hundreds of sextillions of them on the calendar before the emergence of life on Earth.
No. The calendar represents a statistical clustering of pattern changes that maps them into a small set of the most significant. If you actually think there are “hundreds of sextillions of events” that are remotely as significant as the formation of galaxies, then we have a very wide inferential distance or you are adopting a contrarian stance. The appearance of galaxies is one event; having a sextillion additional galaxies doesn’t add an iota of complexity to the universe.
Complexity is difficult to define or measure as it relates to actual statistical structural representation and deep compression that requires intelligence. But any group of sophisticated enough intelligences can roughly agree on what the patterns are and will make similar calendars—minus some outliers, contrarians, etc.
The formation of the Milky Way is listed as a single event, as is the formation of the Solar system. There are hundreds of sextillions of stars, with more being created all the time, and plenty more that have died in the past.
The calendar contains the births of Buddha, Jesus and Mohammad. Even if we were supposing that these were events of comparable significance to the evolution of life itself, do you honestly think each one adds appreciably to the complexity of the universe, that they could not simply be compressed into “Birth of religious figures,” whereas the formation of every star system in the universe is compressible into a single complexifying event?
If you think that events like the cave paintings are of comparable significance to the formation of galaxies in general, we’re dealing with a vast gulf of inferential distance.
The formation of the Milky Way is listed as a single event, as is the formation of the Solar system. There are hundreds of sextillions of stars, with more being created all the time, and plenty more that have died in the past.
Again, the electron is one pattern, and its appearance is a single complexity-increasing event, not N events where N is the number of electrons formed. The same goes for stars, galaxies, or anything else that we have a word to describe.
And once again the increase in complexity in the second half of the U shape is a localizing effect. It is happening here on earth and is probably happening in countless other hotspots throughout the universe.
Even if we were supposing that these were events of comparable significance to the evolution of life itself, do you honestly think each one adds appreciably to the complexity of the universe, that they could not simply be compressed into “Birth of religious figures,”
It is expected that the calendar will contain events of widely differing importance, and the second-half acceleration phase of the U curve is a localization phenomenon, so the specific events will have specifically local importance (however, they are probably examples of general patterns that occur throughout the universe on other developing planets, so in that sense they are likely universal; we just can’t observe them).
The idea of a calendar of size N is to do a clustering analysis of space-time and categorize it into N patterns. Our brains do this naturally, and far better than any current algorithm (although future AIs will improve on this).
There is no acceptable way to compute the ‘perfect’ or ‘correct’ clustering or calendar. Our understanding of structure representation and complex pattern inference just isn’t that mature yet. Nonetheless this is largely irrelevant, because the deviations between the various calendars of historians are infinitesimal with respect to the overall U pattern.
The formation of star systems is a single pattern-emergence event; it doesn’t matter in the slightest how many times it occurs. That’s the entire point of compression.
The calendar contains the births of Buddha, Jesus and Mohammad. Even if we were supposing that these were events of comparable significance to the evolution of life itself,
I think most people would put origin of life in the top ten and origin of current religions in the top hundred or thousand, but this type of nit-picking is largely beside the point. However, we do need at least enough data points to see a trend, of course.
do you honestly think each one adds appreciably to the complexity of the universe, that they could not simply be compressed into “Birth of religious figures
Once again, we are not talking about the complexity of the universe. Only the first part of the U pattern is universal; the second half is localized into countless sub-pockets of space-time. (It occurs all over the place, wherever life arises and evolves intelligence, civilization, etc.)
As for the specific events Buddha, Jesus, Mohammad, of course they could be compressed into “origin of major religions”, if we wanted to shrink the calendar. The more relevant question would be: given the current calendar size, are those particular events appropriately clustered? As a side point, it’s not the organic births of the leaders that are important in the slightest. These events are just poorly named in that sense; they could be given more generic tags such as the “origin of major world dominating religions”, but we need to note the local/specific vs general/universal limitation of our local observational status.
If you think that events like the cave paintings are of comparable significance to the formation of galaxies in general,
The appearance of cave paintings in general is an important historical event. As to what caliber of importance, it’s hard to say. I’d guess somewhere between 2nd and 3rd order (a good fit for calendars listing between 100 and 1000 events). I’d say galaxies are 1st order or close to it, so they are orders of magnitude more important.
But note the spatial scale has no direct bearing on importance.
There is no acceptable way to compute the ‘perfect’ or ‘correct’ clustering or calendar. Our understanding of structure representation and complex pattern inference just isn’t that mature yet. Nonetheless this is largely irrelevant, because the deviations between the various calendars of historians are infinitesimal with respect to the overall U pattern.
The deviations between various calendars of human historians are infinitesimal on the grand scale because the deviations in the history that we have access to and are psychologically inclined to regard as significant are infinitesimal out of the possible history space and mind space.
Can you provide even an approximate definition of the “complexity” that you think has been accumulating at an exponential rate since the beginning of the universe? If not, there’s no point arguing about it at all.
If you take a small slice of laminar cortex and hook it up to an optic feed and show it image sequences, it develops into Gabor-like filters which recognize/encode 2D edges. The Gabor filters have been mathematically studied and are optimal entropy-maximizing transforms for real world images. The edges are real because of the underlying statistical structure of the universe, and they don’t form if you show white noise or nothingness.
Now take that same type of operation and stack many of them on top of each other and add layers of recursion and you get something that starts clustering the universe into patterns—words.
These patterns which we regard as “psychologically inclined to regard as significant” are actual universal structural patterns in the universe, so even if the particular named sequences are arbitrary and the ‘importance’ is debatable, the patterns themselves are not arbitrary. See the cluster structure of thingspace and related posts.
Can you provide even an approximate definition of the “complexity”
See above. Complexity is approximated by words and concepts in the minds of intelligences. This relates back to optimal practical compression which is the core of intelligence.
Kolmogorov complexity is a start, but it’s not computationally tractable so it’s not a good definition. The proper definition of complexity requires an algorithmic definition of general optimal structural compression, which is the core sub-problem of intelligence. So in the future when we completely solve AI, we will have more concrete definitions of complexity. Until then, human judgement is a good approximation. And a first order approximation is “complexity is that which we use words to describe”.
If you take a small slice of laminar cortex and hook it up to an optic feed and show it image sequences, it develops into Gabor-like filters which recognize/encode 2D edges. The Gabor filters have been mathematically studied and are optimal entropy-maximizing transforms for real world images. The edges are real because of the underlying statistical structure of the universe, and they don’t form if you show white noise or nothingness.
Humans possess powerful pattern recognizing systems. We’re adapted to cope with the material universe around us, it’s no wonder if we recognize patterns in it, but not in white noise or nothingness.
“{Interestingness to humans} has an exponential relationship with time over the lifetime of the universe” packs a lot less of a sense of physical inevitability. The universe is not optimized for the development of {interestingness to humans}. We’ve certainly made the world a lot more interesting for ourselves in our recent history, but that doesn’t suggest it’s part of a universal trend. The calendar you linked to, for instance, lists the K-T extinction event, the most famous although not the greatest of five global mass extinction events. Each of those resulted in a large, albeit temporary, reduction in global ecosystem diversity, which strikes me as a pretty big hit to {interestingness to humans}. And while technology has been increasing exponentially throughout the global stage recently, there have been plenty of empire collapses and losses of culture which probably mark significant losses of {interestingness to humans} as well.
So, what your post really relies upon is the proposition that {interestingness to humans} can be made to experience an endless exponential increase over time, without leaving Earth. I am convinced that the reasonable default assumption given the available data is that it cannot.
Humans possess powerful pattern recognizing systems
Yes, and I was trying to show how this relates to intelligence, and how intelligence requires compression, and thus relates to complexity.
We’re adapted to cope with the material universe around us, it’s no wonder if we recognize patterns in it, but not in white noise or nothingness.
We recognize patterns because the universe is actually made of patterns. The recognition is no more arbitrary than thermodynamics or quantum physics.
What you appear to be doing is substituting in a black box function of your own mind as a fundamental character of the universe. You see qualities that seem interesting and complex, and you label them “complexity”
No. One of the principal sub-functions of minds/intelligences in general is general compression. The patterns are part of the fundamental character of the universe, and it is that reality which shapes minds, not the other way around.
Complexity is not {interestingness to humans}. Although of course {interestingness to humans} is related to complexity, because our minds learn/model/represent patterns, we find patterns ‘interesting’ because they allow us to model that which exists, and complexity is a pattern-measure.
I suspect we could agree more on complexity if we could algorithmically define it, even though that shouldn’t be necessary (but I will resort to that shortly as a secondary measure). We could probably agree on what ‘humans’ are without a mathematical definition, and we could probably agree on how the number of humans has been changing over time.
Imagine if we could also loosely agree on what ‘things’ or unique patterns are in general, and then we could form a taxonomy over all patterns, where some patterns have is-a relationships to other patterns and are in turn built out of sub-patterns, forming a loosely hierarchical network. We could then roughly define pattern complexity as the hierarchical network rank order of the pattern in the pattern network. A dog is a mammal which is an animal, so complexity increases along that path, for example, and a dog is more complex than any of its subcomponents. We could then define ‘events’ as temporal changes in the set of patterns (within some pocket of the universe). We could then rank events in terms of complexity changes, based on the change in complexity of the whole composite-pattern (within space-time pockets).
Then we make a graph of a set of the top N events.
We then see the U shape trend in complexity change over time.
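The procedure described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the three-entry taxonomy, the choice to score a composite pattern as the sum of its members' ranks, and the names `IS_A`, `rank`, `composite_complexity`, and `top_events` are all hypothetical and not part of the original proposal.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy taxonomy: each pattern maps to the pattern it "is a" kind of.
IS_A: Dict[str, Optional[str]] = {
    "animal": None,
    "mammal": "animal",
    "dog": "mammal",
}


def rank(pattern: str) -> int:
    """Depth of a pattern in the is-a hierarchy (root patterns have rank 1)."""
    depth = 1
    parent = IS_A[pattern]
    while parent is not None:
        depth += 1
        parent = IS_A[parent]
    return depth


def composite_complexity(patterns: Set[str]) -> int:
    """Assumed scoring rule: sum of the ranks of the distinct patterns present."""
    return sum(rank(p) for p in patterns)


def top_events(history: List[Tuple[str, Set[str]]], n: int) -> List[Tuple[str, int]]:
    """Rank events by the change in composite complexity they cause.

    `history` lists (label, set of patterns present after the event) in time order.
    """
    scored = []
    previous: Set[str] = set()
    for label, present in history:
        delta = composite_complexity(present) - composite_complexity(previous)
        scored.append((label, delta))
        previous = present
    return sorted(scored, key=lambda item: item[1], reverse=True)[:n]


history = [
    ("first animals", {"animal"}),
    ("first mammals", {"animal", "mammal"}),
    ("first dogs", {"animal", "mammal", "dog"}),
    ("more dogs", {"animal", "mammal", "dog"}),  # no new pattern appears
]
print(top_events(history, 2))
```

Because patterns are kept in a set, a repeated occurrence of an existing pattern ("more dogs") scores zero, which mirrors the claim that the number of instances does not matter, only the first appearance of the pattern.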
If you want a more mathematical definition, take Kolmogorov complexity and modify it to be computationally tractable. If K(X) is the K-complexity of string X, defined as the length of the minimal program which outputs X (maximal compression), then we define CK(X, M, T) as the length of the minimal program which best approximates X subject to memory-space M and time T constraints. Moving from intractable lossless compression to lossy practical compression makes this modified definition of complexity computable in theory (but its exact definition still requires optimal lossy compression algorithms). We are interested in CK complexity of the order computable by humans and AIs in the near future.
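As a rough illustration of how a resource bound changes the measured value, here is a minimal Python sketch that uses the standard `zlib` module as a stand-in. It is not CK(X, M, T) itself: DEFLATE is lossless and far from optimal, so the result is only an upper-bound proxy. The function name `ck_proxy` is hypothetical, and treating the compression level as the time budget T and the window size as the memory budget M is an assumption made for illustration.

```python
import random
import zlib


def ck_proxy(data: bytes, level: int = 9, wbits: int = 15) -> int:
    """Size in bytes of a DEFLATE encoding of `data` under a resource budget.

    `level` (0-9) stands in for the time budget T, and `wbits` (9-15, giving a
    2**wbits-byte history window) stands in for the memory budget M.
    """
    compressor = zlib.compressobj(level, zlib.DEFLATED, wbits)
    return len(compressor.compress(data) + compressor.flush())


# A deterministic pseudo-random 4096-byte block repeated 8 times: the only
# structure is a repeat at a distance of 4096 bytes.
rng = random.Random(0)
block = bytes(rng.getrandbits(8) for _ in range(4096))
data = block * 8

small_memory = ck_proxy(data, wbits=9)   # 512-byte window cannot see the repeat
large_memory = ck_proxy(data, wbits=15)  # 32 KiB window can see the repeat
print(len(data), small_memory, large_memory)
```

The same string looks nearly incompressible under the small memory budget and much simpler under the large one, which is the sense in which a bounded measure depends on M and T as well as on X.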
Complexity != {interestingness to humans}
“{Interestingness to humans} has an exponential relationship with time over the lifetime of the universe” packs a lot less of a sense of physical inevitability.
Complexity does appear to follow an inevitable upward accelerating trend over time in many localized sub-pockets of the universe, mirroring the big bang in reverse, and again the trend is not exponential; it’s a 1/x type shape.
The trend is nothing like a smooth line. It is noisy, and there have been some apparent complexity dips, as you mention, although the overall trend is undeniably accelerating and the best fit is the U shape leading towards a local vertical asymptote. As a side note, complexity/systems theorists would point out that most extinctions actually caused large increases in net complexity, and were some of the most important evolutionary stimuli. Counterintuitive, but true.
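To make the distinction between an exponential and a 1/x-type trend concrete, the two shapes can be written side by side. The symbols here (C_0, k, A, and the asymptote time t_s) are illustrative assumptions, not quantities fitted to any calendar.

```latex
\[
C_{\exp}(t) = C_0\, e^{k t}
\quad\text{(finite for every finite } t\text{)},
\qquad
C_{\mathrm{hyp}}(t) = \frac{A}{t_s - t}
\quad\text{(diverges as } t \to t_s^{-}\text{)}.
\]
```

An exponential never reaches a vertical asymptote, whereas the hyperbolic form does so at the finite time t_s, which is what a "local vertical asymptote" would require.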
Complexity is not {interestingness to humans}. Although of course {interestingness to humans} is related to complexity, because our minds learn/model/represent patterns, we find patterns ‘interesting’ because they allow us to model that which exists, and complexity is a pattern-measure.
I suspect we could agree more on complexity if we could algorithmically define it, even though that shouldn’t be necessary (but I will resort to that shortly as a secondary measure). We could probably agree on what ‘humans’ are without a mathematical definition, and we could probably agree on how the number of humans has been changing over time.
Things can be extraordinarily complex without being particularly interesting to humans. We don’t have a fully general absolute pattern recognizing system; that would be an evolutionary hindrance even if it were something that could practically be developed. There are simply too many possible patterns in too many possible contexts. It’s not advantageous for us to be interested in all of them.
I think we don’t agree on what this “complexity” is because it’s not a natural category. You’re insisting that it’s fundamental because it feels fundamental to you, but you can’t demonstrate that it’s fundamental, and I simply don’t buy that it is.
The trend is nothing like a smooth line. It is noisy, and there have been some apparent complexity dips, as you mention, although the overall trend is undeniably accelerating and the best fit is the U shape leading towards a local vertical asymptote. As a side note, complexity/systems theorists would point out that most extinctions actually caused large increases in net complexity, and were some of the most important evolutionary stimuli. Counterintuitive, but true.
Eventually. Ecosystem diversity eventually bounces back, and while a large number of genera and families die out, most orders retain representatives, so there’s still plenty of genetic diversity to spread out and reoccupy old niches, and potentially create new ones in the process. But there’s no fundamental principle that demands that massive extinction events must lead to increased ecosystem complexity even in the long term; for a long term decrease, you’d simply have to wipe out genetic diversity on a higher level. A UFAI event, for example, could easily lead to a massive drop in ecosystem complexity.
The number of possible patterns in an information cluster is superexponential with the size of the information cluster
Firstly, you are misquoting EY’s post: the possible number of patterns in a string grows exponentially with the number of bits, as expected. It is the number of ‘concepts’ which grows super-exponentially, where EY is defining concept very loosely as any program which classifies patterns. The super-exponential growth in concepts is combinatoric and just stems from naive specific classifiers which recognize combinations of specific patterns.
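The two growth rates can be written out explicitly, assuming a "concept" is read as any rule that labels each possible N-bit string as either in or out of the concept:

```latex
\[
\#\{\text{strings of length } N\} = 2^{N},
\qquad
\#\{\text{concepts over those strings}\} = 2^{2^{N}}.
\]
% Example: N = 3 gives 2^3 = 8 strings and 2^8 = 256 concepts.
```

Each concept is one subset of the 2^N strings, which is where the doubly exponential count comes from.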
Secondly, this doesn’t really relate to universal pattern recognition, which is concerned only with optimal data classifications according to a criterion such as entropy maximization.
As a simple example, consider the set of binary strings of length N. There are 2^N possible observable strings, and a super-exponential combinatoric set of naive classifiers. But consider observed data sequences of the form 10010 10010 10010 repeated ad infinitum. Any form of optimal extropy maximization will reduce this to something of the form repeat “10010” indefinitely.
In general, any given sequence of observations has a single unique compressed (extropy reduced) representation, which corresponds to its fundamental optimal ‘pattern’ representation.
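The repeated "10010" example can be checked with an off-the-shelf compressor. This is a small sketch, assuming `zlib` as a crude substitute for an optimal compressor; the helper name `compressed_size` and the comparison string of pseudo-random bits are illustrative choices.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib encoding of `data` at maximum effort."""
    return len(zlib.compress(data, 9))


# The periodic observation sequence: "10010" repeated 2000 times.
periodic = b"10010" * 2000

# A same-length sequence of pseudo-random '0'/'1' characters for comparison.
rng = random.Random(0)
random_bits = bytes(rng.choice(b"01") for _ in range(len(periodic)))

print(len(periodic), compressed_size(periodic), compressed_size(random_bits))
```

A general-purpose compressor discovers the repeat without being told about it, so the periodic sequence shrinks to a tiny fraction of its length while the random sequence of the same alphabet and length does not.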
Can you demonstrate that the patterns you’re recognizing are non-arbitrary?
Depends on what you mean. It’s rather trivial to construct simple universal extropy maximizers/optimizers—just survey the basic building blocks of unsupervised learning algorithms. The cortical circuit performs similar computations.
For example the 2D edge patterns that cortical tissue (and any good unsupervised learning algorithm) learns to represent when exposed to real world video are absolutely not arbitrary in the slightest. This should be obvious.
If you mean higher level thought abstractions by “the patterns you’re recognizing”, then the issue becomes more complex. Certainly the patterns we currently recognize at the highest level are not optimal extractions, if that’s what you mean. But nor are they arbitrary. If they were arbitrary our cortex would have no purpose, would confer no selection advantage, and would not exist.
We don’t have a fully general absolute pattern recognizing system;
We do have a fully general pattern recognition system. I’m not sure what you mean by “general absolute”.
that would be an evolutionary hindrance even if it were something that could practically be developed.
They are trivial to construct, and require far less genetic information to specify than specific pattern recognition systems.
Specific recognition systems have the tremendous advantage that they work instantly without any optimization time. A general recognition system has to be slowly trained on the patterns of data present in the observations—this requires time and lots of computation.
Simpler short-lived organisms rely more on specific recognition systems and circuitry for this reason, as they allow newborn creatures to start with initial ‘pre-programmed’ intelligence. This actually requires considerably more genetic complexity than general learning systems.
Mammals grew larger brains with increasing reliance on general learning/recognition systems because it provides a tremendous flexibility advantage at the cost of requiring larger brains, longer gestation, longer initial development immaturity, etc. In primates and humans especially this trend is maximized. Human infant brains have very little going on initially except powerful general meta-algorithms which will eventually generate specific algorithms in response to the observed environment.
I think we don’t agree on what this “complexity” is because it’s not a natural category
The concept of “natural category” is probably less well defined than “complexity” itself, so it probably won’t shed too much light on our discussion.
That being said, in that post he describes it as:
I’ve chosen the phrase “unnatural category” to describe a category whose boundary you draw in a way that sensitively depends on the exact values built into your utility function.
In that sense complexity is absolutely a natural category.
Look at Kolmogorov complexity. It is a fundamental property of information, and information is the fundamental property of modern physics. So that definition of complexity is as natural as you can get, and is right up there with entropy. Unfortunately that definition itself is not perfect and is too close to entropy, but computable variants of it exist. One, used in a computational biology paper I was browsing recently (measuring the tendency towards increased complexity in biological systems), defined complexity as compressed information minus entropy, which may be the best fit to the intuitive concept.
Intuitively I could explain it as follows.
The information complexity of an intelligent system is a measure of the fundamental statistical pattern structure it extracts from its environment. If the information it observes is already at maximum entropy (such as pure noise), then it is already maximally compressed, no further extraction is possible, and no learning is possible. At the other extreme, if the information observed is extremely uniform (low entropy), then it can be fully described/compressed by extremely simple low-complexity programs. A learning system extracts entropy from its environment and grows in complexity in proportion.
Depends on what you mean. It’s rather trivial to construct simple universal extropy maximizers/optimizers—just survey the basic building blocks of unsupervised learning algorithms. The cortical circuit performs similar computations.
For example the 2D edge patterns that cortical tissue (and any good unsupervised learning algorithm) learns to represent when exposed to real world video are absolutely not arbitrary in the slightest. This should be obvious.
It’s objective that our responses exist, and they occur in response to particular things. It’s not obvious that they occur in response to natural categories, rather than constructed categories like “sexy.”
We do have a fully general pattern recognition system. I’m not sure what you mean by “general absolute”.
“General absolute” was probably a poor choice of words, but I meant to express a system capable of recognizing all types of patterns in all contexts. There is an absolute, non arbitrary pattern here, do you recognize it?
Kolmogorov complexity is a fundamental character, but it’s not at all clear that we should want a Kolmogorov complexity optimizer acting on our universe, or that Kolmogorov complexity actually has much to do with the “complexity” you’re talking about. A message or system can be high in Kolmogorov complexity without being interesting to us, and it still seems to me that you’re conflating complexity with interestingness when they really don’t bear that sort of relationship.
“General absolute” was probably a poor choice of words, but I meant to express a system capable of recognizing all types of patterns in all contexts. There is an absolute, non arbitrary pattern here, do you recognize it?
I see your meaning, and no practical system is capable of recognizing all types of patterns in all contexts. A universal/general learning algorithm is simply one that can learn to recognize any pattern, given enough time/space/training. That doesn’t mean it will recognize any random pattern it hasn’t already learned.
I see hints of structure in your example but it doesn’t ring any bells.
Kolmogorov complexity is a fundamental character, but it’s not at all clear that we should want a Kolmogorov complexity optimizer acting on our universe
No, and that’s not my primary interest. Complexity seems to be the closest fit for something-important-which-has-been-changing over time on earth. If we had a good way to measure it, we could then make a quantitative model of that change and use that to predict the rate of change in the future, perhaps even ultimately reducing it to physical theory.
For example, one of the interesting recent physics papers (entropic gravity) proposes that gravity is not a fundamental force or even spacetime curvature, but an entropic statistical pseudo-force. The paper is interesting because as a side effect it appears to correctly derive the mysterious cosmological constant for acceleration. As an unrelated side note, I have an issue with it because it uses the holographic principle/Bekenstein bound for information density, which still appears to lead to lost-information paradoxes in my mind.
But anyway, if you look at a random patch of space-time, it is always slowly evolving to a higher-entropy state (2nd law), and this may be the main driver of most macroscopic tendencies (even gravity). It’s also quite apparent that a closely related measure, complexity, increases non-linearly in a fashion perhaps loosely like gravitational collapse. The non-linear dynamics are somewhat related: complexity tends to increase in proportion to the existing local complexity as a fraction of available entropy. In some regions this appears to go super-critical, like on earth, whereas in most places the growth is minuscule or non-existent.
It’s not apparent that complexity is increasing over time. In some respects, things seem to be getting more interesting over time, although I think that a lot of this is due to selective observation, but we don’t have any good reason to believe we’re dealing with a natural category here. If we were dealing with something like Kolmogorov complexity, at least we could know if we were dealing with a real phenomenon, but instead we’re dealing with some ill defined category for which we cannot establish a clear connection to any real physical quality.
For all that you claim that it’s obvious that some fundamental measure of complexity is increasing nonlinearly over time, not a lot of other people are making the same claim, having observed the same data, so it’s clearly not as obvious as all that.
No. The calendar represents a statistical clustering of pattern changes that maps them into a small set of the most significant. If you actually think there are “hundreds of sextillions of events” that are remotely as significant as the formation of galaxies, then we have a very wide inferential distance or you are adopting a contrarian stance. The appearance of galaxies is one event, having sextillion additional galaxies doesn’t add an iota of complexity to the universe.
Complexity is difficult to define or measure as it relates to actual statistical structural representation and deep compression that requires intelligence. But any group of sophisticated enough intelligences can roughly agree on what the patterns are and will make similar calendars—minus some outliers, contrarians, etc.
The formation of the Milky Way is listed as a single event, as is the formation of the Solar system. There are hundreds of sextillions of stars, with more being created all the time, and plenty more that have died in the past.
The calendar contains the births of Buddha, Jesus and Mohammad. Even if we were supposing that these were events of comparable significance to the evolution of life itself, do you honestly think each one adds appreciably to the complexity of the universe, that they could not simply be compressed into “Birth of religious figures,” whereas the formation of every star system in the universe is compressible into a single complexifying event?
If you think that events like the cave paintings are of comparable significance to the formation of galaxies in general, we’re dealing with a vast gulf of inferential distance.
Again the electron is one pattern, and it’s appearance is a single complexity increasing event, not N events where N is the number of electrons formed. The same for stars, galaxies, or anything else that we have a word to describe.
And once again the increase in complexity in the second half of the U shape is a localizing effect. It is happening here on earth and is probably happening in countless other hotspots throughout the universe.
It is expected that the calendar will contain events of widely differing importance, and the second half acceleration phase of the U curve is a localization phenomena, so the specific events will have specifically local importance (however they are probably examples of general patterns that occur throughout the universe on other developing planets, so in that sense they are likely universal—we just can’t observe them).
The idea of a calendar of size N is to do a clustering analysis of space-time and categorize it into N patterns. Our brains do this naturally, and far better than any current algorithm (although future AIs will improve on this).
There is no acceptable way to compute the ‘perfect’ or ‘correct’ clustering or calendar. Our understanding of structure representation and complex pattern inference just isn’t that mature yet. Nonetheless this is largely irrelevant, because the deviations between the various calendars of historians are infinitesimal with respect to the overall U pattern.
The formation of star systems is a single pattern-emergence event, it doesn’t matter in the slightest how many times it occurs. That’s the entire point of compression.
I think most people would put origin of life in the top ten and origin of current religions in the top hundred or thousand, but this type of nit-picking is largely beside the point. However, we do need at least enough data points to see a trend, of course.
Once again, we are not talking about the complexity of the universe. Only the 1st part of the U pattern is universal, the second half is localized into countless sub-pockets of space-time. (it occurs all over the place wherever life arises, evolves intelligence, civilization, etc etc)
As for the specific events Buddha, Jesus, Mohammad, of course they could be compressed into “origin of major religions”, if we wanted to shrink the calendar. The more relevant question would be: given the current calendar size, are those particular events appropriately clustered? As a side point, its not the organic births of the leaders that is important in the slightest. These events are just poorly named in that sense—they could be given more generic tags such as the “origin of major world dominating religions”, but we need to note the local/specific vs general/universal limitation of our local observational status.
The appearance of cave paintings in general is an important historical event. As to what caliber of importance, it’s hard to say. I’d guess somewhere of between 2nd to 3rd order (a good fit for calendars listing between 100 to 1000 events). I’d say galaxies are 1st order or closer, so they are orders of magnitude more important.
But note the spatial scale has no direct bearing on importance.
The deviations between various calendars of human historians are infinitesimal on the grand scale because the deviations in the history that we have access to and are psychologically inclined to regard as significant are infinitesimal out of the possible history space and mind space.
Can you provide even an approximate definition of the “complexity” that you think has been accumulating at an exponential rate since the beginning of the universe? If not, there’s no point arguing about it at all.
If you take a small slice of laminar cortex, hook it up to an optic feed, and show it image sequences, it develops Gabor-like filters which recognize/encode 2D edges. Gabor filters have been mathematically studied and are optimal entropy maximizing transforms for real world images. The edges are real because of the underlying statistical structure of the universe, and they don’t form if you show white noise or nothingness.
Now take that same type of operation and stack many of them on top of each other and add layers of recursion and you get something that starts clustering the universe into patterns—words.
These patterns which we are “psychologically inclined to regard as significant” are actual universal structural patterns in the universe, so even if the particular named sequences are arbitrary and the ‘importance’ is debatable, the patterns themselves are not arbitrary. See the cluster structure of thingspace and related posts.
See above. Complexity is approximated by words and concepts in the minds of intelligences. This relates back to optimal practical compression which is the core of intelligence.
Kolmogorov complexity is a start, but it’s not computationally tractable so it’s not a good definition. The proper definition of complexity requires an algorithmic definition of general optimal structural compression, which is the core sub-problem of intelligence. So in the future when we completely solve AI, we will have more concrete definitions of complexity. Until then, human judgement is a good approximation. And a first order approximation is “complexity is that which we use words to describe”.
Humans possess powerful pattern recognizing systems. We’re adapted to cope with the material universe around us, so it’s no wonder that we recognize patterns in it, but not in white noise or nothingness.
What you appear to be doing is substituting in a black box function of your own mind as a fundamental character of the universe. You see qualities that seem interesting and complex, and you label them “complexity” when they would be better characterized as {interestingness to humans} (or more precisely, {interestingness to jacob_cannell}, but there’s a lot of overlap there.)
“{Interestingness to humans} has an exponential relationship with time over the lifetime of the universe” packs a lot less of a sense of physical inevitability. The universe is not optimized for the development of {interestingness to humans}. We’ve certainly made the world a lot more interesting for ourselves in our recent history, but that doesn’t suggest it’s part of a universal trend. The calendar you linked to, for instance, lists the K-T extinction event, the most famous although not the greatest of five global mass extinction events. Each of those resulted in a large, albeit temporary, reduction in global ecosystem diversity, which strikes me as a pretty big hit to {interestingness to humans}. And while technology has been increasing exponentially throughout the global stage recently, there have been plenty of empire collapses and losses of culture which probably mark significant losses of {interestingness to humans} as well.
So, what your post really relies upon is the proposition that {interestingness to humans} can be made to experience an endless exponential increase over time, without leaving Earth. I am convinced that the reasonable default assumption given the available data is that it cannot.
Yes, and I was trying to show how this relates to intelligence, and how intelligence requires compression, and thus relates to complexity.
We recognize patterns because the universe is actually made of patterns. The recognition is no more arbitrary than thermodynamics or quantum physics.
No. One of the principal sub-functions of minds/intelligences in general is general compression. The patterns are part of the fundamental character of the universe, and it is that reality which shapes minds, not the other way around.
Complexity is not {interestingness to humans}. Although of course {interestingness to humans} is related to complexity, because our minds learn/model/represent patterns, we find patterns ‘interesting’ because they allow us to model that which exists, and complexity is a pattern-measure.
I suspect we could agree more on complexity if we could algorithmically define it, even though that shouldn’t be necessary (but I will resort to that shortly as a secondary measure). We could probably agree on what ‘humans’ are without a mathematical definition, and we could probably agree on how the number of humans has been changing over time.
Imagine if we could also loosely agree on what ‘things’ or unique patterns are in general, and then we could form a taxonomy over all patterns, where some patterns have is-a relationships to other patterns and are in turn built out of sub-patterns, forming a loosely hierarchical network. We could then roughly define pattern complexity as the hierarchical network rank order of the pattern in the pattern network. A dog is a mammal which is an animal, so complexity increases along that path, for example, and a dog is more complex than any of its subcomponents. We could then define ‘events’ as temporal changes in the set of patterns (within some pocket of the universe). We could then rank events in terms of complexity changes, based on the change in complexity of the whole composite-pattern (within space-time pockets).
Then we make a graph of a set of the top N events.
We then see the U shape trend in complexity change over time.
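To make the ranking idea above a bit more concrete, here is a toy sketch. Everything in it is hypothetical: the tiny taxonomy, the names `IS_A`, `rank` and `event_rank_change`, and especially the choice to score an event by the change in the highest rank present (just one possible reading of “change in complexity of the whole composite-pattern”).

```python
# Hypothetical toy taxonomy: each pattern maps to its more general parent (is-a).
# A root pattern has no parent.
IS_A = {
    "animal": None,
    "mammal": "animal",
    "dog": "mammal",
    "cat": "mammal",
}


def rank(pattern, is_a=IS_A):
    """Depth of a pattern below the root of the is-a hierarchy."""
    depth = 0
    parent = is_a[pattern]
    while parent is not None:
        depth += 1
        parent = is_a[parent]
    return depth


def event_rank_change(before, after, is_a=IS_A):
    """Change in the highest rank present when the set of patterns changes.

    Assumption: the 'complexity' of a whole pocket is taken to be the rank
    of its most specific pattern. Other aggregations are equally defensible.
    """
    def top(patterns):
        return max((rank(p, is_a) for p in patterns), default=0)

    return top(after) - top(before)


# "A dog is a mammal which is an animal": rank increases along that path.
print(rank("animal"), rank("mammal"), rank("dog"))
```

An ‘event’ is then a pair of pattern sets, e.g. `event_rank_change({"animal"}, {"animal", "mammal", "dog"})`, and the top N events are just the N largest changes.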
If you want a more mathematical definition, take Kolmogorov complexity and modify it to be computationally tractable. If K(X) is the K-complexity of string X defined by the minimal program which outputs X (maximal compression), then we define CK(X, M, T) as the minimal program which best approximates X subject to memory-space M and time T constraints. Moving from intractable lossless compression to lossy practical compression makes this modified definition of complexity computable in theory (but its exact definition still requires optimal lossy compression algorithms). We are interested in CK complexity of the order computable to humans and AIs in the near future.
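One way this might be written down, as a sketch only: the universal machine U, the distortion measure d, and the tolerance ε are assumptions introduced here and are not part of the definition as stated. “Best approximates” is read as “within tolerance ε”; fixing a program length and minimizing distortion instead would be another reasonable reading.

```latex
% Standard Kolmogorov complexity: length of the shortest program p that outputs X on U.
K(X) = \min_{p} \{\, |p| \;:\; U(p) = X \,\}

% Resource-bounded, lossy variant (assumed form): shortest program whose output is
% within distortion \varepsilon of X, using at most M memory and T time steps.
CK_{\varepsilon}(X, M, T) = \min_{p} \{\, |p| \;:\; d(U(p), X) \le \varepsilon,\;
    \mathrm{space}_U(p) \le M,\; \mathrm{time}_U(p) \le T \,\}
```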
Complexity != {interestingness to humans}
Complexity does appear to follow an inevitable upward accelerating trend over time in many localized sub-pockets of the universe, mirroring the big bang in reverse, and again the trend is not exponential; it’s a 1/x type shape.
The trend is nothing like a smooth line. It is noisy, and there have been some apparent complexity dips, as you mention, although the overall trend is undeniably accelerating and the best fit is the U shape leading towards a local vertical asymptote. As a side note, complexity/systems theorists would point out that most extinctions actually caused large increases in net complexity, and were some of the most important evolutionary stimuli. Counterintuitive, but true.
The number of possible patterns in an information cluster is superexponential with the size of the information cluster. Can you demonstrate that the patterns you’re recognizing are non-arbitrary? Patterns that are natural to us often seem fundamental even when they are not.
Things can be extraordinarily complex without being particularly interesting to humans. We don’t have a fully general absolute pattern recognizing system; that would be an evolutionary hindrance even if it were something that could practically be developed. There are simply too many possible patterns in too many possible contexts. It’s not advantageous for us to be interested in all of them.
I think we don’t agree on what this “complexity” is because it’s not a natural category. You’re insisting that it’s fundamental because it feels fundamental to you, but you can’t demonstrate that it’s fundamental, and I simply don’t buy that it is.
Eventually. Ecosystem diversity eventually bounces back, and while a large number of genuses and families die out, most orders retain representatives, so there’s still plenty of genetic diversity to spread out and reoccupy old niches, and potentially create new ones in the process. But there’s no fundamental principle that demands that massive extinction events must lead to increased ecosystem complexity even in the long term; for a long term decrease, you’d simply have to wipe out genetic diversity on a higher level. An UFAI event, for example, could easily lead to a massive drop in ecosystem complexity.
Firstly, you are misquoting EY’s post: the possible number of patterns in a string grows exponentially with the number of bits, as expected. It is the number of ‘concepts’ which grows super-exponentially, where EY is defining concept very loosely as any program which classifies patterns. The super-exponential growth in concepts is combinatoric and just stems from naive specific classifiers which recognize combinations of specific patterns.
Secondly, this doesn’t really relate to universal pattern recognition, which is concerned only with optimal data classifications according to a criterion such as entropy maximization.
As a simple example, consider the set of binary strings of length N. There are 2^N possible observable strings, and a super-exponential combinatoric set of naive classifiers. But consider observed data sequences of the form 10010 10010 10010 repeated ad infinitum. Any form of optimal extropy maximization will reduce this to something of the form repeat “10010” indefinitely.
In general any given sequence of observations has a single unique compressed (extropy reduced) representation, which corresponds to its fundamental optimal ‘pattern’ representation.
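As a rough illustration of the 10010 example, here is a minimal sketch that recovers the repeating unit from a finite observed string. It is a brute-force period finder, not an optimal compressor, and the function name `shortest_period` is made up for this example; the observed string is written without the spaces used above.

```python
def shortest_period(bits: str) -> str:
    """Return the shortest prefix whose repetition reproduces `bits`."""
    n = len(bits)
    for k in range(1, n + 1):
        unit = bits[:k]
        # Repeat the candidate unit enough times, then truncate to length n.
        if (unit * (n // k + 1))[:n] == bits:
            return unit
    return bits  # only reached for the empty string


# A finite stand-in for "10010 repeated ad infinitum".
observed = "10010" * 200
unit = shortest_period(observed)

# The 1000-character observation reduces to a 5-character unit plus a count.
print(f'repeat "{unit}" x {len(observed) // len(unit)}')
```

The point is only that the description “repeat 10010” is far shorter than the observation itself, and that it is found from the data rather than chosen arbitrarily.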
Depends on what you mean. It’s rather trivial to construct simple universal extropy maximizers/optimizers—just survey the basic building blocks of unsupervised learning algorithms. The cortical circuit performs similar computations.
For example the 2D edge patterns that cortical tissue (and any good unsupervised learning algorithm) learns to represent when exposed to real world video are absolutely not arbitrary in the slightest. This should be obvious.
If you mean higher level thought abstractions by “the patterns you’re recognizing”, then the issue becomes more complex. Certainly the patterns we currently recognize at the highest level are not optimal extractions, if that’s what you mean. But nor are they arbitrary. If they were arbitrary our cortex would have no purpose, would confer no selection advantage, and would not exist.
We do have a fully general pattern recognition system. I’m not sure what you mean by “general absolute”.
They are trivial to construct, and require far less genetic information to specify than specific pattern recognition systems.
Specific recognition systems have the tremendous advantage that they work instantly without any optimization time. A general recognition system has to be slowly trained on the patterns of data present in the observations—this requires time and lots of computation.
Simpler short lived organisms rely more on specific recognition systems and circuitry for this reason, as they allow newborn creatures to start with initial ‘pre-programmed’ intelligence. This actually requires considerably more genetic complexity than general learning systems.
Mammals grew larger brains with increasing reliance on general learning/recognition systems because it provides a tremendous flexibility advantage at the cost of requiring larger brains, longer gestation, longer initial development immaturity, etc. In primates and humans especially this trend is maximized. Human infant brains have very little going on initially except powerful general meta-algorithms which will eventually generate specific algorithms in response to the observed environment.
The concept of “natural category” is probably less well defined than “complexity” itself, so it probably won’t shed too much light on our discussion.
That being said, from that post he describes it as:
In that sense complexity is absolutely a natural category.
Look at Kolmogorov complexity. It is a fundamental computable property of information, and information is the fundamental property of modern physics. So that definition of complexity is as natural as you can get, and is right up there with entropy. Unfortunately that definition itself is not perfect and is too close to entropy, but computable variants of it exist. One used in a computational biology paper I was browsing recently (measuring the tendency towards increased complexity in biological systems) defined complexity as compressed information minus entropy, which may be the best fit to the intuitive concept.
Intuitively I could explain it as follows.
The information complexity of an intelligent system is a measure of the fundamental statistical pattern structure it extracts from its environment. If the information it observes is already at maximum entropy (such as pure noise), then it is already maximally compressed, no further extraction is possible, and no learning is possible. At the other extreme, if the information observed is extremely uniform (low entropy) then it can be fully described/compressed by extremely simple low complexity programs. A learning system extracts entropy from its environment and grows in complexity in proportion.
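The two extremes can be checked crudely with an off-the-shelf compressor. This is only a sketch: zlib is a stand-in for “optimal compression” (it gives an upper bound on description length, nothing more), and the helper `compressed_fraction`, the word list, and the sample size are all arbitrary choices made for the example.

```python
import random
import zlib


def compressed_fraction(data: bytes) -> float:
    """Compressed size divided by original size (zlib, maximum effort)."""
    return len(zlib.compress(data, 9)) / len(data)


n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Extremely uniform input: describable by a very short program.
uniform = b"\x00" * n

# Pseudo-random bytes as a stand-in for pure noise: nothing left to extract.
noise = bytes(rng.getrandbits(8) for _ in range(n))

# Something in between: random draws from a tiny, made-up vocabulary.
words = [b"star", b"galaxy", b"life", b"mind"]
structured = b" ".join(rng.choice(words) for _ in range(n // 5))

for name, data in [("uniform", uniform), ("structured", structured), ("noise", noise)]:
    print(f"{name:10s} {compressed_fraction(data):.3f}")
```

The uniform input shrinks to almost nothing, the noise does not shrink at all, and the structured input lands in between, which is the only regime where there is anything for a learner to extract.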
It’s objective that our responses exist, and they occur in response to particular things. It’s not obvious that they occur in response to natural categories, rather than constructed categories like “sexy.”
“General absolute” was probably a poor choice of words, but I meant to express a system capable of recognizing all types of patterns in all contexts. There is an absolute, non arbitrary pattern here, do you recognize it?
Kolmogorov complexity is a fundamental character, but it’s not at all clear that we should want a Kolmogorov complexity optimizer acting on our universe, or that Kolmogorov complexity actually has much to do with the “complexity” you’re talking about. A message or system can be high in Kolmogorov complexity without being interesting to us, and it still seems to me that you’re conflating complexity with interestingness when they really don’t bear that sort of relationship.
I see your meaning, and no practical system is capable of recognizing all types of patterns in all contexts. A universal/general learning algorithm is simply one that can learn to recognize any pattern, given enough time/space/training. That doesn’t mean it will recognize any random pattern it hasn’t already learned.
I see hints of structure in your example but it doesn’t ring any bells.
No, and that’s not my primary interest. Complexity seems to be the closest fit for something-important-which-has-been-changing over time on earth. If we had a good way to measure it, we could then make a quantitative model of that change and use that to predict the rate of change in the future, perhaps even ultimately reducing it to physical theory.
For example, one of the interesting recent physics papers (entropic gravity) proposes that gravity is not a fundamental force or even spacetime curvature, but an entropic statistical pseudo-force. The paper is interesting because as a side effect it appears to correctly derive the mysterious cosmological constant for acceleration. As an unrelated side note, I have an issue with it because it uses the holographic principle/Bekenstein bound for information density, which still appears to lead to lost-information paradoxes in my mind.
But anyway, if you look at a random patch of space-time, it is always slowly evolving to a higher-entropy state (2nd law), and this may be the main driver of most macroscopic tendencies (even gravity). It’s also quite apparent that a closely related measure, complexity, increases non-linearly in a fashion perhaps loosely like gravitational collapse. The non-linear dynamics are somewhat related: complexity tends to increase in proportion to the existing local complexity as a fraction of available entropy. In some regions this appears to go super-critical, like on earth, whereas in most places the growth is minuscule or non-existent.
It’s not apparent that complexity is increasing over time. In some respects, things seem to be getting more interesting over time, although I think that a lot of this is due to selective observation, but we don’t have any good reason to believe we’re dealing with a natural category here. If we were dealing with something like Kolmogorov complexity, at least we could know if we were dealing with a real phenomenon, but instead we’re dealing with some ill defined category for which we cannot establish a clear connection to any real physical quality.
For all that you claim that it’s obvious that some fundamental measure of complexity is increasing nonlinearly over time, not a lot of other people are making the same claim, having observed the same data, so it’s clearly not as obvious as all that.