So the universe was created by an intelligent agent. Well, that’s the standard Simulation Hypothesis [...]
I’ve been thinking about a slightly different question: is base-level reality physics-like, or optimization-like, and if it’s optimization-like, did it start out that way?
Here’s an example that illustrates what my terms mean. Suppose we are living in base-level reality which started with the Big Bang and evolution, and we eventually develop an AI that takes over the entire universe. Then I would say that base-level reality started off physics-like, then becomes optimization-like.
But it’s surely conceivable that a universe could start off being optimization-like, and this hypothesis doesn’t seem to violate Occam’s Razor in any obvious way. Consider this related question: what is the shortest program that outputs a human mind? Is it an optimization program, or a physics simulation?
An optimization procedure can be very simple, if computing time isn’t an issue, but we don’t know whether there is a concisely describable objective function that we are the optimum of. On the other hand, the mathematical laws of physics are also simple, but we don’t know how rare intelligent life is, so we don’t know how many bits of coordinates are needed to locate a human brain in the universe.
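To put the tradeoff schematically (a rough description-length sketch of the two candidates, not a claim about the actual sizes of any of these terms):

\[
K(\text{mind}) \;\lesssim\; \min\Big(\, K(\text{optimizer}) + K(\text{objective}),\;\; K(\text{physics}) + K(\text{initial conditions}) + \log_2 N \,\Big)
\]

where \(N\) is the number of candidate locations that must be distinguished to pick out a human brain in the simulated universe. The first option is short if the objective function is concisely describable; the second is short if intelligent life is common enough that \(\log_2 N\) stays small.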
Does anyone have an argument that settles these questions, in either direction?
Can something be optimization-like without being ontologically mental? In other words, if a higher level is a universal Turing machine that devotes more computing resources to other Turing machines depending on how many 1s they’ve written so far as opposed to 0s, is that the sort of optimization-like thing we’re talking about? I’m assuming you don’t mean anything intrinsically teleological.
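To make that concrete, here is a toy sketch of such a resource-allocation rule (everything here, from the stand-in machines to the weighting formula and budget, is invented for illustration; the comment above doesn't specify any of it):

```python
import random

class ToyMachine:
    """Stands in for a Turing machine; each step it writes one random bit."""
    def __init__(self, bias):
        self.bias = bias      # probability of writing a 1 on each step
        self.ones = 0
        self.zeros = 0

    def step(self):
        if random.random() < self.bias:
            self.ones += 1
        else:
            self.zeros += 1

def run_scheduler(machines, rounds, budget=100):
    """Each round, split `budget` steps among the machines in proportion to
    how many 1s each has written relative to 0s (with +1 smoothing)."""
    for _ in range(rounds):
        weights = [(m.ones + 1) / (m.zeros + 1) for m in machines]
        total = sum(weights)
        for m, w in zip(machines, weights):
            for _ in range(max(1, round(budget * w / total))):
                m.step()
    return machines

for m in run_scheduler([ToyMachine(0.2), ToyMachine(0.5), ToyMachine(0.8)], rounds=50):
    print(f"bias={m.bias}: ones={m.ones}, zeros={m.zeros}")
```

The point of the sketch is only that such a rule is mechanical and non-mental: it rewards a property of outputs without representing anything like a goal.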
Yeah, I think if base-level reality started out optimization-like, it’s not mind-like, or at least not any kind of mind that we’d be familiar with. It might be something like Schmidhuber’s Goedel Machine with a relatively simple objective function.
What does “intrinsically teleological” mean?
Hmm? The base-level that the AI is running on is still physics, right?
No. That’s the point of the question.
The universe presumably isn’t optimised for intelligence, since most organisms are bacteria, etc., and isn’t optimised for life, since most of it is barren. See Penrose’s argument against the Anthropic Principle in Road to Reality.
I think Wei_Dai was trying to suggest an objective function beyond our ken.
I’m confused by your comment, but I’ll try to answer anyway.
As an agent in an environment, you can consider the environment in behavioral semantics: the environment is an equivalence class of all the things that behave the same as what you see. Instead of a minimal model, this gives a maximal model. Everything about the territory remains a black box, except the structure imposed by the way you see the territory: the way you observe things, perform actions, and value strategies. This dissolves the question of what the territory “really is”.
Your answer strikes me as unsatisfactory: if we apply it to humans, we lose interest in electricity, atoms, quarks etc. An agent can opt to dig deeper into reality to find the base-level stuff, or it can “dissolve the question” and walk away satisfied. Why would you want to do the latter?
The agent has preferences over these black boxes (or the strategies that instantiate them), and digging deeper may be a good idea. To get rid of the (unobservable) structure in the environment, the preferences for the elements of the environment have to be translated into preferences over whole situations. The structure of the environment becomes the structure of preferences over the black boxes.
Two models can behave the same as what you’ve seen so far, but diverge in future predictions. Which model should you give greater weight to? That’s the question I’m asking.
The current best answer we know of seems to be: write each consistent hypothesis in a formal language, weight longer explanations inverse-exponentially, and renormalize so that your total probability sums to 1. Look up AIXI and the universal prior.
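A minimal sketch of that weighting rule (the hypotheses, their description lengths, and the prediction functions below are all made up for illustration; this is the flavour of the universal prior, not AIXI itself):

```python
# Weight each hypothesis by 2^(-description length in bits), keep only the
# hypotheses consistent with the observations so far, and renormalize.

def universal_style_posterior(hypotheses, observations):
    """
    hypotheses: dict mapping a name to (program_bits, predict_fn), where
        program_bits is the hypothesis's description length in bits and
        predict_fn(t) returns its predicted observation at time t.
    observations: the observation sequence seen so far.
    Returns normalized weights over the consistent hypotheses.
    """
    weights = {}
    for name, (program_bits, predict_fn) in hypotheses.items():
        consistent = all(predict_fn(t) == obs for t, obs in enumerate(observations))
        if consistent:
            weights[name] = 2.0 ** (-program_bits)   # shorter program => larger weight
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Two toy models that agree on the data so far but diverge in future predictions:
hypotheses = {
    "all_zeros":       (5,  lambda t: 0),
    "zeros_then_ones": (12, lambda t: 0 if t < 10 else 1),
}
print(universal_style_posterior(hypotheses, observations=[0, 0, 0, 0]))
# The shorter hypothesis gets the larger share of the probability mass.
```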
In the behavioral interpretation, you weight observations, or the effects of possible strategies (on observations/actions), not the way the territory is. The base level is the agent and the rules of its game with the environment. Everything else describes the form of this interaction, and answers questions not about the underlying reality, but about how the agent sees it. If the distinction you are making doesn’t reach the level of influencing what the agent experiences, it’s absent from this semantics: no weighting, no moving parts, no distinction at all.
For a salient example: if an agent in the same fixed internal state is instantiated multiple times (in the same environment at the same time, at different times, or even in different environments, with different probabilities under some notion of probability), all of these instances and possibilities together go under one atomic black-box symbol for the territory corresponding to that state of the agent, with no internal structure. The structure can, however, be represented in preferences for strategies or sets of strategies for the agent.
Vladimir, are you proposing this “behavioral interpretation” for an AI design, or for us too? Is this an original idea of yours? Can you provide a link to a paper describing it in more detail?
I’m generalizing/analogizing from the stuff I read on coalgebras, and in this case I’m pretty sure the idea makes sense; it’s probably explored elsewhere. You can start here, or directly from Introduction to Coalgebra: Towards Mathematics of States and Observations (PDF). From the latter:

There are many similarities (or dualities) between algebras and coalgebras which are often useful as guiding principles. But one should keep in mind that there are also significant differences between algebra and coalgebra. For example, in a computer science setting, algebra is mainly of interest for dealing with finite data elements – such as finite lists or trees – using induction as main definition and proof principle. A key feature of coalgebra is that it deals with potentially infinite data elements, and with appropriate state-based notions and techniques for handling these objects. Thus, algebra is about construction, whereas coalgebra is about deconstruction – understood as observation and modification.

A rule of thumb is: data types are algebras, and state-based systems are coalgebras. But this does not always give a clear-cut distinction. For instance, is a stack a data type or does it have a state? In many cases however, this rule of thumb works: natural numbers are algebras (as we are about to see), and machines are coalgebras. Indeed, the latter have a state that can be observed and modified.

[...]

Initial algebras (in Sets) can be built as so-called term models: they contain everything that can be built from the operations themselves, and nothing more. Similarly, we saw that final coalgebras consist of observations only.
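To illustrate the construction-versus-observation contrast in that excerpt with a small example of my own (not from the book): a data type you build up from constructors, versus a state-based system you can only observe and step.

```python
# Algebra-flavoured: natural numbers as finite data you construct
# (zero and successor), consumed by structural recursion.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Nat:
    pred: Optional["Nat"] = None      # None means zero; otherwise successor of pred

def to_int(n: Nat) -> int:
    return 0 if n.pred is None else 1 + to_int(n.pred)

# Coalgebra-flavoured: a stream as a black-box state that you can only
# observe (head) and step (tail); potentially infinite, never "finished".
class Stream:
    def __init__(self, state, observe, step):
        self.state = state
        self._observe = observe
        self._step = step

    def head(self):
        return self._observe(self.state)

    def tail(self) -> "Stream":
        return Stream(self._step(self.state), self._observe, self._step)

two = Nat(Nat(Nat()))                 # successor(successor(zero)), built bottom-up
print(to_int(two))                    # 2

evens = Stream(0, observe=lambda s: s, step=lambda s: s + 2)
print(evens.head(), evens.tail().head(), evens.tail().tail().head())  # 0 2 4
```

The stream is exactly the “behavioral” picture: two streams with different internal states count as the same if every observation you can ever make of them agrees.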
The physical universe seems to optimize for low-energy / high-entropy states, via some kind of local decision process.
So I think your two options actually coincide.
The universe doesn’t optimize entropy; it is people making strong inferences whose conclusions come out this way. See e.g. E. T. Jaynes (1988), “The Evolution of Carnot’s Principle”, Maximum-Entropy and Bayesian Methods in Science and Engineering 1:267+ (PDF).
On the other hand, you can always look at how something is and formulate an optimization problem for which the way things are is a solution, and then say “so, the system optimizes this property”. This is called the variational method, and it isn’t terribly ontologically enlightening.
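A standard illustration of how cheap this move is (my example, not from the comment above): any dynamics \(\dot{x} = f(x)\) can be recast as “optimizing” the functional

\[
J[x] \;=\; \int \bigl\lVert \dot{x}(t) - f(x(t)) \bigr\rVert^{2}\, dt ,
\]

since the true trajectories are exactly the minimizers, with \(J = 0\). The optimization framing is then just a restatement of the dynamics, which is the sense in which it adds nothing ontologically.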
How about both?
If I understand your terms correctly, it may be possible for realities that are not base-level to be optimization-like without being physics-like, e.g. the reality generated by playing a game of Nomic, a game in which players change the rules of the game. But this is only possible because of interference by optimization processes from a lower-level reality, whose goals (“win”, “have fun”) refer to states of physics-like processes. I suspect that base-level reality must be physics-like. To paraphrase John Donne, no optimization process is an island; otherwise, how could one tell the difference between an optimization process and purely random modification?
On the other hand, the “evolution” optimization process arose in our universe without a requirement for lower-level interference. Not that I assume our universe is base-level reality, but it seems like evolution or analogous optimizations could arise at any level. So perhaps physics-like realities are also intrinsically optimization-like.