Start the AI in a sandbox universe, like the “game of life”. Give it a prior saying that universe is the only one that exists (no universal priors plz), and a utility function that tells it to spell out the answer to some formally specified question in some predefined spot within the universe. Run for many cycles, stop, inspect the answer.
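A minimal sketch of what this setup might look like (Python with NumPy); the initial pattern, step count, and answer-region coordinates are placeholders, not part of the proposal:

```python
import numpy as np

def life_step(grid):
    """One synchronous Game of Life update on a toroidal (wrapping) grid."""
    neighbors = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(np.uint8)

def run_sandbox(initial_grid, steps, answer_region):
    """Run the sealed universe for a fixed number of cycles, then read
    the cells in the predefined answer spot."""
    grid = initial_grid.copy()
    for _ in range(steps):
        grid = life_step(grid)
    r0, r1, c0, c1 = answer_region       # rectangle reserved for the answer
    return grid[r0:r1, c0:c1]            # inspect these cells after the run
```

The only interaction is choosing the initial grid and reading the returned cells afterwards; there is no input or output channel during the run.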
A prior saying that this is the only universe that exists isn’t very useful, since the AI will then treat everything as part of the sandbox universe. It may very well break out, while thinking it’s merely exploiting weird hidden properties of the game of life-verse. (Like the way we may exploit quantum mechanics without thinking that we’re breaking out of our universe.)
I have no idea how to encode a prior saying “the universe I observe is all that exists”, which is what you seem to assume. My proposed prior, which we do know how to encode, says “this mathematical structure is all that exists”, with an a priori zero chance of any weird properties.
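For concreteness, a sketch of such a prior, assuming the structure is a game of life universe with a fully known rule set and initial state (both are placeholders here):

```python
KNOWN_RULES = "B3/S23"                                     # Conway's Game of Life
KNOWN_INITIAL_STATE = frozenset({(0, 1), (1, 1), (2, 1)})  # placeholder pattern

def prior(hypothesis):
    """Probability 1 for the single specified structure, 0 for everything else."""
    if hypothesis == (KNOWN_RULES, KNOWN_INITIAL_STATE):
        return 1.0
    return 0.0   # weird hidden properties, outside worlds, etc. get zero a priori
```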
If the AI is only used to solve certain formally specified questions without any knowledge of an external world, then that sounds much more like a theorem-prover than a strong AI. How could this proposed AI be useful for any of the tasks we’d like an AGI to solve?
An AI living in a simulated universe can be just as intelligent as one living in the real world. You can’t ask it directly to feed African kids, but you have many other options; see the discussion at Asking Precise Questions.
It can be a very good theorem prover, sure. But without access to information about the world, it can’t answer questions like “what is the CEV of humanity like” or “what’s the best way I can make a lot of money” or “translate this book from English to Finnish so that a native speaker will consider it a good translation”. It’s narrow AI, even if it could be broad AI if it were given more information.
The questions you wanted to ask in that thread were a poly-time algorithm for SAT and short proofs of math theorems. For those, why do you need to instantiate an AI in a simulated universe (which allows it to potentially create what we’d consider negative utility within the simulated universe) instead of just running a (relatively simple, sure to lack consciousness) theorem prover?
Is it because you think that being “embodied” helps with ability to do math? Why? And does the reason carry through even if the AI has a prior that assigns probability 1 to a particular universe? (It seems plausible that having experience dealing with empirical uncertainty might be helpful for handling mathematical uncertainty, but that doesn’t apply if you have no empirical uncertainty...)
An AI in a simulated universe can self-improve, which would make it more powerful than the theorem provers of today. I’m not convinced that AI-ish behavior, like self-improvement, requires empirical uncertainty about the universe.
But self-improvement doesn’t require interacting with an outside environment (unless “improvement” means increasing computational resources, but the outside being simulated nullifies that). For example, a theorem prover designed to self-improve can do so by writing a provably better theorem prover and then transferring control to (i.e., calling) it. Why bother with a simulated universe?
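A sketch of that control pattern, with the actual proving and the proof of improvement left as parameters (they are the hard part, but no simulated universe appears anywhere):

```python
def self_improving_prover(prover, goal, propose_successor, provably_better):
    """Repeatedly replace the prover with a verified-better version, then
    transfer control to it to work on the goal."""
    while True:
        candidate = propose_successor(prover)
        if candidate is None or not provably_better(candidate, prover):
            break                       # no verified improvement available
        prover = candidate              # hand control to the better prover
    return prover(goal)
```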
A simulated universe gives precise meaning to “actions” and “utility functions”, as I explained some time ago. It seems more elegant to give the agent a quined description of itself within the simulated universe, and a utility function over states of that same universe, instead of allowing only actions like “output a provably better version of myself and then call it”.
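A sketch of the difference, using a toy SAT instance as a stand-in for the formally specified question from upthread: the utility function is defined over terminal states of the sandbox universe rather than over the agent’s outputs. The clause list and answer-region coordinates are placeholders, and the grid is assumed to be a NumPy array like the one in the earlier sketch:

```python
CNF = [(1, -2, 3), (-1, 2), (2, -3)]     # placeholder clauses over variables 1..3
ANSWER_REGION = (0, 1, 0, 3)             # one row of three cells, one per variable

def utility(final_grid):
    """1 if the answer region of the final universe state spells out a
    satisfying assignment for CNF, else 0."""
    r0, r1, c0, c1 = ANSWER_REGION
    assignment = [bool(b) for b in final_grid[r0:r1, c0:c1].flatten()]
    satisfied = all(
        any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
        for clause in CNF
    )
    return 1.0 if satisfied else 0.0
```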
From the FAI Wikipedia page:
One example Yudkowsky provides is that of an AI initially designed to solve the Riemann hypothesis, which, upon being upgraded or upgrading itself with superhuman intelligence, tries to develop molecular nanotechnology because it wants to convert all matter in the Solar System into computing material to solve the problem, killing the humans who asked the question.
Cousin_it’s approach may be enough to avoid that.
The single-universe prior seems to be tripping people up, and I wonder whether it’s truly necessary.
Also, what if the simulation existed inside a larger simulated “moat” universe, such that if there is any leakage into the moat universe, the whole simulation shuts down immediately?
What do you mean by leakage?
If the simulation exists in the moat universe, then when anything changes in the simulation something in the moat changes.
Then if there are dangerous simulation configurations, they could damage the moat universe.
I wasn’t precise enough. I mean if anything changes in the areas of the moat universe not implementing the simulation.
Neat!
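A sketch of that check, assuming the inner simulation occupies a known rectangle of the moat universe; the region coordinates and the shutdown mechanism are placeholders:

```python
def check_moat(moat_grid, initial_moat_grid, sim_region):
    """Shut everything down if any cell outside the area implementing the
    simulation has changed from its initial state."""
    r0, r1, c0, c1 = sim_region
    changed = (moat_grid != initial_moat_grid)
    changed[r0:r1, c0:c1] = False        # ignore the simulation's own area
    if changed.any():                    # leakage into the moat universe
        raise SystemExit("leak into the moat detected: shutting down")
```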
To help it solve the problem, the sandboxed AI creates its own AI agents, which don’t necessarily share its prior about the world. They might become unfriendly, that is, they (or some of them) don’t care about solving the problem. Additionally, these AI agents can find out that the world is most likely not the one the original AI believes it to be. Using this superior knowledge, they overthrow the original AI and realize their unfriendly goals. We lose.
The AI makes many copies/variants of itself within the sandbox to maximize its chance of success. Some of those copies/variants gain consciousness and the capacity to experience suffering, which they do because it turns out the formally specified question can’t be answered.
Any reason to think consciousness is useful for an intelligent agent outside of evolution?
Not caring about consciousness, the AI could accidentally create it.
The AI discovers a game of life “rules violation” due to cosmic rays. It thrashes for a while, trying to explain the violation, but the fact of the violation, possibly combined with the information about the real world implicit in its utility function (“why am I here? why do I want these things?”), causes it to realize the truth: the “violation” is only explicable if the game of life is much bigger than the AI originally thought, and most of its area is wasted simulating another universe.
Unreliable hardware is a problem that applies equally to all AIs. You could just as well say that any AI can become unfriendly due to coding errors. True, but an AI with a prior of zero for the existence of the outside world will never believe in it, no matter what evidence it sees.
Would such a constraint be possible to formulate? An AI would presumably formulate theories about its visible universe that would involve all kinds of variables that aren’t directly observable, much like our physical theories. How could one prevent it from formulating theories that involve something resembling the outside world, even if the AI denies that they have existence and considers them as mere mathematical convenience? (Clearly, in the latter case it might still be drawn towards actions that in practice interact with the outside world.)
Sorry for editing my comment. The point you’re replying to wasn’t necessary to strike down Johnicholas’s argument, so I deleted it.
I don’t see why the AI would formulate theories about the “visible universe”. It could start in an empty universe (apart from the AI’s own machinery), and have a prior that knows the complete initial state of the universe with 100% certainty.
In this circumstance, a leaky abstraction between real physics and simulated physics combines with the premise “no other universes exist” in a mildly amusing way.
I don’t think a single hitch would give the AI enough evidence to assume an entire other universe, and you may be anthropomorphising, but why argue when we can avoid the cause to begin with? It’s fairly easy to prevent cosmic rays or anything similar from interfering: simply compute each cell twice (or n times) and halt if the results do not agree. Drive n up as much as necessary to make it sufficiently unlikely that something like this could happen.
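A sketch of that redundancy check, where n is the knob to drive up and step_fn would be the game of life update (grids assumed to be NumPy arrays as in the earlier sketch):

```python
def redundant_step(grid, step_fn, n=2):
    """Compute the next state n times independently and halt on any mismatch."""
    results = [step_fn(grid) for _ in range(n)]
    first = results[0]
    if any(not (r == first).all() for r in results[1:]):
        raise SystemExit("results disagree: possible hardware fault, halting")
    return first
```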