The key defining characteristic of a ULM is that it uses its universal learning algorithm for continuous recursive self-improvement with regards to the utility function (reward system). We can view this as second (and higher) order optimization: the ULM optimizes the external world (first order), and also optimizes its own internal optimization process (second order), and so on. Without loss of generality, any system capable of computing a large number of decision variables can also compute internal self-modification decisions.
While I do believe that the human brain can learn to self-modify safely (in the sense that I’ve seen papers showing how to get around the Incompleteness Theorems in a mathematical setting, and I trust a human brain to be able to reason in the right ways, if trained to do so), this statement is completely unjustified for learning systems in general. Any universal learner will generally be able to self-represent and self-improve in principle, but it will still have to learn to do so.
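To make the quoted “second (and higher) order optimization” concrete, here is a minimal toy sketch (my own illustration, not anything from the post): a hill-climber that optimizes an external objective and also adjusts its own step size by the same trial-and-error rule.

```python
# Minimal sketch (my illustration of the quoted "second order optimization"):
# a hill-climber that optimizes an external objective (first order) while also
# modifying its own optimization process, i.e. its step size (second order).
def objective(x):
    return -(x - 3.0) ** 2          # the "external world" to optimize

x, step = 0.0, 1.0
for _ in range(50):
    # first-order move: keep a step only if it improves the objective
    if objective(x + step) > objective(x):
        x += step
    elif objective(x - step) > objective(x):
        x -= step
    else:
        # second-order move: the optimizer modifies its own step size
        step *= 0.5
print(f"x ~ {x:.3f}, final step size {step:.1e}")
```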
Imagination and memory recall seem to be basically the same.
Well, yes, of course. For a causal reasoner, the factual is an element of the probable counterfactuals. Counterfactual conditional simulation is how imagination most likely works, which would indicate that imagination develops as a side-effect of being able to perform the counterfactual evaluations necessary for causal reasoning and decision-making.
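Here is a minimal sketch of what I mean by counterfactual conditional simulation (my own toy example; the variable names and probabilities are arbitrary assumptions): re-run the same causal model with the factual noise terms held fixed, but with one variable intervened on.

```python
import random

def simulate(do_rain=None, u_rain=None, u_sprinkler=None):
    """One pass through a tiny causal model: (rain, sprinkler) -> wet grass."""
    u_rain = random.random() if u_rain is None else u_rain
    u_sprinkler = random.random() if u_sprinkler is None else u_sprinkler
    rain = (u_rain < 0.3) if do_rain is None else do_rain   # intervention point
    sprinkler = u_sprinkler < 0.5 and not rain              # skip sprinkler if raining
    wet = rain or sprinkler
    return {"u_rain": u_rain, "u_sprinkler": u_sprinkler,
            "rain": rain, "sprinkler": sprinkler, "wet": wet}

factual = simulate()
# "Imagination": rerun the same world (same noise terms) under do(rain := not rain).
imagined = simulate(do_rain=not factual["rain"],
                    u_rain=factual["u_rain"], u_sprinkler=factual["u_sprinkler"])
print("factual: ", factual)
print("imagined:", imagined)
```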
The ULH suggests that most everything that defines the human mind is cognitive software rather than hardware: the adult mind (in terms of algorithmic information) is 99.999% a cultural/memetic construct. Obviously there are some important exceptions: infants are born with some functional but very primitive sensory and motor processing ‘code’.
What do you mean here by “algorithmic information”? Kolmogorov complexity? It seems a needless buzzword to throw in here if you’re not doing AIT directly.
I’d also question whether it’s 99.999% a “cultural construct”. Most adult thinking is learned, yes, but (I can’t find the paper on this right now) what embodiment provides is sensory and value biases that help “pick out” rigid, biologically determined features to learn from.
A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox. If you violate this principle then you have already failed. Yudkowsky’s AI box thought experiment assumes the violation of the sandbox security principle a priori and thus is something of a distraction.
No, the AI Box Experiment just assumes that your agent can grow to be more complex and finely-optimized in its outputs/actions/choices than most adult humans despite having very little data, if any, to learn from. It more-or-less assumes that certain forms of “superintelligence” can do information-theoretic magic. This is compatible with the many-moduled brain theory, but incompatible with the brain-as-general-inductive-learner theory (which says that the brain must obey sample complexity restrictions by making efficient use of sample data, or seeking out more of it, rather than being able to learn without it).
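To make “sample complexity restrictions” concrete, here is the standard PAC bound for a finite, realizable hypothesis class (my own illustration, not from the post):

```python
import math

def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Classic realizable PAC bound for a finite hypothesis class:
    m >= (1/epsilon) * (ln|H| + ln(1/delta)) samples suffice so that any
    consistent hypothesis has true error <= epsilon with prob >= 1 - delta."""
    return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / epsilon)

# Even a modest hypothesis space needs thousands of samples for tight guarantees.
print(pac_sample_bound(hypothesis_count=10**6, epsilon=0.01, delta=0.01))
```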
So to create benevolent AGI, we should think about how to create virtual worlds with the right structure, how to educate minds in those worlds, and how to safely evaluate the results.
This is wildly stupid. Sorry, I don’t want to be nasty about this, but I simply don’t trust a “benevolent AGI” design whose value-training is a black-box model. I want to damn well see what I am programming, should I obtain the security clearances to ever touch such a thing ;-).
In general the evidence from the last four years or so supports Hanson’s viewpoint from the Foom debate. More specifically, his general conclusion:
I think we can steelman Eliezer’s position here. Learning is compression; compression is learning. While we can observe in the literature that the human brain uses some fairly powerful compression algorithms, we do not have strong evidence that it uses optimal compression methods. So, if someone finds a domain-general compression method that gets closer to outputting the Kolmogorov structural information of the input sample than the hierarchical compression methods used by the human mind, the artificial Minds built using that superior compression method should learn in a structurally more efficient way than the human mind—that would render Yudkowsky correct.
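As a crude illustration of compressed length as an upper bound on Kolmogorov complexity (my own toy example, not from the post): structured data compresses far below its raw size, while noise barely compresses at all, and no learner can “explain” it.

```python
import zlib, random

# Compressed length as a crude, computable stand-in for Kolmogorov complexity.
structured = b"the cat sat on the mat. " * 400
random.seed(0)
noise = bytes(random.randrange(256) for _ in range(len(structured)))

for name, data in [("structured", structured), ("noise", noise)]:
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name}: {len(data)} bytes -> compressed ratio {ratio:.2f}")
```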
Overall, I congratulate you on the article! I’ll probably follow it up with my own book/literature review on Plato’s Camera very soon, which isn’t an intentional follow-up but simply happens to cover much of the same material. Good job locating the correct hypothesis, especially given the difficulty of swimming the seas of active scientific literature! Mazal tov, mazal tov!
Also, while we may take the Evolved Modularity Hypothesis to be mostly wrong—in the sense that its most simple and obvious interpretation is very obviously wrong, even though we can steelman it for charity’s sake—I think we should declare that the “Yudkowsky-Hanson FOOM debate” has the result of “game called on account of learning that the true disagreement, upon more science being done to resolve the question, was not that large.”
I’ll probably follow it up with my own book/literature review on Plato’s Camera very soon
[reads abstract]. Looks interesting. I enjoyed Consciousness Explained back in the day. Philosophers armed with neuroscience can make for enjoyable reads.
What do you mean here by “algorithmic information”? Kolmogorov complexity?
I should probably change that terminology to be something like “synaptic code bits”—the amount of info encoded in synapses (which is close to zero percent of its adult level at birth for the cortex).
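For rough scale, a back-of-envelope sketch (my own numbers; all counts are order-of-magnitude assumptions rather than measured values):

```python
# Back-of-envelope estimate of "synaptic code" capacity (order-of-magnitude assumptions).
cortical_neurons = 1.6e10        # ~16 billion neurons in the human cortex
synapses_per_neuron = 7e3        # a few thousand synapses per neuron
bits_per_synapse = 5             # a handful of bits of stable state per synapse

synapses = cortical_neurons * synapses_per_neuron
total_bits = synapses * bits_per_synapse
print(f"~{synapses:.1e} cortical synapses, ~{total_bits / 8e12:.0f} TB of 'synaptic code'")
```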
A key principle of a secure code sandbox is that the code you are testing should not be aware that it is in a sandbox. If you violate this principle then you have already failed. Yudkowsky’s AI box thought experiment assumes the violation of the sandbox security principle a priori
No, the AI Box Experiment just assumes that your agent can grow to be more complex and finely-optimized in its outputs/actions/choices than most adult humans despite having very little data, if any, to learn from. It more-or-less assumes that certain forms of “superintelligence” can do information-theoretic magic.
The AI Box experiment explicitly starts with the premise that the AI knows (1) it is in a box, and (2) there is a human who can let it out.
Now perhaps the justification is that “superintelligence can do information-theoretic magic”, therefore it will figure out it’s in a box, but nonetheless—all of that is assumed.
To simplify, I view the information-theoretic-magic type of AI that EY/MIRI seems to worry about as something like wormhole technology.
Are wormholes/magic AIs possible in principle? Probably?
If someone were to create wormhole tech tomorrow, they could assassinate world leaders, blow up arbitrary buildings, probably destroy the world, etc. Do I worry about that? No.
This is wildly stupid. Sorry, I don’t want to be nasty about this, but I simply don’t trust a “benevolent AGI” design whose value-training is a black-box model. I want to damn well see what I am programming,
There is nothing inherently black-box about neuroscience-inspired AGI (and that viewpoint, once common on LW, is simply reinforced by reading everything other than neuroscience). Neuroscience has already made huge strides in terms of peering into the box, and virtual brains are vastly easier to inspect. The approach I advocate/favor is fully transparent—you will be able to literally see the AGI’s thoughts, read those thoughts in logs, debug, etc.
However, advanced learning AI is not something one ‘programs’, and that viewpoint shift is much of what the article was about.
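As a toy illustration of the kind of transparency meant above (my own sketch, not the actual system; the layer names and sizes are arbitrary): every intermediate representation in a forward pass can be logged and inspected afterwards.

```python
# Minimal sketch of a "transparent" forward pass: every internal state is
# recorded so it can be read from logs, inspected, or diffed later.
import numpy as np

rng = np.random.default_rng(0)
layers = {name: rng.standard_normal((8, 8)) for name in ["percept", "assoc", "motor"]}

def forward(x, trace):
    h = x
    for name, w in layers.items():
        h = np.tanh(w @ h)
        trace.append((name, h.copy()))   # log each intermediate representation
    return h

trace = []
forward(rng.standard_normal(8), trace)
for name, h in trace:
    print(f"{name}: mean={h.mean():+.3f}, top unit={int(np.argmax(h))}")
```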
Learning is compression; compression is learning. While we can observe in the literature that the human brain uses some fairly powerful compression algorithms, we do not have strong evidence that it uses optimal compression methods. So, if someone finds a domain-general compression method that gets closer to outputting the Kolmogorov structural information
This actually isn’t that efficient—practical learning is more than just compression. Compression is simple unsupervised learning, which doesn’t get you far. It can waste arbitrary computation attempting to learn functions that are unlearnable (deterministic noise) and/or just flat-out not important (zero utility). What the brain and all effective learning systems do is more powerful and complex than just compression—it is utilitarian learning.
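A toy contrast between pure compression and utility-weighted learning (my own framing, not the author’s algorithm): a utility-weighted objective simply stops spending modeling effort on the unlearnable, zero-utility channel.

```python
# Pure compression treats every bit of input as equally worth modeling;
# a utility-weighted objective spends capacity only where errors matter.
import numpy as np

rng = np.random.default_rng(1)
signal = np.sin(np.linspace(0, 6, 200))          # learnable structure
noise = rng.standard_normal(200)                 # stand-in for deterministic noise
x = np.stack([signal, noise])                    # two input channels
utility = np.array([1.0, 0.0])                   # only the first channel matters

recon = x.mean(axis=1, keepdims=True) * np.ones_like(x)   # a trivial "model"
per_channel_error = ((x - recon) ** 2).mean(axis=1)

print("pure-compression loss :", per_channel_error.sum())
print("utility-weighted loss :", (utility * per_channel_error).sum())
```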
Thanks!
Let me rephrase: generalization is compression. If you do not compress, you cannot generalize, which means you’ll make inefficient use of your samples.
The term in the literature is resource-rational or bounded-rational inference.
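A toy version of the bounded/resource-rational trade-off (my own illustration; the error model and cost-per-sample are assumptions): pick the amount of computation that maximizes expected accuracy minus compute cost.

```python
import math

cost_per_sample = 0.002

def net_value(n, base_error=0.5):
    error = base_error / math.sqrt(n)      # Monte Carlo error shrinks as 1/sqrt(n)
    return (1 - error) - cost_per_sample * n

# The resource-rational choice stops sampling once more compute isn't worth it.
best_n = max(range(1, 2000), key=net_value)
print("resource-rational sample count:", best_n)
```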
By the way, that book review got done eventually.