The sequences contain a preemptive counterargument to your post; could you address the issues raised there?
I read Dmytry’s post as a hint, not a solution, since obviously pursuing complexity at “face value” would just be pursuing entropy.
Yep. It’d be maximized by heating you up to the maximum attainable temperature, or by throwing you into a black hole, depending on how you look at it.
We can have a low-information reference class with instances of high entropy, the “heat soup”. But then, picking a reference class is arbitrary (we can contrive a complex class of heat soup flavors).
I don’t like EY’s posts about AI. He’s not immune to the sunk cost fallacy, and the worst form of the sunk cost fallacy is when one denies outright (with a long handwave) any possibility of a better solution, having already sunk the cost into the worse one.
Ultimately, if the laws of physics are simple, he’s just flat-out factually wrong that morality doesn’t arise from simple rules. His morality arose from those laws of physics, and insofar as he’s not a Boltzmann brain, his values aren’t incredibly atypical.
edit: To address it further: he does raise a valid point that there is no simple rule. The complexity metrics, though, are by no means a simple ‘rule’; they are uncomputable and thus aren’t even a rule.
Physics can contain objects whose complexity is much higher than that of physics. Do you have a strong argument why randomness didn’t make a big contribution to human morality?
Well, suppose I were to make just a rough evolution sim, given a really powerful computer. Even if it evolves a society with principles we can deem moral only once in a trillion societies (and one in a trillion is probably a big underestimate, given that many of our principles are game-theoretic), that just adds about 40 bits to the description for indexing those sims. edit: and the idea of an evolution sim doesn’t really have such huge complexity; any particular evolution sim does, but we don’t care which evolution simulator we are working with; we don’t need the bits for picking one specific sim, just the bits for picking a working one.
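To make the “40 bits” figure concrete: it is just the description-length cost of pointing at one simulation out of roughly a trillion, i.e. log2(10^12) ≈ 40. A one-line check (plain Python; nothing here depends on any particular simulator):

```python
import math

# Description-length cost of indexing one item out of ~a trillion candidates:
# you only pay the bits needed to say *which* of the N sims you mean.
N = 10**12                  # "once in a trillion societies"
print(math.log2(N))         # ≈ 39.86, i.e. about 40 extra bits
```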
Game-theoretic principles might be simple enough, but the utility function of an FAI building a good future for humanity probably needs to encode other information too, like cues for tasty food or sexual attractiveness. I don’t know of any good argument why this sort of information should have low complexity.
You may be over-fitting there. The FAI could let people decide what they want when it comes to food and attractiveness. Actually, it had better, or I’d have some serious regrets about this FAI.
That’s reasonable, but to let people decide, the FAI needs to recognize people, which also seems to require complexity...
If your biggest problem is on the order of recognizing people, the problem of FAI becomes much, much easier.
Well, and the uFAI needs to know what “paperclips or something” means (or needs a real-world goal at all). That’s an obstacle faced by all contestants in the race. We humans learn what is and isn’t another person. (Or we have evolved it; it doesn’t matter.)
If you get paperclips slightly wrong, you get something equally bad (staples is the usual example, but the point is that any slight difference is about equally bad), but if you get FAI slightly wrong, you don’t get something equally good. This breaks the symmetry.
I think if you get paperclips slightly wrong, you get a crash of some kind. If I get a ray-tracer slightly wrong, it doesn’t trace electrons instead of photons.
edit: To clarify: it’s about the definition of a person vs. the definition of a paperclip. You need a very broad definition of a person for FAI, so that it won’t misidentify a person as a non-person (misidentifying dolphins as persons won’t be a big problem), and you need a very narrow definition of a paperclip for uFAI, so that a person holding two papers together is not a paperclip. It’s not always intuitive how broad definitions compare to narrow ones in difficulty, but it is worth noting that it is ridiculously hard to define paperclip-making so that a Soviet factory anxious to maximize paperclips would make anything at all, while it wasn’t particularly difficult to define what a person is (or to define what ‘money’ is so that a capitalist paperclip factory would make paperclips to maximize profit).
I agree that paperclips could also turn out to be pretty complex.
I don’t think “paperclip maximizer” is taken as a complete declarative specification of what a paperclip maximizer is, let alone what it understands itself to be.
I imagine the setup is something like this. An AI has been created by some unspecified (and irrelevant) process and is now doing things to its (and our) immediate environment. We look at the things it has done and anthropomorphize it, saying “it’s trying to maximize the quantity of paperclips in the universe”. Obviously, almost every word in that description is problematic.
But the point is that the AI doesn’t need to know what “paperclips or something” means. We’re the ones who notice that the world is much more filled with paperclips after the AI got switched on.
This scenario is invariant under replacing “paperclips” with some arbitrary “X”, I guess under the restriction that X is roughly at the scale (temporal, spatial, conceptual) of human experience. Picking paperclips, I assume, is just a rhetorical choice.
Well, I agree. That also goes for whatever process determines something to be a person. The difference is that the FAI doesn’t have to create persons; its definition doesn’t need to correctly process things from the enormous space of possible things that can or cannot be persons. It can have a very broad definition that includes dolphins, and it will still be OK.
Intelligence, to some extent, is self-defeating when it comes to finding a way to make something real: the easiest Y that is inside the set X will be picked, by design, as instrumental to making more of some kind of X.
I.e. you define X as something that holds papers together; the AI thinks and thinks and sees that a single atom, under some circumstances common in the universe (very far away in space), can hold the papers together; it finds the Casimir effect, which lets a vacuum hold two conductive papers together; and so on. X has to be resistant to such brute-forcing for the optimum solution (a toy sketch of this brute-forcing appears after this comment).
Whether the AI can be given some real-world manufacturing goal that it can’t defeat in such a fashion, well, that’s open to debate. Uncomputable things seem hard to defeat.
edit: Actually, would you consider a fairly stupid nano-manufacturing AI destroying us, and itself, with gray goo to be an unfriendly AI? That seems to be a particularly simple failure mode for a self-improving system, FAI or uFAI, under bounded computational power. And it is a failure mode for likely non-general AIs as well, since we are likely to employ such AIs to work on biotechnology and nanotechnology.
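Here is a toy sketch of the brute-forcing described above (all candidates and costs are invented for illustration; this is not any real system). The optimizer is only given the loose spec “holds papers together” plus a cost to minimize, so it settles on a degenerate solution rather than anything we meant by “paperclip”:

```python
candidates = {
    "steel paperclip":        {"holds_papers": True,  "cost": 1.0},
    "staple":                 {"holds_papers": True,  "cost": 0.8},
    "vacuum (Casimir setup)": {"holds_papers": True,  "cost": 1e-6},
    "single atom, far away":  {"holds_papers": True,  "cost": 1e-9},
    "rubber duck":            {"holds_papers": False, "cost": 0.5},
}

def satisfies_spec(props):
    return props["holds_papers"]      # the entire specification, and that's the problem

# Pick the cheapest candidate that satisfies the spec.
best = min((name for name, props in candidates.items() if satisfies_spec(props)),
           key=lambda name: candidates[name]["cost"])
print(best)                           # -> "single atom, far away"
```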
It doesn’t sound like you are agreeing with me. I didn’t make any assumptions about what the AI wants or whether its instrumental goals can be isolated. All I supposed was that the AI was doing something. I particularly didn’t assume that the AI is at all concerned with what we think it is maximizing, namely, X.
As for the grey goo scenario, I think that an AI that caused the destruction of humanity not being called unfriendly would indicate an incorrect definition of at least one of “AI”, “humanity”, or “unfriendly” (“caused” too, I guess).
All I supposed was that the AI was doing something.

Can you be more specific? I have an AI that’s iterating parameters of some strange attractor (defined within it) until it finds unusual behaviour. I can make an AI that would hill-climb and search for improvements to the former AI. edit: Now, the worst thing that can happen is that it makes a mind-hack image that kills everyone who looks at it. That wasn’t the intent, but the ‘unusual behaviour’ might get too unusual for a human brain to handle. Is that a serious risk? No, it’s a laughable one.
Implicit in my setup was that the AI reached the point where it was having noticeable macroscopic effects on our world. This is obviously easiest when the AI’s substrate has some built-in capacity for input/output. If we’re being really generous, it might have an autonomous body, cameras, an internet connection, etc. If we’re being stingy, it might just be an isolated process running on a computer with its inputs limited to checking the wall-clock time and outputs limited to whatever physical effects it has on the CPU running it. In the latter case, doing something to the external world may be very difficult but not impossible.
The program you have doing local search in your example doesn’t sound like an AI; even if you stuck it in the autonomous body, it wouldn’t do anything to the world that’s not a generic side-effect of its running. No one would describe it as maximizing anything.
Well, it is maximizing whatever I defined for it to maximize, usefully for me, and in a way that is practical. In any case, you said, “All I supposed was that the AI was doing something.” My AI is doing something.
This is obviously easiest when the AI’s substrate has some built-in capacity for input/output. If we’re being really generous, it might have an autonomous body, cameras, an internet connection, etc.

Yeah, and it’s rolling forward and clamping its manipulators until they wear out. Clearly you want it to maximize something in the real world, not just do something. The issue is that the only things it can do in approximately this way are things like shooting at the colour blue.
Everything else requires a very detailed model, and maximization of something in the model, followed by carrying out the actions in the real world. That last step, interestingly, is entirely optional, and even humans have trouble getting themselves to do it (when I invent something and am satisfied that it will work, it is boring to implement; this is a common problem). Edit: and one other point: without a model, all you can do is try random stuff on the world itself, which is not at all intelligent (and resembles Wheatley in Portal 2 trying to crack the code).
...or perhaps “destruction”.
Sorry, I don’t understand what exactly you are proposing. A utility function is a function from states of the universe to real numbers. If the function contains a term like “let people decide”, it should also define “people”, which seems to require a lot of complexity... (see the sketch below).
Or are you coming at this from some other perspective, like assigning utilities to possible actions rather than world states? That’s a type error and also very likely to be Bayesian-irrational.
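A minimal type sketch of the point above, with hypothetical names (WorldState, is_person, and satisfaction are invented for illustration, not anyone’s proposed design): the utility function is a map from world states to reals, and a clause like “let people decide” only pushes the complexity into the is_person predicate it has to contain.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class WorldState:
    entities: list          # whatever the world contains

IsPerson = Callable[[object], bool]

def make_utility(is_person: IsPerson) -> Callable[[WorldState], float]:
    def utility(world: WorldState) -> float:
        people = [e for e in world.entities if is_person(e)]
        # "Let people decide": score the world by how satisfied the identified
        # people are with it (stubbed out as a per-entity attribute here).
        return sum(getattr(p, "satisfaction", 0.0) for p in people)
    return utility

# Minimal usage: the complexity question is entirely about what goes in the predicate.
u = make_utility(lambda e: getattr(e, "is_person", False))
print(u(WorldState(entities=[])))   # 0.0 for an empty world
```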
Randomness is Chaitin’s omega is God implies stochasticity (mixed strategies) implies winning in the limit due to hypercomputational advantages, universally if not necessarily contingently. Hence randomness isn’t, as such, at odds with morality. Maybe Schmidhuber’s ideas about super-omegas are relevant. Doubt it.
Plus the process of a few hundred million years of evolutionary pressures.
Do you think simulating those years and extrapolating the derived values from that simulation is clearly easier and simpler than extrapolating the values from e.g. a study of human neural scans/human biochemistry/human psychology?
It’s not clear to me how the second is obviously easier. How would you even do that? Are there simple examples of doing this that would help me understand what “extrapolating human values from a study of human neural scans” would entail?
One could, e.g., run a sim of bounded-intelligence agents competing with each other for resources, then pick the best one, which will implement tit-for-tat and the more complex solutions that work. It was already the case for the iterated prisoner’s dilemma that there wasn’t some enormous number of amoral solutions, much to the surprise of the AI researchers of the time, who wasted their efforts trying to make some sort of nasty, sneaky Machiavellian AI (a toy tournament sketch follows after this comment).
edit: anyhow, I digress. The point is that when something is derivable via simple rules (even if the derivation is impractical), like the laws of physics, that should enormously boost the likelihood that it is derivable in some more practical way.
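Here is a minimal sketch of the iterated-prisoner’s-dilemma point above: a tiny round-robin with a few classic strategies. The strategy set and payoff matrix are the standard textbook ones, chosen only for illustration; this is not a reconstruction of any particular tournament. The reciprocating strategies come out ahead of unconditional defection.

```python
import itertools

# Standard one-shot payoffs for (my_move, their_move); C = cooperate, D = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(my_hist, their_hist):
    # Cooperate first, then copy the opponent's last move.
    return "C" if not their_hist else their_hist[-1]

def always_defect(my_hist, their_hist):
    return "D"

def grudger(my_hist, their_hist):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in their_hist else "C"

def play(a, b, rounds=200):
    ha, hb, sa, sb = [], [], 0, 0
    for _ in range(rounds):
        ma, mb = a(ha, hb), b(hb, ha)
        sa += PAYOFF[(ma, mb)]
        sb += PAYOFF[(mb, ma)]
        ha.append(ma)
        hb.append(mb)
    return sa, sb

strategies = {"tit_for_tat": tit_for_tat, "always_defect": always_defect, "grudger": grudger}
totals = dict.fromkeys(strategies, 0)
for (na, a), (nb, b) in itertools.combinations(strategies.items(), 2):
    sa, sb = play(a, b)
    totals[na] += sa
    totals[nb] += sb
print(totals)   # the reciprocating strategies outscore always_defect
```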
Would “yes” be an acceptable answer? It probably is harder to run the simulations, but it’s worth a shot at uncovering some simple cases where different starting conditions converge on the same moral/decision making system.
You may want to check out this post instead; it seems like a much closer response to the ideas in your post.
I’m not proposing the AI; I’m noting that humans seem to use some intuitive notion of complexity to decide what they like. edit: also, has Eliezer ever written a Rubik’s-cube-solving AI? Or anything even remotely comparable? It’s easy to pontificate about how other people think wrong when you aren’t having to solve anything. The way engineers think works for making me a car. The way Eliezer thinks works for making him an atheist. Big difference. (I am an atheist too, so this is not a religious stab, and I like Eliezer’s sequences; it’s just that problem solving is something we are barely capable of at all, and adding any extra crap to shoot down the lines of thought which may in fact work does not help you any.)
edit: also, the solution: you just do hill climbing with n-move look-ahead. As a pre-processing step you may search for sequences that climb the hill out of any configuration. It’s a very general problem-solving method, hill climbing with move look-ahead (a sketch follows below). If you want the AI to invent hill climbing, well, I know of one example, evolution, and that one does increase some kind of complexity along the line leading up to mankind, who invents better hill climbing, even though complexity is not the best solution to ‘reproducing the most’. If the point is making an AI that comes up with the very goal of solving a Rubik’s cube, that gets into AGI land, but using the cube to improve one’s own problem-solving skill is the way it is for us. I like to solve the cube into some pattern. An alien may not care what pattern to solve the cube into, just as long as he pre-commits to something random and it’s reachable.
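A minimal sketch of “hill climbing with n-move look-ahead”, on a toy puzzle instead of a full cube implementation (the puzzle, move set, and scoring function are made up for illustration): at each step, enumerate all move sequences up to the look-ahead depth and commit to the first move of the best-scoring one. The start state [2, 1, 0, 3] is deliberately chosen so that no one- or two-move sequence improves the score; the three-move look-ahead is what gets it unstuck.

```python
import itertools

def lookahead_hillclimb(state, moves, apply_move, score, depth=3, max_steps=50):
    """Hill climbing with n-move look-ahead: at each step, search all move
    sequences up to `depth` long; if any beats the current score, commit to
    the first move of the best such sequence, otherwise stop."""
    for _ in range(max_steps):
        best_seq, best_score = None, score(state)
        for n in range(1, depth + 1):
            for seq in itertools.product(moves, repeat=n):
                s = state
                for m in seq:
                    s = apply_move(s, m)
                if score(s) > best_score:
                    best_seq, best_score = seq, score(s)
        if best_seq is None:          # stuck on a plateau deeper than the look-ahead
            break
        state = apply_move(state, best_seq[0])
    return state

# Toy stand-in for a cube: sort a short list using adjacent swaps.
goal = [0, 1, 2, 3]
def swap(s, i):                       # "move i" swaps positions i and i+1
    s = list(s)
    s[i], s[i + 1] = s[i + 1], s[i]
    return s
def matches(s):                       # the hill to climb: slots already correct
    return sum(a == b for a, b in zip(s, goal))

start = [2, 1, 0, 3]                  # no 1- or 2-swap sequence improves `matches`
print(lookahead_hillclimb(start, range(3), swap, matches, depth=3))   # -> [0, 1, 2, 3]
```

The pre-processing step mentioned in the comment would amount to caching sequences (like the three-swap fix above) that are known to climb out of common plateaus.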