Ah, okay, my bad for just thinking of it as maximizing relative depth.
So what’s really pushed are things that are logically deep in their simplest expression in terms of humanity, but not logically deep in terms of fundamental physics.
Depending on how this actually gets cashed out, the “human” that encodes deep computational results rather than actually living is still a very desirable object.
Here’s a slightly more alive dystopia: Use humanity to embody a complicated Turing machine (like how the remark goes that chimpanzees are Turing complete because they can be trained to operate a Turing machine). Deep computational work appears to be getting done (relative to humanity), but from a fundamental physics point of view it’s nothing special. And it’s probably not much fun for the humans enslaved as cogs in the machine.
First, I’m grateful for this thoughtful engagement and pushback.
Let’s call your second dystopia the Universal Chinese Turing Factory, since it’s sort of a mash-up of the factory variant of Searle’s Chinese Room argument and a universal Turing machine.
I claim that the Universal Chinese Turing Factory, if put to some generic task like solving satisfiability puzzles, will not be favored by a supercontroller with the function I’ve specified.
Why? Because if we look at the representations computed by the Universal Chinese Turing Factory, they may be very logically deep. But their depth will not be especially due to the fact that humanity is mechanically involved in the machine. In terms of the ratio of depth-relative-to-humanity to absolute-depth, there are going to be hardly any gains there.
(You could argue, by the way, that if you employed all of humanity in the task of solving math puzzles, that would be much like the Universal Chinese Turing Factory you describe.)
Let’s consider an alternative world, the Montessori World, where all of humanity is employed creating ever more elaborate artistic representations of the human condition. Arguably, this is the condition dreamed up by those who imagine a post-scarcity economy where everybody gets to be a quasi-professional creative doing improv comedy, interpretative dance, and abstract expressionist basket-weaving. A utopia, I’m sure you’d agree.
The thing is that these representations will be making use of humanity’s condition h as an integral part of the computing apparatus that produces them. So humanity is not just a cog in a universal machine implementing other algorithms. It is the algorithm, which the supercontroller then has an interest in facilitating.
That’s the vision anyway. Now that I’m writing it out, I’m thinking maybe I got my math wrong. Does the function I’ve proposed really capture these intuitions?
For example, maybe a simpler way to get at this would be to look at the Kolmogorov complexity of the universe relative to humanity, K(u/h). That could be a better Montessori world. Then again, maybe Montessori world isn’t as much of a utopia as I’ve been saying it is.
If I sell the idea of these kinds of complexity-relative-to-humanity functions as a way of designing supercontrollers that are not existential threats, I’ll consider this post a success. The design of the particular function is, I think, an interesting ethical problem or choice.
Well, certainly I’ve been coming from the viewpoint that it doesn’t capture these intuitions :P Human values are complicated (I’m assuming you’ve read the relevant posts here (e.g.)), both in terms of their representation and in terms of how they are cashed out from their representation into preferences over world-states. Thus any solution that doesn’t have very good evidence that it will satisfy human values will very likely not do so (small target in a big space).
> In terms of the ratio of depth-relative-to-humanity to absolute-depth, there are going to be hardly any gains there.
I’d say, gains relative to what? When considering an enslaved Turing Factory full of people versus a happy Montessori School full of people, I don’t see why there should be any predictable difference in their logical depth relative to fundamental physics. The Turing Factory only shows up as “simple” on a higher level of abstraction.
If we restrict our search space to “a planet full of people doing something human-ish,” and we cash out “logical depth relative to humanity” as operating on a high level of abstraction where this actually has a simple description, then the process seems dominated by whatever squeezes the most high-level-of-abstraction deep computation out of people without actually having a simple pattern in the lower level of abstraction of fundamental physics.
Idea for generally breaking this: If we cash out “logical depth relative to humanity” in terms of fundamental physics, while allowing ourselves a complete human blueprint, then we can use this to encode patterns in a human-shaped object that are simple and high-depth relative to the blueprint but look like noise relative to physics. If both “logical depth relative to humanity” and “logical depth” are on a high, humanish level of abstraction, one encodes high-depth computational results in slight changes to human cultural artifacts that look like noise relative to a high-level-of-abstraction description that doesn’t have the blueprints for those artifacts. Etc.
Maybe this will be more helpful:
If the universe computes things that are not computational continuations of the human condition (which might include resolution to our moral quandaries, if that is in the cards), then it is, with respect to optimizing function g, wasting the perfectly good computational depth achieved by humanity so far. So, driving computation that is not somehow reflective of where humanity was already going is undesirable. The computational work that is favored is work that makes the most of what humanity was up to anyway.
To the extent that human moral progress in a complex society is a difficult computational problem (and there’s lots of evidence that it is), it is the sort of thing that would be favored by objective g.
If the moral progress of humanity is something that has a stable conclusion (i.e., humanity at some point halts or goes into a harmonious infinite loop that does not increase in depth), then objective g will not help us hit that mark. But in that case, it should be computationally feasible to derive a better objective function.
To those who are unsatisfied with objective g as a solution to Problem 2, I pose the problem: is there a way to modify objective g so that it prioritizes morally better futures? If not, I maintain that objective g is still pretty good.
Re: your first point:
As I see it, there are two separate problems. One is preventing the catastrophic destruction of humanity (Problem 1). The other is creating utopia (Problem 2). Objective functions that are satisficing with respect to Problem 1 may not be solutions to Problem 2. While, as I read it, the Yudkowsky post you linked to argues for prioritizing Problem 2, my sense of the thrust of Bostrom’s argument is that it’s critical to solve Problem 1. Maybe you can tell me if I’ve misunderstood.
Without implicating human values, I’m claiming that the function f(u) = D(u/h_t) / D(u) satisfies Problem 1 (the existential problem). I’m just going to refer to that function as f now.
You seem to have conceded this point. Maybe I’ve misinterpreted you.
As for solving Problem 2, I think we’d agree that any solutions to the utopia problem will also be solutions to the existence problem (Problem 1). The nice thing about f is that its range is (0,1), so it’s easy to compose it with other functions that could weight it more towards a solution to Problem 2.
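As a sketch of what such a composition might look like (the value score v, the weight λ, and the combined objective F below are hypothetical placeholders introduced purely for illustration, not anything proposed in the post):

```latex
% Keep f as the existential safeguard and let a separate, value-laden score v
% (hypothetical, with range [0,1]) pull the optimum toward Problem 2.
F(u) \;=\; \lambda\, f(u) + (1 - \lambda)\, v(u),
\qquad f(u) = \frac{D(u/h_t)}{D(u)}, \quad v(u) \in [0,1], \quad \lambda \in (0,1).
```

Because f is bounded, the weight λ controls how far the existential term can be traded off against whatever v encodes.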
Re: your second point:
I’m not sure if I entirely follow what you’re saying here, so I’m having a hard time understanding exactly the point of disagreement.
Is the point you’re making about the unpredictability of the outcome of optimizing for f? Because the abstract patterns favored by f will look like noise relative to physics?
I think there are a couple of elaborations worth making.
First, like Kolmogorov complexity, logical depth depends on a universal computer specification. I gather that you are assuming that the universal computer in question is something that simulates fundamental physics. This need not be the case. Depth is computed as the least running time of incompressible programs on the universal computer.
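For reference, here is one common way these quantities get formalized (a paraphrase of Bennett-style definitions; the exact handling of the significance parameter s varies between presentations):

```latex
% Conditional Kolmogorov complexity: the length of the shortest program that
% prints x when given y as auxiliary input on the universal machine U.
K(x/y) = \min \{\, |p| : U(p, y) = x \,\}

% Logical depth at significance level s: the least running time of any
% program for x that is within s bits of the shortest one. Relative depth
% D(x/y) is the analogous quantity with y supplied as an input.
D_s(x) = \min \{\, T(p) : U(p) = x,\ |p| \le K(x) + s \,\}
```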
Suppose we were to try to evolve, through a computational process, a program that outputs a string representing the ultimate, flourishing potential of humanity. One way to get that is to run the Earth as a physical process for a period of time and get a description of it at the end, selecting only those timelines in its stochastic unfolding in which life on Earth successfully computes itself indefinitely.
If you stop somewhere along the way, like timestep t, then you are going to get a representation that encodes some of the progress towards that teleological end.
(I think there’s a rough conceptual analogy to continuations in functional programming here, if that helps)
An important property of logical depth is the Slow Growth Law, proved by Bennett. It says that deep objects cannot be produced quickly from shallow ones, with incompressible programs being the shallowest strings of all. It’s not exactly that depth stacks additively, but I’m going to pretend it does for the intuitive argument below (which may be wrong).
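A rough rendering of the Slow Growth Law in this notation (a paraphrase only; the real statement tracks significance levels and additive terms that are suppressed here):

```latex
% If a program q turns y into x in time T(q), the output can be at most
% about T(q) deeper than the input: depth has to be earned in running time.
D(x) \;\lesssim\; D(y) + T(q) + O(1), \qquad \text{where } U(q, y) = x .
```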
If you have the depth of human progress D(h) and the depth of the universe at some future time D(u), then always D(u/h) < D(u) assuming h is deep at all and the computational products of humanity exist at all. But...
ah, I think I’ve messed up the formula. Let’s see… let’s have h’ be a human slice taken after the time of h.
D(u) > D(u/h’) > D(u/h) > D(h) assuming humanity’s computational process continues. The more that h’ encodes the total computational progress of u, i.e., the higher D(u/h’) is relative to D(u)...
Ok, I think I need to modify the formula some. Here’s function g:
g(u) = (D(h) + D(u/h)) / D(u)
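A small arithmetic observation relating g to the earlier f, just from expanding the two definitions (my bookkeeping, not something claimed in the thread):

```latex
g(u) - f(u)
  \;=\; \frac{D(h) + D(u/h)}{D(u)} - \frac{D(u/h)}{D(u)}
  \;=\; \frac{D(h)}{D(u)} .
```

So for a fixed humanity-slice h, g is f plus a term that rewards keeping the universe’s total depth D(u) from outrunning D(h).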
Does maximizing this function produce better results? Or have I missed your point?
General response: I think you should revise the chances of this working way downwards until you have some sort of toy model where you can actually prove, completely, with no “obvious” assumptions necessary, that this will preserve values or at least the existence of an agent in a world. But I think enough has been said about this already.
Specific response:
> Is the point you’re making about the unpredictability of the outcome of optimizing for f? Because the abstract patterns favored by f will look like noise relative to physics?
“Looks like noise” here means incompressibility, and thus logical shallowness. I’ll try again to explain why I think that relative logical depth turns out not to look like human values at all, and you can tell me what you think.
Consider an example.
Imagine, if you will, logical depth relative to a long string of nearly-random digits, called the Ongoing Tricky Procession. This is the computational work needed to output a string from its simplest description, if our agent already knows the Ongoing Tricky Procession.
On the other hand, boring old logical depth is the computational work needed to output a string from its simplest description, period. The logical depth of the Ongoing Tricky Procession is not very big, even though it has a long description length.
Now imagine a contest between two agents, Alice and Bob. Alice knows the Ongoing Tricky Procession, and wants to output a string of high logical depth (to other agents who know the Ongoing Tricky Procession). The caveat is Bob has to think that the string has low logical depth. Is this possible?
The answer is yes. Alice and Bob are spies on opposite sides, and Alice is encrypting her deep message with a One Time Pad. Bob can’t decrypt the message because, as every good spy knows, One Time Pads are super-duper secure, and thus Bob can’t tell that Alice’s message is actually logically deep.
Even if the Ongoing Tricky Procession is not actually that K-complex, Alice can still hide a message in it—she just isn’t allowed to give Bob a simple description that actually decomposes into the OTP and the message.
This is almost the opposite of the Slow Growth Law. Slow Growth is where you have shallow inputs and you want to make a deep output. Alice has this deep message she wants to send to her homeland, but she wants her output to be shallow according to Bob. Fast Decay :P
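A toy sketch of the Alice/Bob situation in code (purely illustrative: the “deep” message is just a stand-in string, and actual logical depth is of course not something you can compute this way):

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Alice's "deep" message: a stand-in for a string that took a long
# computation to produce (here it is just ordinary structured text).
message = b"results of a very long computation" * 4

# The Ongoing Tricky Procession plays the role of the one-time pad:
# a long, nearly random string that Alice knows and Bob does not.
pad = os.urandom(len(message))

# What Bob sees: one cheap XOR away from the message for anyone holding the
# pad, but without the pad it is statistically indistinguishable from noise,
# hence incompressible and shallow as far as Bob can tell.
ciphertext = xor_bytes(message, pad)

# Anyone who knows the pad recovers the structured content immediately.
assert xor_bytes(ciphertext, pad) == message
```

The point being that depth relative to the pad and depth as Bob can measure it come apart completely.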
Re: Generality.
Yes, I agree a toy setup and a proof are needed here. In case it wasn’t clear, my intention with this post was to suss out whether there was related work already out there (looks like there isn’t) and then do some intuition pumping in preparation for a deeper formal effort, in which you are instrumental and for which I am grateful. If you would be interested in working with me on this in a more formal way, I’m very open to collaboration.
Regarding your specific case, I think we may both be confused about the math. I think you are right that there’s something seriously wrong with the formulas I’ve proposed.
If the string y is incompressible and shallow, then whatever x is, D(x) ~ D(x/y), because D(x) (at least in the version I’m using for this argument) is the minimum computational time of producing x from an incompressible program. If there is a minimum running time program P that produces x, then appending y as noise at the end isn’t going to change the running time.
I think this case with incompressible y is like your Ongoing Tricky Procession.
On the other hand, say w is a string with high depth. Which is to say, whether or not it is compressible in space, it is compressible in time: you get it by starting with something incompressible and shallow and letting it run in time. Then there are going to be some strings x such that D(x/w) + D(w) ~ D(x). There will also be a lot of strings x such that D(x/w) ~ D(x), because D(w) is finite and there are tons of deep things the universe can compute that are deeper. So for a given x, D(x) > D(x/w) > D(x) - D(w), roughly speaking.
I’m saying that h, the humanity data, is logically deep, like w, not incompressible and shallow, like y or the Ongoing Tricky Procession.
Hmm, it looks like I messed up the formula yet again.
What I’m trying to figure out is how to select for universes u such that h is responsible for a maximal amount of the total depth. Maybe that’s a matter of minimizing D(u/h). Only that would perhaps lead to globe-flattening shallowness.
What if we tried to maximize D(u) - D(u/h)? That’s like the opposite of what I originally proposed.
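A quick identity relating this to the original ratio (again, just arithmetic on the definitions already on the table):

```latex
D(u) - D(u/h)
  \;=\; D(u)\left(1 - \frac{D(u/h)}{D(u)}\right)
  \;=\; D(u)\,\bigl(1 - f(u)\bigr),
```

so maximizing it favors universes that are deep overall while as little as possible of that depth remains once h is given, which does look roughly like the reverse of maximizing f.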
I’m still confused as to what D(u/h) means. It looks like it should refer to the number of logical steps you need to predict the state of the universe—exactly, or up to a certain precision—given only knowledge of human history up to a certain point. But then any event you can’t predict without further information, such as the AI killing everyone using some astronomical phenomenon we didn’t include in the definition of “human history”, would have infinite or undefined D(u/h).