paulfchristiano comments on Asymptotically Unambitious AGI

paulfchristiano 11 Mar 2019 17:49 UTC
LW: 4 AF: 2
AF
The fast algorithms to predict our physics just aren’t going to be the shortest ones. You can use reasoning to pick which one to favor (after figuring out physics), rather than just writing them down in some arbitrary order and taking the first one.
- michaelcohen 12 Mar 2019 0:26 UTC
  LW: 1 AF: 1
  AF Parent
  You can use reasoning to pick which one to favor (after figuring out physics), rather than just writing them down in some arbitrary order and taking the first one.
  Using “reasoning” to pick which one to favor, is just picking the first one in some new order. (And not really picking the first one, just giving earlier ones preferential treatment). In general, if you have an infinite list of possibilities, and you want to pick the one that maximizes some property, this is not a procedure that halts. I’m agnostic about what order you use (for now) but one can’t escape the necessity to introduce the arbitrary criterion of “valuing” earlier things on the list. One can put 50% probability mass on the first billion instead of the first 1000 if one wants to favor “simplicity” less, but you can’t make that number infinity.
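  As a worked illustration of that last point (the geometric form and the symbols $r$, $w_i$, $N$ are assumptions chosen for this sketch, not something specified above), suppose the $i$-th item on the list gets weight $w_i = (1-r)\,r^{i-1}$ with $0 < r < 1$. Putting half the mass on the first $N$ items fixes $r$:

  $$\sum_{i=1}^{N} w_i = 1 - r^{N} = \tfrac{1}{2} \quad\Longrightarrow\quad r = 2^{-1/N}.$$

  Any finite $N$ (1000, a billion) gives a valid $r < 1$. As $N \to \infty$, $r \to 1$ and every $w_i = (1-r)\,r^{i-1} \to 0$, so the limit is no longer a probability distribution over the list.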
  - paulfchristiano 12 Mar 2019 1:48 UTC
    LW: 2 AF: 1
    AF Parent
    Using “reasoning” to pick which one to favor, is just picking the first one in some new order.
    Yes, some new order, but not an arbitrary one. The resulting order is going to be better than the speed prior order, so we’ll update in favor of the aliens and away from the rest of the speed prior.
    one can’t escape the necessity to introduce the arbitrary criterion of “valuing” earlier things on the list
    Probably some miscommunication here. No one is trying to object to the arbitrariness; we’re just making the point that the aliens have a lot of leverage with which to beat the rest of the speed prior.
    (They may still not be able to if the penalty for computation is sufficiently steep—e.g. if you penalize based on circuit complexity so that the model might as well bake in everything that doesn’t depend on the particular input at hand. I think it’s an interesting open question whether that avoids all problems of this form, which I unsuccessfully tried to get at here.)
    - michaelcohen 12 Mar 2019 6:48 UTC
      LW: 1 AF: 1
      AF Parent
      They may still not be able to if the penalty for computation is sufficiently steep
      It was definitely reassuring to me that someone else had had the thought that prioritizing speed could eliminate optimization daemons (re: minimal circuits), since the speed prior came in here for independent reasons. The only other approach I can think of is trying to do the anthropic update ourselves.
      - paulfchristiano 12 Mar 2019 16:46 UTC
        LW: 3 AF: 2
        AF Parent
        The only other approach I can think of is trying to do the anthropic update ourselves.
        If you haven’t seen Jessica’s post in this area, it’s worth taking a quick look.
    - michaelcohen 12 Mar 2019 6:23 UTC
      LW: 1 AF: 1
      AF Parent
      The only point I was trying to respond to in the grandparent of this comment was your comment
      The fast algorithms to predict our physics just aren’t going to be the shortest ones. You can use reasoning to pick which one to favor (after figuring out physics), rather than just writing them down in some arbitrary order and taking the first one.
      Your concern (I think) is that our speed prior would assign a lower probability to [fast approximation of real world] than the aliens’ speed prior.
      I can’t respond at once to all of the reasons you have for this belief, but the one I was responding to here (which hopefully we can file away before proceeding) was that our speed prior trades off shortness with speed, and aliens could avoid this and only look at speed.
      My point here was just that there’s no way to not trade off shortness with speed, so no one has a comparative advantage on us as a result of the claim “The fast algorithms to predict our physics just aren’t going to be the shortest ones.”
      The “after figuring out physics” part is like saying that they can use a prior which is updated based on evidence. They will observe evidence for what our physics is like, and use that to update their posterior, but that’s exactly what we’re doing too. The prior they start with can’t be designed around our physics. I think that the only place this reasoning gets you is that their posterior will assign a higher probability to [fast approximation of real world] than our prior does, because the world-models have been reasonably reweighted in light of their “figuring out physics”. Of course I don’t object to that; our speed prior’s posterior will be much better than the prior too.
      - paulfchristiano 12 Mar 2019 16:58 UTC
        LW: 2 AF: 1
        AF Parent
        but that’s exactly what we’re doing too
        It seems totally different from what we’re doing; I may be misunderstanding the analogy.
        Suppose I look out at the world and do some science, e.g. discovering the standard model. Then I use my understanding of science to design great prediction algorithms that run fast, but are quite complicated owing to all of the approximations and heuristics baked into them.
        The speed prior gives this model a very low probability because it’s a complicated model. But “do science” gives this model a high probability, because it’s a simple model of physics, and then the approximations follow from a bunch of reasoning on top of that model of physics. We aren’t trading off “shortness” for speed—we are trading off “looks good according to reasoning” for speed. Yes they are both arbitrary orders, but one of them systematically contains better models earlier in the order, since the output of reasoning is better than a blind prioritization of shorter models.
        Of course the speed prior also includes a hypothesis that does “science with the goal of making good predictions,” and indeed Wei Dai and I are saying that this is the part of the speed prior that will dominate the posterior. But now we are back to potentially-malign consequentialism. The cognitive work being done internally to that hypothesis is totally different from the work being done by updating on the speed prior (except insofar as the speed prior literally contains a hypothesis that does that work).
        In other words:
        Suppose physics takes n bits to specify, and a reasonable approximation takes N >> n bits to specify. Then the speed prior, working in the intended way, takes N bits to arrive at the reasonable approximation. But the aliens take n bits to arrive at the standard model, and then once they’ve done that can immediately deduce the N bit approximation. So it sure seems like they’ll beat the speed prior. Are you objecting to this argument?
        (In fact the speed prior only actually takes n + O(1) bits, because it can specify the “do science” strategy, but that doesn’t help here since we are just trying to say that the “do science” strategy dominates the speed prior.)
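        A rough way to write this comparison down, assuming (purely for illustration) that a hypothesis with description length $\ell$ gets weight proportional to $2^{-\ell}$ before any computation penalty, and writing $c$ for the $O(1)$ overhead of the “do science” strategy:

        $$\frac{w_{\text{science}}}{w_{\text{direct}}} \approx \frac{2^{-(n+c)}}{2^{-N}} = 2^{\,N-n-c}.$$

        With $N \gg n$ this ratio is enormous, which is the sense in which the “do science” hypothesis dominates the direct $N$-bit specification. The sketch ignores any difference in the computation penalty the two hypotheses incur.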
        michaelcohen 13 Mar 2019 0:03 UTC
        LW: 1 AF: 1
        AF Parent
        I’m not sure which of these arguments will be more convincing to you.
        Yes they are both arbitrary orders, but one of them systematically contains better models earlier in the order, since the output of reasoning is better than a blind prioritization of shorter models.
        This is what I was trying to contextualize above. This is an unfair comparison. You’re imagining that the “reasoning”-based order gets to see past observations, and the “shortness”-based order does not. A reasoning-based order is just a shortness-based order that has been updated into a posterior after seeing observations (under the view that good reasoning is Bayesian reasoning). Maybe the term “order” is confusing us, because we both know it’s a distribution, not an order, and we were just simplifying to a ranking. A shortness-based order should really just be called a prior, and a reasoning-based order (at least a Bayesian-reasoning-based order) should really just be called a posterior (once it has done some reasoning; before it has done the reasoning, it is just a prior too). So yes, the whole premise of Bayesian reasoning is that updating based on reasoning is a good thing to do.
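The prior-versus-posterior framing in the paragraph above can be sketched with a toy example. Everything here is an assumption made for illustration: the two hypotheses, their description lengths, the $2^{-\text{length}}$ weighting, and the names `shortness_prior` and `posterior` do not come from the thread.

```python
from typing import Dict, List, Tuple

# Hypothetical hypotheses: name -> (description length in bits, P(next bit = 1)).
HYPOTHESES: Dict[str, Tuple[int, float]] = {
    "short_but_wrong": (3, 0.5),
    "long_but_accurate": (10, 0.9),
}


def shortness_prior(hyps: Dict[str, Tuple[int, float]]) -> Dict[str, float]:
    """Weight each hypothesis by 2^-length, then normalize."""
    weights = {name: 2.0 ** -length for name, (length, _) in hyps.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def posterior(
    prior: Dict[str, float],
    hyps: Dict[str, Tuple[int, float]],
    observations: List[int],
) -> Dict[str, float]:
    """Bayesian update of the prior on a sequence of observed bits."""
    weights = dict(prior)
    for bit in observations:
        for name, (_, p_one) in hyps.items():
            weights[name] *= p_one if bit == 1 else 1.0 - p_one
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


prior = shortness_prior(HYPOTHESES)
post = posterior(prior, HYPOTHESES, [1] * 20)  # twenty observed 1s
print(prior)
print(post)
```

Before any data, the shorter hypothesis ranks first; after the observations, the ranking flips. This only illustrates the Bayesian-update reading of “reasoning” and does not settle whether that reading is the right one.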
        Here’s another way to look at it.
        The speed prior is doing the brute force search that scientists try to approximate efficiently. The search is for a fast approximation of the environment. The speed prior considers them all. The scientists use heuristics to find one.
        In fact the speed prior only actually takes n + O(1) bits, because it can specify the “do science” strategy
        Exactly. But this does help for reasons I describe here. The description length of the “do science” strategy (I contend) is less than the description length of the “do science” + “treacherous turn” strategy. (I initially typed that as “tern”, which will now be the image I have of a treacherous turn.)
        paulfchristiano 13 Mar 2019 17:31 UTC
        LW: 2 AF: 1
        AF Parent
        a reasoning-based order (at least a Bayesian-reasoning-based order) should really just be called a posterior
        Reasoning gives you a prior that is better than the speed prior, before you see any data. (*Much* better, limited only by the fact that the speed prior contains strategies which use reasoning.)
        The reasoning in this case is not a Bayesian update. It’s evaluating possible approximations *by reasoning about how well they approximate the underlying physics, itself inferred by a Bayesian update*, not by directly seeing how well they predict on the data so far.
        The description length of the “do science” strategy (I contend) is less than the description length of the “do science” + “treacherous turn” strategy.
        I can reply in that thread.
        I think the only good arguments for this are in the limit where you don’t care about simplicity at all and only care about running time, since then you can rule out all reasoning. The threshold where things start working depends on the underlying physics: for more computationally complex physics you need to pick larger and larger computation penalties to get the desired result.
        michaelcohen 14 Mar 2019 2:14 UTC
        LW: 1 AF: 1
        AF Parent
        Given a world model $ν$, which takes $k$ computation steps per episode, let $ν^{log}$ be the world-model that best approximates $ν$ (in the sense of KL divergence) using only $\log k$ computation steps. $ν^{log}$ is at least as good as the “reasoning-based replacement” of $ν$.
        The description length of $ν^{log}$ is within a (small) constant of the description length of $ν$. That way of describing it is not optimized for speed, but it presents a one-time cost, and anyone arriving at that world-model in this way is paying that cost.
        One could consider instead $ν_{ε}^{log}$, which is, among the world-models that $ε$-approximate $ν$ in less than $\log k$ computation steps (if the set is non-empty), the first such world-model found by a searching procedure $ψ$. The description length of $ν_{ε}^{log}$ is within a (slightly larger) constant of the description length of $ν$, but the one-time computational cost is less than that of $ν^{log}$.
        $ν^{log}$, $ν_{ε}^{log}$, and a host of other approaches are prominently represented in the speed prior.
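A minimal sketch of the $ν_{ε}^{log}$ construction described above, under loud assumptions: world-models are reduced to toy distributions over a few outcomes with a made-up per-episode step count, the searching procedure $ψ$ is just iteration over a given list, the divergence is taken as KL($ν$ ‖ candidate), and the logarithm in the step budget is natural. None of these choices, nor the names `kl_divergence` and `first_fast_approximation`, come from the comment itself.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple

# Toy world-model: (distribution over a small finite outcome set,
# computation steps used per episode). Both parts are invented stand-ins.
Model = Tuple[Sequence[float], int]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in nats; infinite if q excludes an outcome p allows."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


def first_fast_approximation(
    nu: Model, candidates: Iterable[Model], eps: float
) -> Optional[Model]:
    """Return the first candidate (in search order) that eps-approximates nu
    in KL divergence while using fewer than log(k) steps, or None."""
    nu_dist, k = nu
    budget = math.log(k)  # assumption: natural log
    for dist, steps in candidates:
        if steps < budget and kl_divergence(nu_dist, dist) <= eps:
            return (dist, steps)
    return None


# Hypothetical example: nu takes 10**6 steps, so the budget is about 13.8.
nu = ([0.5, 0.3, 0.2], 10**6)
candidates = [
    ([0.5, 0.3, 0.2], 20),     # exact, but over the step budget
    ([0.9, 0.05, 0.05], 3),    # fast, but a poor approximation
    ([0.5, 0.25, 0.25], 5),    # fast and close
]
print(first_fast_approximation(nu, candidates, eps=0.05))
```

The order in which `candidates` is enumerated plays the role of $ψ$ here, so the question of which $ψ$ are good is exactly the question of how that list is produced.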
        If this is what you call “the speed prior doing reasoning,” so be it, but the relevance for that terminology only comes in when you claim that “once you’ve encoded ‘doing reasoning’, you’ve basically already written the code for it to do the treachery that naturally comes along with that.” That sense of “reasoning” really only applies, I think, to the case where our code is simulating aliens or an AGI.
        paulfchristiano 14 Mar 2019 4:21 UTC
        LW: 2 AF: 1
        AF Parent
        (ETA: I think this discussion depended on a detail of your version of the speed prior that I misunderstood.)
        Given a world model $ν$, which takes $k$ computation steps per episode, let $ν^{log}$ be the world-model that best approximates $ν$ (in the sense of KL divergence) using only $\log k$ computation steps. $ν^{log}$ is at least as good as the “reasoning-based replacement” of $ν$.
        The description length of $ν^{log}$ is within a (small) constant of the description length of $ν$. That way of describing it is not optimized for speed, but it presents a one-time cost, and anyone arriving at that world-model in this way is paying that cost.
        To be clear, that description gets ~0 mass under the speed prior, right? A direct specification of the fast model is going to have a much higher prior than a brute force search, at least for values of $β$ large enough (or small enough, however you set it up) to rule out the alien civilization that is (probably) the shortest description without regard for computational limits.
        One could consider instead $ν_{ε}^{log}$, which is, among the world-models that $ε$-approximate $ν$ in less than $\log k$ computation steps (if the set is non-empty), the first such world-model found by a searching procedure $ψ$. The description length of $ν_{ε}^{log}$ is within a (slightly larger) constant of the description length of $ν$, but the one-time computational cost is less than that of $ν^{log}$.
        Within this chunk of the speed prior, the question is: what are good $ψ$? Any reasonable specification of a consequentialist would work (plus a few more bits for it to understand its situation, though most of the work is done by handing it $ν$), or of a petri dish in which a consequentialist would eventually end up with influence. Do you have a concrete alternative in mind, which you think is not dominated by some consequentialist (i.e. a $ψ$ for which every consequentialist is either slower or more complex)?
        michaelcohen 14 Mar 2019 10:18 UTC
        LW: 1 AF: 1
        AF Parent
        Do you have a concrete alternative in mind, which you think is not dominated by some consequentialist (i.e. a $ψ$ for which every consequentialist is either slower or more complex)?
        Well one approach is in the flavor of the induction algorithm I messaged you privately about (I know I didn’t give you a completely specified algorithm). But when I wrote that, I didn’t have a concrete algorithm in mind. Mostly, it just seems to me that the powerful algorithms which have been useful to humanity have short descriptions in themselves. It seems like there are many cases where there is a simple “ideal” approach which consequentialists “discover” or approximately discover. A powerful heuristic search would be one such algorithm, I think.
        (ETA: I think this discussion depended on a detail of your version of the speed prior that I misunderstood.)
        I don’t think anything here changes if K(x) were replaced with S(x) (if that was what you misunderstood).