Consider why one might design an agent that uses concepts like “could” and “should” (hereafter a “Could/Should Agent”, or “CSA”), rather than designing an agent that acts in some other way. Consider what specific concepts of “could” and “should” are useful, and in what specific ways. (This is meant as a more thorough investigation of the issues treated by Eliezer in “Possibility and Couldness”.)
I’m having some trouble with this part, because I could imagine modeling any object—agent or otherwise—as embodying the concepts of “could” and “should” without significant penalty to model complexity. (I thought I would change my mind after reading Possibility and Couldness but I didn’t.)
For example: A planet embodies the idea that it “should” alter its velocity and position as per the laws of motion and gravitation. Because you can (Eliezer_Yudkowsky claims) find the laws of physics inside a single pebble, the planet doesn’t simply follow this regularity, but also embodies the “shouldness”.
And just the same, it embodies “couldness” in terms of where it “could” go, if only other bodies had different positions and masses, etc.
So, I’m confused: since I can reframe anything as embodying couldness and shouldness, what would it even mean for an agent not to be constructed this way? I suspect there’s an additional hidden assumption in here about additional requirements for something to meet “shouldness” and “couldness”.
ETA: I should note here that Vladimir_Nesov already made a largely similar point: despite AnnaSalamon’s assumption of a distinction between could/should agents and “non-agent” jumbles of wires, there doesn’t seem to actually be a salient distinction. Whatever AnnaSalamon believes that distinction is, I’d like to know.
I’m not sure how Anna Salamon would distinguish between a “could/should” agent and non-agent, but here’s my definition. An agent is an algorithm that, given an input, evaluates multiple possible outputs (and for a consequentialist agent specifically, their predicted consequences), then picks the one that best satisfies its preferences to be the actual output. So,
“could” is the set of actions that you consider during the course of your decision computation
“should” is the preferences that you use to select among multiple “could”s
To categorize something as an agent, you need to look at its internal dynamics. A planet is not an agent because it’s not doing any computation that could be interpreted as evaluating multiple possible choices, and it’s certainly not predicting their consequences.
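For concreteness, here is a minimal sketch of that computation in Python. The names and the toy world are mine, purely illustrative, not anyone’s canonical formalism:

    # A minimal "could/should" agent in a toy deterministic world.
    def csa_decide(possible_actions, predict, preference):
        """Pick the action whose predicted consequence the preference ranks highest.

        possible_actions -- the "could": options evaluated during the decision
        predict          -- world-model mapping an action to a predicted consequence
        preference       -- the "should": scores consequences, higher is better
        """
        return max(possible_actions, key=lambda a: preference(predict(a)))

    # Toy usage: an agent at position 3 deciding how to step toward a goal at 10.
    actions = [-1, 0, +1]                    # "could": the options it considers
    predict = lambda step: 3 + step          # predicted next position
    preference = lambda pos: -abs(10 - pos)  # "should": closer to 10 is better
    print(csa_decide(actions, predict, preference))  # -> 1

A planet, by contrast, never runs anything like csa_decide: nothing in its dynamics enumerates the options it didn’t take.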
But people are made out of meat! How can that be an agent? I remain sceptical of the ability of a theory to interpret humans as agents with preference if the same theory is unable to interpret a tree or a rock the same way, perhaps with a ridiculous preference, but a preference nonetheless.
In case it’s not been linked to here: They’re made out of meat!
That was the reference. I’ll also add a link to this video version.
http://lesswrong.com/lw/tx/optimization/
http://lesswrong.com/lw/va/measuring_optimization_power/
Well, power is one thing, preference another. The whole point of FAI is that there is not enough power in humans, while their preference should be preserved.
It may be easier to make out what people’s preference is than what a tree’s preference is, but consider, for example, the situation where a person has just died and is never to be heard from again, where they can’t possibly make an impact on the world anymore (let’s say it’s 3000 BC, to exclude medical miracles). This person is extensionally no different from a tree; the difference lies primarily in the internal structure. These systems have the same power to optimize in the given situations, but they hardly have the same preference.
You could say that there are counterfactuals where the person recovers and goes on optimizing, but there are also counterfactuals that turn the tree into an optimizer. There are, in some sense, far fewer situations in which a tree turns into an optimizer than situations in which a dead person turns into an optimizer, but similarly there are far more situations in which a living person or an AGI operates as an optimizer.
Where do you draw the line? If a theory does draw this line, its position should be rigorously explained, not assumed on an anthropomorphic scale.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
A person who has died is no longer running such a computation, but until his brain decays, the agent-algorithm that he was running before he died can theoretically still be retrieved from his brain.
Your point seems to be that part of FAI theory should be a general and rigorous theory of how to extract preferences from any given object. Only then could we have sufficient theoretical support for any specific procedures for extracting preferences from human beings.
You may be right (I’m not sure) but I think that’s a separate question from “why one might design [a could/should] agent”, which is what started this thread. For that, the informal definition of “agent” that I gave seems to be sufficient, at least to understand the question.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
It’s not so much that I dispute that as that I don’t know of a way to make this judgment precise.
Your point seems to be that part of FAI theory should be a general and rigorous theory of how to extract preferences from any given object. Only then could we have sufficient theoretical support for any specific procedures for extracting preferences from human beings.
Right, although I’m not sure that “objects” are the right scope for such a theory. I suspect that you also need enough subjective specification of preference to initiate the process of interpretation (preference-extraction). This would make the preference of rocks arbitrary, because the process of interpreting them can start in too many arbitrary ways and won’t converge to the same result from different starting points. At the same time, the structure of humans possibly creates a strong attractor, so that you have enough freedom in choosing the initial interpretation to specify something manually, while knowing that the end result depends very little on the initial specification.
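A toy picture of the attractor claim (my illustration, not Vladimir_Nesov’s model): iterate an “interpretation-update” map from several starting points; a contraction stands in for a structure whose extracted preference barely depends on the initial specification, and the identity map stands in for a rock.

    # Hypothetical toy: does the end result depend on the starting interpretation?
    human_like = lambda x: 0.5 * x + 1.0   # contraction: every start ends at 2.0
    rock_like  = lambda x: x               # no structure: you end where you began

    for label, g in [("human-like", human_like), ("rock-like", rock_like)]:
        results = []
        for x0 in (-10.0, 0.0, 7.0):       # three arbitrary starting points
            x = x0
            for _ in range(100):
                x = g(x)
            results.append(round(x, 3))
        print(label, results)   # human-like [2.0, 2.0, 2.0]; rock-like [-10.0, 0.0, 7.0]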
I think that’s a separate question from “why one might design [a could/should] agent”, which is what started this thread. For that, the informal definition of “agent” that I gave seems to be sufficient, at least to understand the question.
On the level of informal understanding, of course. When you classify systems into agents and non-agents informally, you are using your own brain to interpret the system. That is not a strong enough mechanism to extract preference, while a mechanism that can extract preference would presumably be able to see agents in configurations that people can’t interpret as agents. What such a mechanism can see as an agent is a more rigorous definition of what an agent is; hence my remark.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
Many would dispute that, possibly including Luciano Floridi. A tree or even a rock engages in information processing—it exchanges heat, electrons, and such with its surroundings, for starters. And there is almost certainly a decompression you can run on some of the information to fit whatever sort of pattern you’re looking for.
And there is almost certainly a decompression you can run on some of the information to fit whatever sort of pattern you’re looking for.
I’ve explained before why this reasoning is misguided: to get arbitrary desired information processing out of random processes, you have to apply an ever-expanding interpretation. That means any model that calls, e.g., a rock a computer is strictly longer than a model that doesn’t, because the former would have to include all of the latter, plus random data.
So a rock is not a general-purpose computer (though it can be used to compute a result, if the computation you want to perform happens to be isomorphic to whatever e.g. heat transfer is going on right now).
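Roughly, in description-length terms (my gloss of the argument above, not a quote): the “rock as computer” model has to contain the ordinary rock model plus an interpretation that already encodes the computation you wanted, so

    \ell(\text{rock} + \text{interpretation}) \;\ge\; \ell(\text{rock}) + \ell(\text{desired computation}) \;>\; \ell(\text{rock}),

whereas reading the answer off a machine that genuinely computes requires only a short, fixed interpretation.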
Now, with that in mind, I was among those who claimed that rocks are agents as defined by AnnaSalamon et al., so how do I reconcile this with the claim that rocks aren’t computers?
Well, it’s like this: An agent, as defined here, has internal dynamics that could, in principle, be understood as a network of counterfactuals and preferences. A computer, OTOH, does in fact do the work of altering your beliefs about an arbitrary computation. (Generally, that just means concentrating your probability distribution onto the right answer, when before you just figured it was within some range.)
And since Eliezer_Yudkowsky claims that even a pebble embodies the laws of physics, which are nothing but a causal network containing counterfactuals and a species of preference (like energy minimization), that means the term “agent” is carving out a much bigger chunk of conceptspace than I think AnnaSalamon et al. intended. Which is what makes it hard for me to understand what the agent concept is supposed to be distinguished from.
Okay, come on guys, give me a break here; I think this post merits an explanation of where I erred rather than (or at least on top of) a downmod. Sure, I might have said something stupid, but I clearly laid out my reasoning about an important distinction that is being made. Help me out here.
I assumed as much, but my problem with this reasoning starts here:
To categorize something as an agent, you need to look at its internal dynamics. A planet is not an agent because it’s not doing any computation that could be interpreted as evaluating multiple possible choices, and it’s certainly not predicting their consequences.
Normally, I’d agree, but as I said, Eliezer_Yudkowsky claims that a pebble contains the laws of physics, which are nothing but a network of counterfactuals. So there necessarily is an isomorphism between a planet and “multiple possible consequences”.
This is why I say there must be a stronger sense in which you mean that the agent has computations that can be interpreted as evaluating multiple choices/consequences, because all of the universe is doing a sort of efficient version of that. And I don’t yet know what this stronger sense is.
Where the planet actually goes has nothing to do with where it should go. Shouldness is about preference, and you said nothing of preference in your example. If the planet is on a collision course with Earth, I say that it should turn the other way (and it could, if an appropriate system were placed in interaction with it).
And that would be shouldness with respect to you, not the planet. I submit that you’re committing the mind-projection fallacy here.
Eliezer_Yudkowsky’s article “Possibility and Couldness” (or some other article in the series) identifies “shouldness” as the algorithm’s internal recognition of a state it ranks higher in wanting to bring about. So I can in fact map that concept onto the planet, in that it identifies and acts on the preference for moving as per the laws of motion and gravitation.
This doesn’t capture the concept of an error. Preference should also be seen as an abstract mathematical object which the algorithm doesn’t necessarily maximize, but tries to set as high as it can. Of course, if I talk of shouldness, I must refer to a particular preference; in this case I referred to mine. Notice that if I can’t move the planet away, it in fact collides with Earth, but it doesn’t mean that it should collide with Earth according to my preference. Likewise, you can’t assert that according to the planet’s preference, it should collide with Earth merely from the fact that it does: maybe the planet wants to be a whale instead, but can’t.
Preference should also be seen as an abstract mathematical object which the algorithm doesn’t necessarily maximize, but tries to set as high as it can.
Right, it maximizes according to constraints. And?
Notice that if I can’t move the planet away, it in fact collides with Earth, but it doesn’t mean that it should collide with Earth according to my preference
Right, your preference is different from the planet’s. That was your error in your last response.
Likewise, you can’t assert that according to the planet’s preference, it should collide with Earth merely from the fact that it does: maybe the planet wants to be a whale instead, but can’t.
The planet doesn’t want to be a whale; that wouldn’t minimize its Gibbs Free Energy in its local domain of attraction.
Notice that if I can’t move the planet away, it in fact collides with Earth, but it doesn’t mean that it should collide with Earth according to my preference
Right, your preference is different from the planet’s. That was your error in your last response.
My preference is over everything, the planet included. By saying “the planet shouldn’t collide with Earth” I mean that I should make the planet not collide with Earth. I’m not talking about the planet’s preference in this sentence; I’m only talking about my own preference.
Likewise, you can’t assert that according to the planet’s preference, it should collide with Earth merely from the fact that it does: maybe the planet wants to be a whale instead, but can’t.
The planet doesn’t want to be a whale; that wouldn’t minimize its Gibbs Free Energy in its local domain of attraction.
That the planet wants to be a whale is a hypothetical. Accept it when reading what depends on accepting it. If the planet does in fact want to be a whale, it can still be unable to make that happen, and you may still observe it moving along its orbit. You can’t assert that it doesn’t want to be a whale merely from extensionally observing how it actually moves.
You are confusing the variational formulation of the laws of physics with the preference of optimization processes (probably because in both cases you maximize/minimize something). An optimization process actually optimizes stuff over time (at least at the simpler stages, e.g. humans), while the variational form of the laws of physics just says that the true solution (the one that describes what will actually happen) can be represented as the maximum/minimum of a certain function, given the constraints. This is just a convenient form for finding approximate solutions and for understanding the system’s properties.
The same factual outcome can be written as the maximum of many different functions under different constraints. One function for which you can seek an extremum given constraints describes the behavior of the system on the level of physics (for example, using the principle of least action; I forgot my physics, but it doesn’t look like Gibbs free energy applies to the motion of planets). A completely different function for which you can seek an extremum given constraints describes its behavior on the level of preference—that’s utility. Both these accounts give the same solution stating what will actually happen, but the functions are different.
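To make the contrast concrete (an illustrative gloss, not Vladimir_Nesov’s exact formalism), both accounts are extremum problems, but over different objects:

    \text{physics:}\quad \delta S[q] = 0, \qquad S[q] = \int L(q,\dot q,t)\,dt
    \text{preference:}\quad a^* = \arg\max_{a \in \text{“could”}} U(\text{predicted outcome of } a)

The action S is fixed by the dynamics and extremized over trajectories with given endpoints; the utility U is the agent’s own ranking and is maximized over the options it considers. Both single out the same actual behavior, but they are different functions of different arguments.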
“Shouldness” refers to a particular, very specific way of presenting the system’s behavior, and it’s not free energy. Notice that you can describe an AI’s or a man’s behavior with physical variational principles as well, but that will have nothing to do with their preference.
“Shouldness” refers to a particular, very specific way of presenting the system’s behavior, and it’s not free energy. Notice that you can describe an AI’s or a man’s behavior with physical variational principles as well, but that will have nothing to do with their preference.
It seems to me that what SilasBarta is asking for here is a definition of shouldness such that the above statement holds. Why is it invalid to think that the system “wants” its physics? All you are indicating is that such is not what’s intended (which I’m sure SilasBarta knows)...
As far as variational principles go, one difference is that a physical system displays no preference among the different local extrema. (IIRC you can even come up with models where the same system will minimize (an) action for some initial conditions and maximize it for others.) This makes a Lagrangian-style physical system a pretty poor CSA even if you go out of your way to model it as one.
CSAs can’t escape local optima either … unless you found your global optimum without telling us ;-)
Nothing singles out a particular variational formulation of physical laws as preference, among all the other equivalent formulations. Stating that the planet wants to minimize its action or whatever is as arbitrary as saying that it wants to be a whale. Silas Barta was asserting that “free energy” is the answer, which seems to be wrong on multiple counts.
Stating that the planet wants to minimize its action or whatever is as arbitrary as saying that it wants to be a whale. Silas Barta was asserting that “free energy” is the answer
No, I wasn’t, but I couldn’t even follow what your point was, once you started equating your own “shouldness” with the planet’s shouldness, as if that implied some kind of contradiction if they’re different. So, I didn’t follow up.
The point was, if indeed we are all fully deterministic, and planets are fully deterministic, and planets embody the laws of physics, the concept of “shouldness” must be equally applicable in both cases. (More generally, I can’t distinguish “agent” type algorithms from “non-agent” type algorithms, so I don’t know what the alternative is.)
You “could jump off that cliff, if you wanted to.” But as Eliezer_Yudkowsky notes in the link above, this statement is completely consistent with “It is physically impossible that you will jump off that cliff.” Because the “causal forces within physics that are you” cannot reach that state.
And there’s the kicker: that situation is no different from that of a planet: whatever it “wishes”, it’s physically impossible for it to do anything but follow the path dictated by physics.
My point about free energy was just a) to do a simple “reality check” (not the only check you can do) that would justify saying “the planet doesn’t want to be a whale”, and b) to note that every system will minimize its free energy within a local domain of attraction. Just as water will flow downhill spontaneously, but won’t jump out of a basin merely because doing so could get it even further downhill.
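A toy version of the basin point (hypothetical numbers, just to show the “local domain of attraction” behaviour):

    # Greedy "downhill" descent on a double-well landscape: the state settles in
    # whichever basin it starts in, even though the other basin is lower.
    def descend(x, f, step=0.01, iters=1000):
        for _ in range(iters):
            x = min([x - step, x, x + step], key=f)  # stops moving at a local minimum
        return x

    f = lambda x: (x**2 - 1)**2 + 0.3 * x  # two wells; the left one is deeper
    print(round(descend(+0.9, f), 2))      # stays in the shallow right-hand well, ~ +0.96
    print(round(descend(-0.9, f), 2))      # settles in the deeper left-hand well, ~ -1.04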
Now, in the sense that people can “want the impossible”, then yes, I have no evidence that a planet doesn’t want to be a whale. What I perhaps should have said is: a planet has not identified being a whale as the goal or subgoal it is in pursuit of. Even taking this reasoning to the extreme, the very first steps toward becoming a whale would immediately hit the hard limits of free energy minimization, and so the planet could never even begin such a path—not viewed as a single entity.
Now, in the sense that people can “want the impossible”, then yes, I have no evidence that a planet doesn’t want to be a whale.
Yup, that’s the case. This concept is meaningful because sometimes unexpected opportunities appear and the predictably impossible turns into an option. Or, more constructively, this concept is required to implement external “help” that is known in advance to be welcome.