I’m not sure how Anna Salamon would distinguish between a “could/should” agent and a non-agent, but here’s my definition. An agent is an algorithm that, given an input, evaluates multiple possible outputs (and for a consequentialist agent specifically, their predicted consequences), then picks the one that best satisfies its preferences to be the actual output. So,
“could” is the set of actions that you consider during the course of your decision computation
“should” is the set of preferences that you use to select among the “could”s
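To make this concrete, here is a minimal sketch of the kind of computation the definition picks out; the function names and the toy thermostat at the end are invented purely for illustration.

```python
# Sketch of a "could/should" agent: enumerate candidate outputs ("could"s),
# predict each one's consequence, and return the output whose predicted
# consequence the preferences rank highest ("should").

def could_should_agent(observation, possible_outputs, predict, preference_score):
    candidates = possible_outputs(observation)  # the "could"s
    return max(candidates,
               key=lambda action: preference_score(predict(observation, action)))

# Toy usage: a thermostat-like chooser that prefers temperatures near 20.
actions = lambda obs: ["heat_on", "heat_off"]
predict = lambda obs, action: obs + 1 if action == "heat_on" else obs - 1
score = lambda temp: -abs(temp - 20)
print(could_should_agent(18, actions, predict, score))  # -> "heat_on"
```

A non-consequentialist variant would score the candidate outputs directly rather than their predicted consequences.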
To categorize something as an agent, you need to look at its internal dynamics. A planet is not an agent because it’s not doing any computation that could be interpreted as evaluating multiple possible choices, and it’s certainly not predicting their consequences.
But people are made out of meat! How can that be an agent? I remain sceptical of the ability of a theory to interpret humans as agents with preference if the same theory is unable to interpret a tree or a rock the same way, perhaps with ridiculous preference, but preference nonetheless.
In case it’s not been linked to here: They’re made out of meat!
That was the reference. I’ll also add a link to this video version.
http://lesswrong.com/lw/tx/optimization/
http://lesswrong.com/lw/va/measuring_optimization_power/
Well, power is one thing, and preference is another. The whole point of FAI is that there is not enough power in humans, while preference should be preserved.
It may be easier to make out what people’s preference is than what a tree’s preference is, but consider, for example, the situation where a person has just died and is never to be heard from again, where they can’t possibly make an impact on the world anymore (let’s say it’s 3000 BC, to exclude medical miracles). This person is extensionally no different from a tree; the difference lies primarily in the internal structure. These systems have the same power to optimize in the given situations, but they hardly have the same preference.
You could say that there are counterfactuals where the person recovers and goes on optimizing, but there are also counterfactuals that turn the tree into an optimizer. In some sense there are far fewer situations in which a tree turns into an optimizer than situations in which a dead person turns into an optimizer, but by the same token there are far more situations in which a living person or an AGI operates as an optimizer.
Where do you draw the line? If a theory does draw this line, its position should be rigorously explained, not assumed on an anthropomorphic scale.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
A person who has died is no longer running such a computation, but until his brain decays, the agent-algorithm that he was running before he died can theoretically still be retrieved from his brain.
Your point seems to be that part of FAI theory should be a general and rigorous theory of how to extract preferences from any given object. Only then could we have sufficient theoretical support for any specific procedures for extracting preferences from human beings.
You may be right (I’m not sure) but I think that’s a separate question from “why one might design [a could/should] agent”, which is what started this thread. For that, the informal definition of “agent” that I gave seems to be sufficient, at least to understand the question.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
I don’t so much dispute that as lack a way to make this judgment precise.
Your point seems to be that part of FAI theory should be a general and rigorous theory of how to extract preferences from any given object. Only then could we have sufficient theoretical support for any specific procedures for extracting preferences from human beings.
Right, although I’m not sure that “objects” are the right scope for such a theory. I suspect that you also need enough of a subjective specification of preference to initiate the process of interpretation (preference-extraction). This makes the preference of rocks arbitrary, because the process of interpreting them can start in too many arbitrary ways and won’t converge to the same result from different starting points. At the same time, the structure of humans possibly creates a strong attractor, so that you have enough freedom in choosing the initial interpretation to specify something manually, while knowing that the end result depends very little on the initial specification.
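As a toy picture of the attractor claim (an analogy only, not an actual preference-extraction procedure; the maps and numbers below are made up): model an interpretation as a point, and “refinement against the system’s structure” as repeatedly applying a map. A structured system behaves like a contraction with a single fixed point, while a structureless one leaves every starting point where it began.

```python
import random

def refine_structured(interp):
    # A strong attractor: every starting interpretation is pulled toward
    # the same fixed point (0.6, -0.2).
    target = (0.6, -0.2)
    return tuple(0.5 * x + 0.5 * t for x, t in zip(interp, target))

def refine_structureless(interp):
    # No attractor: the "extracted preference" is just whatever arbitrary
    # interpretation you started with.
    return interp

def extract(refine, start, steps=60):
    interp = start
    for _ in range(steps):
        interp = refine(interp)
    return tuple(round(x, 3) for x in interp)

starts = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(3)]
print([extract(refine_structured, s) for s in starts])     # three identical results
print([extract(refine_structureless, s) for s in starts])  # three arbitrary results
```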
I think that’s a separate question from “why one might design [a could/should] agent”, which is what started this thread. For that, the informal definition of “agent” that I gave seems to be sufficient, at least to understand the question.
On the level of informal understanding, of course. When you classify systems into agents and non-agents informally, you are using your own brain to interpret the system. That is not a strong enough mechanism to extract preference, while a mechanism that can extract preference would presumably be able to see agents in configurations that people can’t interpret as agents; what such a mechanism can see as an agent is a more rigorous definition of what an agent is, hence my remark.
The “meat” is clearly implementing a computation of the type I described, whereas a tree or rock isn’t. Do you dispute that?
Many would dispute that, possibly including Luciano Floridi. A tree or even a rock engages in information processing—it exchanges heat, electrons, and such with its surroundings, for starters. And there is almost certainly a decompression you can run on some of the information to fit whatever sort of pattern you’re looking for.
And there is almost certainly a decompression you can run on some of the information to fit whatever sort of pattern you’re looking for.
I’ve explained before why this reasoning is misguided: to get arbitrary desired information processing out of random processes, you have to apply an ever-expanding interpretation, meaning that any model that calls e.g. a rock a computer is strictly longer than a model that doesn’t, because the former would have to include all of the latter, plus random data.
So a rock is not a general use computer (though it can be used to compute a result, if the computation you want to perform happens to be isomorphic to whatever e.g. heat transfer is going on right now).
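A toy rendering of that point (my own example, not a formal argument): to “compute with a rock” you need an interpretation map from rock states to answers, and that map must already contain the answer, plus the irrelevant rock data; the direct model just computes it.

```python
def compute_directly(a, b):
    # The model that never mentions the rock.
    return a * b

def compute_with_rock(a, b):
    rock_state = 0.73  # stand-in for whatever heat transfer the rock is doing
    # The "interpretation" has to map this rock state, for this particular
    # question, onto the right answer -- so it already encodes a * b, and it
    # additionally carries the rock data around.
    interpretation = {(rock_state, a, b): a * b}
    return interpretation[(rock_state, a, b)]

assert compute_directly(17, 23) == compute_with_rock(17, 23) == 391
```

The rock version contains everything the direct version does, plus the random rock data, which is the sense in which it is strictly longer.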
Now, with that in mind, I was among those who claimed that rocks are agents as defined by AnnaSalamon et al., so how do I reconcile this with the claim that rocks aren’t computers?
Well, it’s like this: An agent, as defined here, has internal dynamics that could, in principle, be understood as a network of counterfactuals and preferences. A computer, OTOH, does in fact do the work of altering your beliefs about an arbitrary computation. (Generally, that just means concentrating your probability distribution onto the right answer, when before you just figured it was within some range.)
And since Eliezer_Yudkowsky claims that even a pebble embodies the laws of physics, which are nothing but a causal network containing counterfactuals and a species of preference (like energy minimization), that means the term “agent” is carving out a much larger chunk of conceptspace than I think AnnaSalamon et al. intended. Which is what makes it hard for me to understand what the agent concept is supposed to be distinguished from.
Okay, come on guys, give me a break here; I think this post merits an explanation of where I erred rather than (or at least on top of) a downmod. Sure, I might have said something stupid, but I clearly laid out my reasoning about an important distinction that is being made. Help me out here.
I assumed as much, but my problem with this reasoning starts here:
To categorize something as an agent, you need to look at its internal dynamics. A planet is not an agent because it’s not doing any computation that could be interpreted as evaluating multiple possible choices, and it’s certainly not predicting their consequences.
Normally, I’d agree, but as I said, Eliezer_Yudkowsky claims that a pebble contains the laws of physics, which are nothing but a network of counterfactuals. So there necessarily is an isomorphism between a planet and “multiple possible consequences”.
This is why I say there must be a stronger sense in which you mean that the agent has computations that can be interpreted as evaluating multiple choices/consequences, because all of the universe is doing a sort of efficient version of that. And I don’t yet know what this stronger sense is.