The point of utilons is that they scale linearly, unlike, say, dollars. There may be a maximum amount of utility that can be obtained, but utilons never scale non-linearly. The task where you can name any number below 100, but not 100 itself, avoids these issues in any case.
I don’t understand your objection to the Unlimited Swap scenario, but isn’t it plausible that a perfectly rational agent might not exist?
That task still has the issue that the agent incurs some unstated cost (probably time) to keep mashing the 9 key (or whatever input method it uses). At some point the gains are nominal, and the agent would be better served collecting utility in the way it usually does. The same goes for the Unlimited Swap scenario: the agent could better spend its time by instantly taking the 1 utilon and going about its business as normal, thus avoiding a stalemate (a condition where nobody gets any utility) with 100% certainty.
Is it plausible that a perfectly rational agent might not exist? Certainly. But I hardly think these thought exercises prove that one is not possible. Rather, they suggest that when working with limited information we need a sane stopping function to avoid stalemate. Some conditions have to be “good enough”… I suppose I object to the concept of “infinite patience”.
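To make the “sane stopping function” idea concrete, here is a rough sketch in Python. The keystroke cost is entirely made up (the original scenario stipulates no cost), so treat this only as an illustration of the kind of rule I mean:

```python
# A rough sketch of the kind of stopping function I mean. The keystroke
# cost below is an assumption of mine; the original scenario has no cost.
# Naming "99." followed by k nines is worth 100 - 10**-k utilons.

KEY_COST = 1e-6  # assumed opportunity cost of one keystroke, in utilons

def net_utility(k: int) -> float:
    """Utility of naming 99.99...9 (k nines after the point), minus typing cost."""
    gross = 100 - 10 ** (-k)
    return gross - KEY_COST * (k + 3)   # "9", "9", "." plus k more nines

# The marginal gain of the (k+1)-th decimal nine is 9 * 10**-(k+1); a sane
# stopping function quits as soon as that falls below the keystroke cost.
k = 0
while 9 * 10 ** (-(k + 1)) >= KEY_COST:
    k += 1
print(k, net_utility(k))   # with these made-up numbers, six nines is "good enough"
```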
Everything exists in context
True, everything does exist in context. And the context being considered here is not the real world, but behaviour in a purely theoretically constructed world. I have made no claims that it corresponds to the real world as of yet, so claiming that it doesn’t correspond to the real world is not a valid criticism.
My criticism is that you have either set up a set of scenarios with insufficient context to answer the question of how to obtain maximum utility, or deliberately constructed these scenarios such that attempting to obtain maximum utility leads to the Actor spending an infinite amount of time while never completing the task and actually collecting. You stated that until the specification of the number (or the back-and-forth game) was complete, no utility was gained. I responded that the solution is to not play the game, but for the actor to grab as much utility as it could get within a certain finite time limit according to its stopping function and go about its business.
I have made no claims that it corresponds to the real world as of yet...
If it does not, then what is the point? How does such an exercise help us to be “less wrong”? The point of constructing beliefs about Rational Actors is to be able to predict how they would behave so we can emulate that behavior. By choosing to explore a subject in this context, you are implicitly making the claim that you believe it does correspond to the real world in some way. Furthermore, your choice to qualify your statement with “as of yet” reinforces that implication. So I ask you to state your claim so we may examine it in full context.
“Insufficient context”—the context is perfectly well defined. How tired do I get considering large numbers? You don’t get tired at all! What is the opportunity cost of considering large numbers? There is no opportunity cost at all. And so on. It’s all very well defined.
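To put that in symbols, here is a minimal formalisation of the unbounded version (my own notation, restricted to whole numbers purely for simplicity; nothing here changes the scenario):

```latex
% Minimal formalisation (my notation): the agent names any finite n,
% pays no cost of any kind, and receives n utilons.
\[
  U(n) = n, \qquad c(n) = 0, \qquad n \in \mathbb{N},
\]
% so U(n+1) > U(n) for every n: each admissible choice is strictly
% dominated by another, and the supremum of U is never attained.
```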
“Responded that the solution is to not play the game, but for the actor to grab as much utility as it could get within a certain finite time limit according to its stopping function and go about its business.”—except that’s not a single solution, but multiple solutions, depending on which number you stop at.
“If it does not, then what is the point?”—This is only part 1. I plan to write more on this subject eventually. As an analogy, a reader of a book series can’t go to an author and demand that they release volume 2 right now so that they can understand part 1 in its full context. My objective here is only to convince people of this abstract theoretical point, because I suspect that I’ll need it later (but I don’t know for certain).
You don’t get tired at all… there is no cost at all...
So you have deliberately constructed a scenario, then defined “winning” as something forbidden by the scenario. Unhelpful.
You have specified multiple games. I have defined a finite set of solutions for each Actor that can all be stated as “use the stopping function”. If your Actor has no such function, it is not rational, because it can get stuck on problems with the potential to become unbounded. Remember, the Traveling Salesman must eventually sell something or all that route planning is meaningless. This sort of thing is exactly what a stopping function is for, but you seem to have written such functions out of the hypothetical universe for some (as yet unspecified) reason.
A reader can’t go to the author and demand volume 2...
Incorrect. People do it all the time, and it is now easier than ever. Moreover, I object to the comparison of your essay with a book. This context is more like a conversation than a publication. Please get to the point.
You have done nothing but remove criteria for stopping functions from unbounded scenarios. I don’t believe that is convincing anybody of anything. I suspect the statement “not every conceivable game in every conceivable universe allows for a stopping function that does not permit somebody else to do better” would be given a non-negligible probability by most of us already. That statement seems to be what you have been arguing, and seems to coincide with your title.
Friendly Style Note: I (just now) noticed that you have made some major changes to the article. It might be helpful to isolate those changes structurally to make them more visually obvious. Remember, we may not be rereading the full text very often, so a timestamp might be nice too. :)
You’ll be pleased to know that I found a style of indicating edits that I’m happy with. I realised that if I make the word “edited” subscript then it is much less obnoxious, so I’ll be using this technique on future posts.
That sounds like it will be much easier to read. Thank you for following up!
There is no need to re-read the changes to the article. The changes just incorporate things that I’ve also written in the comments, to reduce the chance of new commenters coming into the thread with misunderstandings I’ve already clarified.
“So you have deliberately constructed a scenario, then defined “winning” as something forbidden by the scenario. Unhelpful.”—As long as the scenario does not explicitly punish rationality, it is perfectly valid to expect a perfectly rational agent to outperform any other agent.
“Remember, the Traveling Salesman must eventually sell something or all that route planning is meaningless”—I completely agree with this: not stopping is irrational, as you gain 0 utility. My point was that you can’t just say, “A perfectly rational agent will choose an action in this set”. You have to specify which action (or actions) an agent could choose whilst being perfectly rational.
“You have done nothing but remove criteria for stopping functions from unbounded scenarios”—And that’s a valid situation to hand off to any so-called “perfectly rational agent”. If it gets beaten, then it isn’t deserving of that name.
There is no need to re-read the changes to the article...
I have been operating under my memory of the original premise. I re-read the article to refresh that memory and found the changes. I would simply have been happier if there had been an ETA section or something. No big deal, really.
As long as the scenario does not explicitly punish rationality, it is perfectly valid to expect a perfectly rational agent to outperform any other agent.
Not so: you have generated infinitely many options, such that there is no selection that can fulfill that expectation. Any agent that tries to do so cannot be perfectly rational, since the goal as defined is impossible.
Exactly: if you accept the definition of a perfectly rational agent as a perfect utility maximiser, then there is no such maximiser, because there is always another agent that obtains more utility, and so there is no perfectly rational agent. I don’t think that this is a particularly unusual way of using the term “perfectly rational agent”.
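A rough illustration of that point (my own sketch, not a formal proof): in both games, every admissible choice is strictly beaten by another admissible choice, so “the agent that obtains maximum utility” picks out nothing.

```python
# Illustration only: every candidate choice is strictly dominated by
# another admissible choice, so no "best" choice exists in either game.

def better_unbounded(n: float) -> float:
    """Unbounded game: naming n + 1 always earns strictly more than naming n."""
    return n + 1

def better_below_100(x: float) -> float:
    """'Name any number below 100' game: the midpoint of x and 100
    is still below 100 and strictly larger than x."""
    return (x + 100) / 2

x = 99.0
for _ in range(5):
    x = better_below_100(x)
    print(x)   # 99.5, 99.75, 99.875, ... approaches but never reaches 100
```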
In this context, I do not accept that definition: you cannot maximize an unbounded function. A Perfectly Rational Agent would know that.
And it would still get beaten by a more rational agent, which would in turn be beaten by a still more rational agent, and so on to infinity. There’s a never-ending sequence of increasingly rational agents, but no final “most rational” agent.
If the PRA isn’t trying to “maximize” an unbounded function, it can’t very well be “beaten” by another agent that chooses x+n, because the two did not have the same goal. I therefore reject the idea that an agent which obeys its stopping function in an unbounded scenario can, on that basis alone, be called any more or less “rational” than any other agent that does the same, regardless of the utility it leaves uncollected.
By removing all constraints, you have made comparing results meaningless.
So an agent that chooses only 1 utility could still be a perfectly rational agent in your books?
Might be. Maybe that agent’s utility function is actually bounded at 1 (it’s not trying to maximize, after all). Perhaps it wants 100 utility, but already has firm plans to get the other 99. Maybe it drew a stopping value at random from a heavy-tailed distribution over the positive reals (one whose expected value diverges) and pre-committed to the result, thus guaranteeing a stopping condition with unbounded expected return. Since it was missing out on unbounded utility in any case, getting literally any is better than none, but the difference between one finite amount and another is not really interesting.
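If it helps, here is one way that random pre-commitment could look. The specific distribution is my own choice for illustration, not anything stipulated by the scenario:

```python
# Sketch of the pre-commitment idea (my construction, with an assumed
# distribution): draw the stopping number from a heavy-tailed distribution
# whose mean diverges. Every draw is finite, so the agent always stops,
# yet the expected return of the rule as a whole is unbounded.

import random

def precommitted_stop() -> float:
    """Pareto(shape=1) draw: P(X > x) = 1/x for x >= 1, so E[X] is infinite."""
    return 1.0 / (1.0 - random.random())   # random() is uniform on [0, 1)

print([round(precommitted_stop(), 2) for _ in range(10)])
# Always finite numbers, but the long-run average does not converge.
```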
(humorously) Maybe it just has better things to do than measuring its *ahem* stopping function against the other agents.