imply that injecting randomness can improve an algorithm
Hm. This would actually appear to be a distinct class of cases in which randomness is “useful”, but it’s important to note that a simple deterministic algorithm would do better if we were allowed to remember our own past actions—i.e. this is a very special case. I should probably think about it further, but offhand, it looks like the reason for randomized-optimality is that the optimal deterministic algorithm has been prohibited in a way that makes all remaining deterministic algorithms stupid.
More generally, game theory often produces situations where randomization is clearly the rational strategy when you do not know the other player’s action with certainty. The example given in this post is straightforwardly convertible into a game involving Nature, as many games are: Nature plays first and puts you into one of two states; if you are in X and play Continue, Nature plays again and the game loops. The solution to that game is the same as that derived above, I believe.
The point is, for a broad class of games/decision problems where Nature has a substantial role, we might run into similar issues that can be resolved a similar way.
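As a rough check on the numbers quoted in this thread (p = 2/3 and an expected payoff of 4/3), here is a minimal sketch. The payoffs are an assumption on my part: they are the ones usually given for the absent-minded driver (0 for exiting at the first intersection, 4 for exiting at the second, 1 for continuing past both) and are not stated in this thread. The names `expected_payoff`, `grid`, `best_p` and `best_value` are illustrative only.

```python
from fractions import Fraction

# ASSUMED payoffs (the usual absent-minded driver numbers; not stated in this thread).
EXIT_AT_X = 0       # exit at the first intersection
EXIT_AT_Y = 4       # continue once, then exit at the second intersection
CONTINUE_BOTH = 1   # continue at both intersections


def expected_payoff(p):
    """Expected payoff if the driver continues with probability p at every intersection."""
    return (1 - p) * EXIT_AT_X + p * (1 - p) * EXIT_AT_Y + p * p * CONTINUE_BOTH


# Search a grid of exact fractions for the best continue-probability.
grid = [Fraction(k, 300) for k in range(301)]
best_p = max(grid, key=expected_payoff)
best_value = expected_payoff(best_p)

print(best_p, best_value)
print(expected_payoff(Fraction(0)), expected_payoff(Fraction(1)))  # the two pure strategies
```

Under those assumed payoffs, both pure strategies (always exit, always continue) come out below the mixed strategy, which is the sense in which the remaining deterministic options are the poor ones.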
I think the motivation for this problem is that memory in general is a limited resource, and a decision theory should be able to handle cases where recall is imperfect. I don’t believe that there was a deliberate attempt to prohibit the optimal deterministic algorithm in order to make randomized algorithms look good.
I don’t think that resolves the issue. As I demonstrated in this comment, if you have some probabilistic knowledge of which intersection you’re at, you can do better than the p=2/3 method. Specifically, as long as you have 0.0012 bits of information about which intersection you’re at (i.e. assign a greater than 52% chance of guessing correctly), you’re better off choosing based on what seems most likely.
However—and this is the kicker—that means that if you have between 0 and 0.0012 bits of information about your intersection, you’re best off throwing that information away entirely and going with the method that’s optimal for when you’re fully forgetful. So it’s still a case where throwing away information helps you.
ETA: False alarm; Wei_Dai corrects me here—you can still use your knowledge to do better than 4⁄3 when your probability of guessing right is between 50% and 52.05%.
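For anyone wanting to check how "0.0012 bits" lines up with the roughly 52% figure, here is a small sketch. It assumes that "bits of information" means the reduction in binary entropy from a 50/50 prior, that is, 1 - H(q) for a guess that is right with probability q. That reading is my assumption, and the function names are illustrative. It only checks the conversion between bits and probability, not the threshold claim that the ETA retracts.

```python
import math


def binary_entropy(q):
    """Entropy in bits of a yes/no variable that is 'yes' with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * math.log2(q) - (1 - q) * math.log2(1 - q)


def information_bits(q):
    """Assumed measure: entropy reduction relative to a 50/50 prior over the two intersections."""
    return 1.0 - binary_entropy(q)


for q in (0.50, 0.52, 0.5205):
    print(q, round(information_bits(q), 5))
```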
It’s probably better not to think of this as a randomized algorithm. Here is a simpler example of what I mean.
Suppose you have two urns in front of you. One urn is full of N white marbles, and the other urn is empty. Your task is to take a marble out of the first urn, paint it either red or blue, place it in the second urn, and then repeat this process until the first urn is empty. Moreover, when you are done, something very close to two-thirds of the marbles in the second urn must be red.
The catch, of course, is that you have very poor short-term memory, so you can never remember how many marbles you’ve painted or what colors you’ve painted them.
The “randomized algorithm” solution would be to use a pseudo-random number generator to produce a number x between 0 and 1 for each marble, and to paint that marble red if and only if x < 2⁄3.
But there is a non-random way to think of that procedure. Suppose instead that, before you start painting your N marbles, you set out a box of N poker chips, of which you know (that is, have reason to be highly confident) that very nearly two-thirds are red. You then proceed to paint marbles according to the following algorithm. After taking a marble in hand, you select a poker chip non-randomly from the box, and then paint the marble the same color as that poker chip.
This is a non-random algorithm that you can use with confidence, but which requires no memory. And, as I see it, the method with the pseudo-random number generator amounts to the same thing. By deciding to use the generator, you are determining N numbers: the next N numbers that the generator will produce. Moreover, if you know how the generator is constructed, you know (that is, have reason to be highly confident) that very nearly two-thirds of those numbers will be less than 2⁄3. To my mind, this is functionally identical to the poker chip procedure.
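The comparison can be sketched in a few lines. This is only an illustration under assumptions: Python's standard `random.Random` stands in for "the generator", and `paint_marbles` and `red_fraction` are hypothetical helper names. The painter itself keeps no count of past marbles; the only thing that changes between marbles is the generator's internal state.

```python
import random


def paint_marbles(n, p_red=2 / 3, seed=None):
    """Paint n marbles one at a time; each is red iff the next generated number is below p_red."""
    rng = random.Random(seed)  # fixing the seed fixes the next n numbers in advance
    return ["red" if rng.random() < p_red else "blue" for _ in range(n)]


def red_fraction(colors):
    """Fraction of marbles in the second urn that ended up red."""
    return colors.count("red") / len(colors)


first_run = paint_marbles(100_000, seed=0)
second_run = paint_marbles(100_000, seed=0)

print(red_fraction(first_run))   # very nearly two-thirds
print(first_run == second_run)   # same seed, same sequence of colors
```

Running it twice with the same seed gives the same colors in the same order, which is the sense in which choosing the generator determines the N numbers ahead of time, much like setting out the box of chips.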
This is a non-random algorithm that you can use with confidence, but which requires no memory.
You mean it does not require memory in your brain, because you implemented your memory with the poker chips. It is quite convenient they were available.
My point is that it’s no more convenient than having the pseudo-random number generator available. I maintain that the generator is implementing your memory in functionally the same sense. For example, you are effectively guaranteed not to get the same number twice, just as you are effectively guaranteed not to get the same poker chip twice.
ETA: After all, something in the generator must be keeping track of the passage of the marbles for you. Otherwise the generator would keep producing the same number over and over.
Rather than using a PRNG (which, as you say, requires memory), you could use a source of actual randomness (e.g. quantum decay). Then you don’t really have extra memory with the randomized algorithm, do you?
I thought of this as well, but it does not really matter, because it is the ability to produce a different output at each event that gives part of the functionality of memory, that is, the ability to distinguish between events. Granted, this is not as effective as deterministically understood memory, where you know in advance which output you get at each event. Essentially, it is memory with the drawback that you don’t understand how it works to the extent that you are uncertain how it correlates with what you wanted to remember.
After all, something in the generator must be keeping track of the passage of the marbles for you. Otherwise the generator would keep producing the same number over and over.
Randomness ‘degenerates’ (perhaps by action of a malicious daemon) into non-randomness, and so it can do better and no worse than a non-random approach?
(If the environments and agents are identical down to the source of randomness, then the agent defaults to a pure strategy; but with ‘genuine’ randomness, random sources that are different between instances of the agent, the agent can actually implement the better mixed strategy?)
I’m having trouble parsing your questions. You have some sentences that end in question marks. Are you asking whether I agree with those sentences? I’m having trouble understanding the assertions made by those sentences, so I can’t tell whether I agree with them (if that was what you were asking).
The claim that I was making could be summed up as follows. I described an agent using a PRNG to solve a problem involving painting marbles. The usual way to view such a solution is as
deterministic amnesiac agent PLUS randomness.
My suggestion was instead to view the solution as
deterministic amnesiac agent
PLUS
a particular kind of especially limited memory
PLUS
an algorithm that takes the contents of that memory as input and produces an output that is almost guaranteed to have a certain property.
The especially limited memory is the part of the PRNG that remembers what the next seed should be. If there weren’t some kind of memory involved in the PRNG’s operation, the PRNG would keep using the same seed over and over again, producing the same “random” number again and again.
The algorithm is the algorithm that the PRNG uses to turn the first seed into a sequence of pseudo-random numbers.
The certain property of that sequence is the property of having two-thirds of its terms being less than 2⁄3.
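To make the three parts concrete, here is a minimal sketch using a linear congruential generator. The choice of generator is mine, not something specified in this thread; the constants are the well-known Numerical Recipes ones, and `TinyLCG` and `stateless_next_float` are hypothetical names. The single integer `state` plays the role of the especially limited memory, `next_float` is the algorithm, and the fraction of outputs below 2/3 is the property of interest.

```python
class TinyLCG:
    """Illustrative linear congruential generator (Numerical Recipes constants)."""

    MODULUS = 2**32
    MULTIPLIER = 1664525
    INCREMENT = 1013904223

    def __init__(self, seed):
        # The "especially limited memory": one integer, overwritten on every call.
        self.state = seed % self.MODULUS

    def next_float(self):
        """Advance the stored state and return it scaled into [0, 1)."""
        self.state = (self.MULTIPLIER * self.state + self.INCREMENT) % self.MODULUS
        return self.state / self.MODULUS


def stateless_next_float(seed):
    """Same arithmetic with nothing remembered between calls: always the same output."""
    return ((TinyLCG.MULTIPLIER * seed + TinyLCG.INCREMENT) % TinyLCG.MODULUS) / TinyLCG.MODULUS


gen = TinyLCG(seed=42)
numbers = [gen.next_float() for _ in range(30_000)]
fraction_below = sum(x < 2 / 3 for x in numbers) / len(numbers)

print(fraction_below)                                        # close to two-thirds
print(stateless_next_float(42) == stateless_next_float(42))  # no memory, so no variation
```

Without the stored `state`, every call starts from the same seed and returns the same number, which is the degenerate case described above.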
I maintain that the generator is implementing your memory in functionally the same sense.
That is fair enough, though the reason I find the scenario at all interesting is that it illustrates the utility of a random strategy under certain conditions.
For me, finding an equivalent nonrandom strategy helps to dispel confusion.
I like your characterization above that the PRNG is “memory with the drawback that you don’t understand how it works to the extent that you are uncertain how it correlates with what you wanted to remember.” Another way to say it is that the PRNG gives you exactly what you need with near-certainty, while normal memory gives you extra information that happens to be useless for this problem.
What is “random” about the PRNG (the exact sequence of numbers) is extra stuff that you happen not to need. What you need from the PRNG (N numbers, of which two-thirds are less than 2⁄3) is not random but a near-certainty. So, although you’re using a so-called pseudo-random number generator, you’re really using an aspect of it that’s not random in any significant sense. For this reason, I don’t think that the PRNG algorithm should be called “random”, any more than is the poker chip algorithm.
Very clever! It is indeed true that if you forget all previous marble paintings, the best way to ensure that 2⁄3 get painted one color is to paint it that color with p = 2⁄3.
And interestingly, I can think of several examples from my own life where I’ve been in that situation. For example, when I’m playing Alpha Centauri, I want to make sure I have a good mix of artillery, infantry, and speeders, but it’s tedious to keep track of how many I have of each, so I just pick in a roughly random way, but biased toward those that I want in higher proportions.
I’m going to see if I can map the urn/marble-painting problem back onto the absent-minded driver problem.
If you’re allowed to use external memory, why not just write down how many you painted of each color? Note that memory is different from a random number generator; for example, a random number generator can be used (imperfectly) to coordinate with a group of people with no communication, whereas memory would require communication but could give perfect results.
Have you read this comment of mine from another branch of this conversation?
OK, that’s clearer. And different from what I thought you were saying.
Have you read this comment of mine from another branch of this conversation?
Take 3 marbles out of the urn. Paint one of them blue, and then the other two red. Put all three in the second urn. Repeat.
Yeah, yeah—white marbles are half-medusas, so if you can see more than one you die, or something.
Taking more than one marble out of the urn at a time violates the task description. Don’t fight the hypothetical!
… That’s what the second paragraph is for...