How is the ass to determine, in finite time, that they’re equal, as opposed to one being ε larger than the other?
Again, this is the reverse of how you should go about things. The amount of time to spend making a decision is proportional to the difference between the values of the actions: once you’re sure the difference is smaller than ε, you stop caring. You might as well flip a coin.
The amount of time to spend making a decision is proportional to the difference between the values of the actions: once you’re sure the difference is smaller than ε, you stop caring.
Ah, but how does one determine that the difference is smaller than ε? What if the difference is arbitrarily close to ε?
You assign a disutility to spending a further time t evaluating. When that disutility equals or exceeds the expected increase in utility of the selected bale (because one of them is, or may be, ε larger than the other), you stop and select randomly between them.
If you make the disutility of each marginal interval t increase with the total time already spent deciding, you also break the deadlock that would otherwise arise when you are uncertain whether the disutility of further evaluation is greater or less than its expected utility gain. If that return on investment is ever within ε of zero, then some time later it must be more than ε below zero, because the disutility of the time t will have increased by more than 2ε.
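A rough sketch of that stopping rule, in case it helps to see it run. Everything named here is my own illustration rather than anything from the thread: `choose_bale`, the `estimate_gain` callback, and the linear cost schedule (`base_cost + cost_growth * step`) are all assumptions. The only point is that a marginal cost which keeps rising must eventually overtake any bounded expected gain, so the loop cannot stall.

```python
import random


def choose_bale(estimate_gain, base_cost=0.01, cost_growth=0.005, rng=random):
    """Deliberate until one more step costs at least as much as it is
    expected to gain, then pick a bale at random.

    estimate_gain(step) is a hypothetical callback returning the expected
    utility increase from one more step of evaluation.
    Returns (chosen_bale, steps_spent_deliberating).
    """
    step = 0
    while True:
        # Assumed schedule: the disutility of the next step grows linearly
        # with the total time already spent deciding.
        marginal_cost = base_cost + cost_growth * step
        if marginal_cost >= estimate_gain(step):
            # Further evaluation no longer pays for itself: flip a coin.
            return rng.choice(["left", "right"]), step
        step += 1


if __name__ == "__main__":
    # Worst case for a flat cost: the expected gain never shrinks.
    # The rising cost still forces a decision.
    bale, steps = choose_bale(lambda step: 1.0, base_cost=0.25, cost_growth=0.125)
    print(bale, steps)
```

With a constant expected gain of 1.0 and those parameters, the cost catches up at step 6, so the ass eats after six rounds of deliberation instead of starving. With `cost_growth=0` and the same gain, the loop would never exit, which is the deadlock described above.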