Rhetorical solution: Multi-armed bandit problem
Disclaimer: I’m not a computer scientist. I read up on the problem to see what the takeaways might be for decision theory. Since I’m not trained in any formal logic, I don’t know how to represent this solution in symbols. I think of the problem in terms of questions like: am I spending too much time becoming smarter, rather than doing things that are smart?
Exploitation dominates exploration because, unless exploration is a subset of exploitation by definition, exploring would not be optimising expected utility for the given optimisation problem.
If exploration is a subset of exploitation, then exploitation will have a higher expected utility than exploration, unless some components of exploitation have negative utility, in which case they wouldn’t be included in exploitation anyway.
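For anyone who wants to poke at this argument numerically rather than in symbols, here is a minimal sketch. It assumes Bernoulli arms and an epsilon-greedy agent, neither of which comes from the argument above; the function name `run_bandit`, the arm probabilities, and the step counts are all made up for illustration.

```python
import random

def run_bandit(arm_probs, epsilon, steps, seed=0):
    """Play a Bernoulli bandit with an epsilon-greedy agent; return total reward.

    Illustrative assumption: arm i pays 1 with probability arm_probs[i], else 0.
    """
    rng = random.Random(seed)
    counts = [0] * len(arm_probs)      # number of pulls per arm
    values = [0.0] * len(arm_probs)    # running mean reward per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(len(arm_probs))
        else:
            # Exploit: pick the arm with the best current estimate
            # (ties go to the lowest index).
            arm = max(range(len(arm_probs)), key=lambda a: values[a])
        reward = 1 if rng.random() < arm_probs[arm] else 0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return total

if __name__ == "__main__":
    arms = [0.2, 0.5, 0.8]  # hypothetical payout probabilities
    for eps in (0.0, 0.1, 1.0):
        avg = sum(run_bandit(arms, eps, 1000, seed=s) for s in range(200)) / 200
        print(f"epsilon={eps}: average total reward {avg:.1f}")
```

Setting `epsilon` to 0 gives pure exploitation of the current estimates, and 1 gives pure exploration. Comparing the averages for different arm probabilities is one way to check how the dominance claim fares when the agent's estimates start out uninformed.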
Thoughts?