There are many problems for which Monte Carlo integration is the most efficient method available.
(Emphasis mine, on “available”.) I know I’m late to the party, but I just noticed this. While what you say is true, “available” in this case means “that we know of”, not “that is possible”. I’m not an expert, but IIRC the point of MCI (Monte Carlo integration) is that the functions are hard to analyze. You could integrate analogously without randomization (e.g., sample the integration domain on a regular grid), and it very well might work. The problem is that the function you integrate may happen to behave atypically on the regular grid with respect to its global behavior: e.g., if the function has some regular features at a frequency that accidentally matches that of your grid, you might sample only local maxima and severely overestimate the integral.
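A toy sketch of that failure mode (the integrand and grid size are illustrative choices of mine, not anything from the quoted discussion):

```python
import math

# Toy integrand: f(x) = 1 + cos(2*pi*100*x) on [0, 1].
# The cosine term averages out, so the true integral is exactly 1.
def f(x):
    return 1.0 + math.cos(2 * math.pi * 100 * x)

# A regular grid whose spacing (1/100) matches the cosine's period
# lands every sample on a crest: cos(2*pi*k) = 1 at each grid point.
n = 100
grid_estimate = sum(f(k / n) for k in range(n)) / n
print(grid_estimate)  # ~2.0 -- double the true value of 1
```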
But for an individual function there certainly exists (for some values of “certainly”) an ideal sampling that yields an excellent estimate of the integral, or some even better method that doesn’t need sampling at all. To find it, though, you need to understand the function very well, search for that perfect evaluation strategy, and do this anew for every individual function you want to integrate. (Not a math expert here, but I suspect that in the general case such a grid would be impossible to find, even if it exists, and it’s probably very hard even in the cases where a solution can be found.)
So what MCI does is exactly what Eliezer mentions above: you randomize the sampling as well as you can, to avoid as much as possible tripping an unexpected “resonance” between the grid and the function’s behavior. In effect, you’re working against an environment you can’t overwhelm by being smart, so you brute-force it and do your best to reduce the ways the environment can overwhelm you, i.e., you try to minimize the worst-case consequences.
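Continuing the toy sketch above (the seed and sample count are arbitrary choices of mine): uniform random samples have no fixed spacing for the integrand to “resonate” with, so the same function that fooled the grid is estimated correctly:

```python
import math
import random

def f(x):
    return 1.0 + math.cos(2 * math.pi * 100 * x)

# Random sample locations have no fixed spacing, hence no grid frequency
# for the integrand to "resonate" with; the estimate converges to the
# true value 1 at the usual O(1/sqrt(n)) Monte Carlo rate.
random.seed(0)
n = 100_000
mc_estimate = sum(f(random.random()) for _ in range(n)) / n
print(mc_estimate)  # ~1.0, up to sampling noise of order 0.002
```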
Note that for functions we really understand, MCI is far from the best method. If you want to integrate a polynomial, you just compute the antiderivative, evaluate it at the two endpoints, subtract, and you’re done. MCI is nice because in general we don’t know how to find the antiderivative. In analogy with tic-tac-toe, integration would be the game, and each function is a different opponent; a polynomial is an adversary we can anticipate as well as Omega can model a human (in fact, we don’t even need to evaluate the function; we can deduce directly what the result of “its actions” will be); but most functions (probably almost all) are such difficult adversaries that we can’t do better than try to limit how badly they can screw us.
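For instance (a minimal sketch; the particular polynomial is just an example of mine):

```python
import random

def P(x):
    # Antiderivative of p(x) = 3x^2 + 2x.
    return x**3 + x**2

# Analytically: the integral over [0, 1] is P(1) - P(0) = 2. No sampling.
exact = P(1.0) - P(0.0)

# Monte Carlo reaches the same answer, just far less efficiently.
random.seed(1)
n = 100_000
mc = sum(3 * x**2 + 2 * x for x in (random.random() for _ in range(n))) / n
print(exact, mc)  # 2.0 vs. roughly 2.0 +/- 0.01
```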
(I have a vague recollection of an anecdote about MCI: someone somewhere was using it to do a complicated multiple integral, and got very silly results. Turns out, while the pseudo-random generator looked OK by itself, when its output was distributed throughout the complicated multi-dimensional space being sampled, a non-obvious regularity in the generated numbers happened to match something the function did. So, in effect, the function beat them. In other words (ignoring that this may be apocryphal), you don’t even need a very smart opponent to be screwed; a hard function is enough… That quantum noise generator can be very handy.)
Modern PRNGs tend to be less bad than that.
That’s not at all my domain of expertise, so I’ll take your word for it (though I guessed as much).
That said, I meant that as half a funny anecdote, half “it’s not just theory, it’s real life, and you can get screwed hard by something that guesses your decision algorithm, even if it’s just a non-sentient function”, not as an argument for or against either MC or quantum noise. (Though, now that I think of it, these days there probably are quantum noise generators available somewhere.)
AFAIK, there are, but they’re way too slow for use in MC, so decent-but-fast PRNGs are still preferred.
Also, sometimes you need to run several tests with the same random sequence for debugging purposes, which with true random numbers would require you to store all of them, whereas with pseudo-random numbers you just need to use the same seed.
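E.g., a minimal sketch in Python (the function and seed are arbitrary):

```python
import random

def simulate(seed):
    # A private generator with a fixed seed reproduces the exact same
    # "random" sequence on every run -- no need to store the numbers.
    rng = random.Random(seed)
    return [rng.random() for _ in range(5)]

# Replaying a failing test is just a matter of reusing its seed.
assert simulate(42) == simulate(42)
```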
Too bad that the default PRNG in many systems is still an ancient one.
Yes.
However, programming environments oriented towards numerical computing tend to use the Mersenne Twister as their default PRNG.
Mersenne Twister is the most commonly used algorithm. It is fast and generates good results. It is not adversarially secure, but in that situation you should be using a cryptographically secure RNG anyway.
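For example, Python’s standard random module is a Mersenne Twister, while its secrets module draws from the OS’s cryptographically secure source; a minimal contrast:

```python
import random   # Mersenne Twister: fast, statistically good, predictable
import secrets  # OS-backed CSPRNG: unpredictable, but slower

# Fine for simulation: an observer of enough consecutive outputs could
# reconstruct the internal state and predict everything that follows.
sample = random.random()

# For anything adversarial (tokens, keys, nonces), use the secure source.
token = secrets.token_hex(16)
print(sample, token)
```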
The age of the PRNG algorithm has no bearing on this discussion. (If rand() uses something else, it’s for standards-compatibility reasons; nobody writing Monte Carlo algorithms would be using such PRNGs.)
Actually, the C and POSIX standards impose very few requirements on rand(); it would be entirely possible in principle for it to be based on the Mersenne Twister while still complying with them.