Yes, avoiding overfitting is the key problem, and you should expect almost anything to be overfit by default. We spend a lot of time on this (I work w/Alexei). I’m thinking of writing a longer post on preventing overfitting, but these are some key parts:
Theory. Something that makes economic sense, or has worked in other markets, is more likely to work here.
Components. A strategy made of 4 components, each of which can be independently validated, is a lot more likely to keep working than one black box.
Measuring strategy complexity. If you explore 1,000 possible parameter combinations, that’s less likely to work than if you explore 10 (see the sketch after this list).
Algorithmic decision making. Any manual part of the process introduces a lot of opportunities for overfitting.
Abstraction & reuse. The more you reuse things, the fewer degrees of freedom you have with each idea, and therefore the lower your chance of overfitting.
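To make the complexity point concrete, here’s a minimal sketch in Python. It’s purely illustrative, with made-up numbers rather than anything from our actual process: it simulates strategies with zero edge and shows how good the best one looks in-sample purely by chance.

```python
# Sketch: how good does the *best* of N zero-edge strategies look in a backtest?
# All numbers here are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
N_DAYS = 252  # one year of daily returns

def best_sharpe_by_luck(n_combos: int, n_sims: int = 500) -> float:
    """Median (across simulations) of the best annualized Sharpe ratio
    among n_combos 'parameter combinations' whose returns are pure noise."""
    best = []
    for _ in range(n_sims):
        rets = rng.standard_normal((n_combos, N_DAYS)) * 0.01  # zero-mean noise
        sharpes = rets.mean(axis=1) / rets.std(axis=1) * np.sqrt(252)
        best.append(sharpes.max())
    return float(np.median(best))

print(best_sharpe_by_luck(10))    # ~1.5: already a "good" backtest, from luck alone
print(best_sharpe_by_luck(1000))  # ~3.2: spectacular-looking, and meaningless
```

Whatever a real candidate shows in-sample has to clear that luck-only bar for the number of combinations actually explored.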
I’d be interested to learn more about the “components” part.
As an example, consider a strategy like “on Wednesdays, the market is more likely to have a large move, and signal XYZ predicts big moves accurately.” You can encode that as an algorithm: trade signal XYZ on Wednesdays. But the algorithm might make money on backtests even if the assumptions are wrong! By examining the individual components rather than just whether the algorithm made money, we get a better idea of whether the strategy works.
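Here’s a minimal sketch of what checking those two components separately could look like. Everything in it is hypothetical: the data is synthetic, and the column names (abs_move, xyz_signal, pnl) are invented for illustration.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)
dates = pd.bdate_range("2020-01-01", periods=1000)
df = pd.DataFrame({
    "date": dates,
    "abs_move": rng.exponential(1.0, len(dates)),  # size of each day's move
    "xyz_signal": rng.random(len(dates)) < 0.2,    # did signal XYZ fire?
    "pnl": rng.normal(0.0, 1.0, len(dates)),       # per-day P&L if traded
})

wed = df["date"].dt.dayofweek == 2  # Monday=0, so Wednesday=2

# Component 1: are moves actually larger on Wednesdays?
p_wed = stats.mannwhitneyu(df.loc[wed, "abs_move"], df.loc[~wed, "abs_move"],
                           alternative="greater").pvalue

# Component 2: does signal XYZ predict big moves, measured on *all* days?
p_xyz = stats.mannwhitneyu(df.loc[df["xyz_signal"], "abs_move"],
                           df.loc[~df["xyz_signal"], "abs_move"],
                           alternative="greater").pvalue

# Only trust the combined strategy if both assumptions hold up on their own.
print(f"Wednesday moves larger: p={p_wed:.3f}")
print(f"XYZ predicts big moves: p={p_xyz:.3f}")
```

The combined backtest can look fine while either check fails; when that happens, the money is probably coming from somewhere other than the stated mechanism.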
Is this an instance of the “theory” bullet point then? Because the probability of the statement “trading signal XYZ works on Wednesdays, because [specific reason]” cannot be higher than the probability of the statement “trading signal XYZ works” (the first statement involves a conjunction).
It’s a combination. The point is to throw out algorithms/parameters that do well on backtests even when their underlying assumptions are violated, because those are much more likely to be overfit.
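Continuing the hypothetical sketch above (same synthetic df), one such check is to run the identical rule on days where the stated assumption is deliberately violated:

```python
def xyz_pnl(df: pd.DataFrame, day_mask: pd.Series) -> float:
    """Total P&L from trading signal XYZ only on the selected days (hypothetical)."""
    return df.loc[day_mask & df["xyz_signal"], "pnl"].sum()

wed_pnl = xyz_pnl(df, df["date"].dt.dayofweek == 2)
thu_pnl = xyz_pnl(df, df["date"].dt.dayofweek == 3)  # assumption deliberately violated

# If the rule makes about as much money on Thursdays, the Wednesday story
# isn't doing any work, and the backtest result is probably an artifact.
print(f"Wed: {wed_pnl:.2f}  Thu: {thu_pnl:.2f}")
```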