I disagree. I do, of course, run a backtest before trading, and ultimately with simulated fees and spread, yes. But when looking for new alphas, I don’t do that first.
I don’t just eyeball the histograms. I compute summary statistics. I also plot two at once to see where they differ, and sometimes overlay a fitted bell curve on the histogram to see where the data departs from it.
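To make that concrete, here’s a minimal sketch of the kind of check I mean. The two return series are synthetic placeholders (one fat-tailed, one Gaussian); substitute whatever you’re actually comparing.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Two synthetic return series as stand-ins for whatever you're comparing:
# one fat-tailed (Student's t), one Gaussian, at similar scale.
rng = np.random.default_rng(0)
returns_a = rng.standard_t(df=4, size=2000) * 0.01
returns_b = rng.normal(0.0, 0.012, size=2000)

# Summary statistics, not just eyeballing.
for name, r in [("A", returns_a), ("B", returns_b)]:
    print(f"{name}: mean={r.mean():+.5f}  std={r.std():.5f}  "
          f"skew={stats.skew(r):+.3f}  excess kurt={stats.kurtosis(r):+.3f}")

# Overlay both histograms, plus a normal fit to series A, to see
# where the empirical distribution departs from the bell curve.
plt.hist(returns_a, bins=80, density=True, alpha=0.5, label="A")
plt.hist(returns_b, bins=80, density=True, alpha=0.5, label="B")
x = np.linspace(returns_a.min(), returns_a.max(), 400)
plt.plot(x, stats.norm.pdf(x, returns_a.mean(), returns_a.std()),
         label="normal fit to A")
plt.legend()
plt.show()
```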
It’s so easy to get into a cycle of optimizing backtests, and that almost always overfits. I want to be sure the effect exists first, in isolation, before I try to trade it.
We’re trying to reverse-engineer the behavior of a “random variable” (hopefully plus some exploitable non-randomness) from its outputs. Maybe this is easier with an immature, less-efficient market like crypto (in which case, I want to figure out how to trade it too), but there is so much noise in what I’m trading that the exploitable effects are very hard to see in a backtest. The signal-to-noise ratio is too low.
I think the best way to illustrate this would be with a simulation. Model the price as a random walk with an appropriate distribution (normal-ish, depending on how realistic you want to be), plus some signal (in the simple case of collecting the risk premium, just a constant drift), and backtest that. You’ll find that with realistic parameters, the cumulative returns from a backtest are highly variable, depending on your random seed. I just can’t trust it.
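Here’s a sketch of that simulation. The drift and vol numbers are made up, but they’re in a realistic ballpark for a daily equity-like series, and the only “signal” is the constant drift:

```python
import numpy as np

# Simulate a few years of daily log returns: pure noise plus a small
# constant drift (the "signal"), then look at the cumulative return a
# buy-and-hold backtest would report under different random seeds.
n_days = 252 * 3          # three years of daily bars
drift = 0.0003            # ~7.5% annualized drift -- the real edge
vol = 0.01                # ~16% annualized volatility -- the noise

for seed in range(5):
    rng = np.random.default_rng(seed)
    returns = drift + vol * rng.standard_normal(n_days)
    cum_return = np.exp(returns.sum()) - 1
    print(f"seed {seed}: cumulative return {cum_return:+.1%}")
```

Every path here has the same positive edge by construction, yet over three years some seeds come out flat or negative. That’s the seed-dependence I mean.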
I don’t know if you’ve read HPMOR yet, but remember the scene with the 2-4-6 test? Read it if you haven’t; I don’t want to spoil it for anyone.
It’s a good illustration of what’s required of scientific thinking. How do you go about building an accurate model of something? What cognitive biases get in the way?
Suppose I ran one of those simulations, and gave you the output as price data. Remember, this isn’t a stock, and I know exactly what the underlying distribution is, because I made it on a computer. Maybe I added some exploitable non-randomness. How confident would you be in knowing that distribution from running a backtest? How would you find the exploit? How confident could you be in your bet size? If I then took your strategy and ran it using fresh data from the simulator (with the same distribution but a new random seed), how confident could you be that your strategy would perform well?
You don’t have to be 100% accurate to make money, but you have to be accurate enough, and more accurate is better. When the noise is very loud, it takes a lot of data to infer a distribution like that with much accuracy. Often more than is available from a single instrument.
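As a back-of-the-envelope, using the same made-up parameters as the simulation above: the standard error of the sample mean shrinks like vol / sqrt(n), so even deciding whether the drift is positive, never mind pinning down the distribution’s shape, takes a surprising amount of data.

```python
# 2-sigma detection of the drift needs the standard error of the mean,
# vol / sqrt(n), to be at most half the drift: n >= (2 * vol / drift) ** 2.
drift = 0.0003   # true daily edge (same made-up numbers as above)
vol = 0.01       # daily volatility
n_days = (2 * vol / drift) ** 2
print(f"~{n_days:,.0f} days, i.e. ~{n_days / 252:.0f} years of daily data")
# -> roughly 4,444 days, about 18 years, just to be ~95% confident the
#    drift is positive at all.
```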