Is it possible to have "too much history" for backtesting?

Posted By: dusktrader

Is it possible to have "too much history" for backtesting? - 09/11/13 13:02

More is not always better.

Is there a rule-of-thumb for how much history we should be backtesting on during different stages of strategy design? If there were, it seems like it should be tied to something dynamic, like the number of Training cycle bars (because "bars" would be relative to the BarPeriod used, right?)

One theory I'm exploring is that feeding too much history to Zorro in the early design stage might be bad, because it produces overly homogeneous logic. You can't please everyone, and the market is certainly known for its multiple personalities.

Therefore, perhaps a rule-of-thumb should limit initial logic testing to a recent past (a reasonable number of WFO cycles and trades-per-cycle, but not excessive). Then, once a trading edge could be identified and optimized... deeper history could be introduced to see how such a logic would perform under (potentially very) different market personalities.

It seems to me that a short BarPeriod strategy fully optimized to capture the profits over a 10 year period would be hindered by the fact that it has been designed for a one-size-fits-all approach. Instead, it seems likely that current trends and potential Black Swan events would be better captured by a bot that studies and optimizes only a recent history. That would be the market's current or recent personality. Perhaps personalities from years and years ago are largely irrelevant in the grand scheme.

I'm also studying Zorro's DataSlope feature, and wondering if this was designed for exactly the purpose I'm discussing here.

I'd love to hear your thoughts.

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/11/13 19:48

In line with this... can you please confirm that in the Performance Report, "bars" refers to a figure that is correlated with the strategy BarPeriod?

I think that it does, because if I adjust BarPeriod in my strategy, the corresponding change in the total # of bars between "WFO test cycles" and "Training cycles" seems to be about in line. (I say "about" because it is not a direct correlation, but that could simply be because you are building the requested BarPeriod from m1 data, where some bars could be incomplete, for example.)

Anyway, if we have a correlation for BarPeriod in the Performance Report, then it would be possible to make a ratio between this and the total Simulation time. That figure "might" be useful in comparing strategies (as discussed above) and possibly-more-importantly, for building strategies that are tested in a consistent way.

What I mean by this is:
If I have a BarPeriod 120 strategy tested across 3 years (156 weeks) of historical data --
and I want to compare it with a BarPeriod 240 strategy... then I may potentially want to test it across 6 years (312 weeks) of historical data for consistency. It would seem inconsistent to compare the BarPeriod 240 strategy against only 3 years of data.
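
A minimal sketch of the scaling described here, in plain C rather than Zorro script. The function name `equivalent_weeks` is hypothetical and not part of Zorro; it only assumes that the number of bars per week is inversely proportional to BarPeriod.

```c
/* Hypothetical helper: scale a reference test window so that a strategy
   with a different bar period sees roughly the same number of bars. */
double equivalent_weeks(double ref_weeks, int ref_bar_period, int bar_period)
{
    /* Bars per week shrink as the bar period grows, so the window has to
       grow (or shrink) in proportion to the bar period. */
    return ref_weeks * (double)bar_period / (double)ref_bar_period;
}
```

With the numbers from the example, `equivalent_weeks(156.0, 120, 240)` gives 312 weeks. Whether equalizing the bar count this way is the right basis for comparison is exactly the open question of this thread.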

Likewise, having this type of measure would also provide a way to control the effect I discussed above, where older historical data "could" be somewhat irrelevant in the design stage of a new bot.

So if it is true that "bars" does correlate with BarPeriod, then I would like to also request that you could include a count of the # of weeks in the Performance Report on the "Simulation Period" line... that would give a single time figure that could be used in such a ratio.

THANKS,
and if you have other thoughts on this I'd also love to hear them.

Posted By: Anonymous

Re: Is it possible to have "too much history" for backtesting? - 09/11/13 19:50

This is of course a very hard question without a simple answer. So we can just theorize.

What you call a one-size-fits-all approach, I would call a perfect strategy that stands the test of time. If only I had one that trades on a BarPeriod of less than 4h.

The other approach would be to optimize for the recent market condition, and then as soon as the strategy becomes unprofitable, drop it and replace it with another one. This looks so good on paper, until you realize you would need a proper and fast algorithm for that, and of course, that's a problem in itself. You would also need lots of predefined strategies. This approach gets very complex very fast, although I haven't completely lost faith in it.

Actually, I started some research with this second approach some weeks ago, but stumbled into a Zorro bug, so it'll have to wait until Zorro is fixed. That was also my pet project to see what equity curve trading is good for.

There was an article on the mechanicalforex site where Daniel explained why he thinks that you should always backtest on at least 10 years of data. But I can't find it anymore; I'm afraid you'll have to dig for it yourself. While we all have our own opinions, I tend to mostly agree that a strategy that is profitable in the longer term is better, because there's a greater chance that it will continue to be profitable in the future.

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/11/13 20:05

Originally Posted By: acidburn

There was an article on the mechanicalforex site where Daniel explained why he thinks that you should always backtest on at least 10 years of data. But I can't find it anymore; I'm afraid you'll have to dig for it yourself. While we all have our opinions, I tend to mostly agree that a strategy that is profitable in the longer term is better, because there's a greater chance that it will continue to be profitable in the future.

I'd love to read the article if you do find it.

But I would argue that one-size-does-not-fit-all, and that includes backtesting with "as much data as possible". My theory is that there is an "appropriate amount" of recent data to test with: too little is dangerous, and perhaps too much is as well.

Consider how the trader is planning to utilize the bot in the bigger picture. In my case, I am not necessarily trying to build the supreme-holy-grail bot that will lay golden eggs and make me rich enough to buy an island country (though I wouldn't complain if I stumbled upon such a gem).

Instead, I envision more of a dynamic/rolling portfolio of bots (ie, the legion) which actually grows and improves with... my brain. Yep, it would be myself that coaches the legion, and thus myself that would be contributing to its (hopefully ever increasing) success over time.

Therefore, I envision a different route. A route of safety and measured risk. But perhaps one with only "mediocre" bots on the playing field. That's ok, because each player in the legion will be judged, and if they don't make the grade ... they're soon replaced.

So in this scenario, I hope you can understand that consistency is critical... I think it is extremely important to have a consistent way to test and compare bots.

Posted By: Anonymous

Re: Is it possible to have "too much history" for backtesting? - 09/11/13 20:41

I'm absolutely following you. You are basically after the same idea as I am. Don't lose time chasing an elusive holy grail when you can have a dozen mediocre strategies, cleverly managed to obtain great consistency and thus profit, right?

But I also see where this might fail horribly, even though I have yet to spend some serious time playing with that approach. I'm already afraid that this might fall into the "great on paper, but in reality.." category. But we'll see... I certainly look forward to exchanging practical experience with you.

Here's one example of a terribly simple idea that guarantees great profits if implemented properly. We all know that trend following is profitable. We also know that by design you must return some of the profits when the market becomes choppy. So why not implement a filter that detects when the market is trending and only THEN run your TF strategy? Can you imagine anything simpler? Obviously, such a strategy would be a great winner and you could finally buy your island country. But only when you actually attempt to detect the market regime do you find out that it's not nearly that simple. By the time you detect that you're in a choppy market you have already given up some profits, and mysteriously the market starts trending just when you've decided to stop trading, so you miss some more profits.

Why this example? Because I believe the algorithm (the bot manager) that will decide which of your strategies to trade will have exactly the same problem. You might say, I'll just replace the strategy that is losing money, but how much will you let it lose before you decide to replace it? What if it starts losing right away, like the Z's are doing in my demo? Should I replace them already or let them lose even more money? Some tough questions, right?

To end on a positive note, I have yet to see whether what I wrote above holds true. At this point all of that is just speculation based on the experience gathered from my previous work. As I wrote elsewhere, I'm reasonably convinced that adding even a losing strategy to the mix can be positive for the overall result, so the idea of having the legion is basically sound. I'm just afraid that the devil is in the details, as always...

Posted By: swingtraderkk

Re: Is it possible to have "too much history" for backtesting? - 09/11/13 21:46

Some thoughts from a higher timeframe trader:

Do not confuse backtesting over long periods to establish positive expectancy with optimisation to squeeze out a little more profitability by keeping your parameters in tune with current conditions.

What do we mean when we say markets change over time? Markets trend, range or chop, so are we saying that the % of time they spend in these states changes? I think that is what we mean, so we want to know how well the strategy copes not only when the mix suits our approach but, more importantly, how badly it does when the market mix doesn't suit. I'm leaning towards more history.

A completely different problem we have is defining our market mix, what mix of trend, range and chop gave periods of high profitability or low profitability. If we define this, we can look into mechanisms to switch on or off the bot appropriately, be they filters or equity curve trading. Only then would I look into optimising parameters, and I would be wary of where the breaks in my WFO cycles fall. 2008 is the type of year where I would want the bot switched on or off not contributing sample data for optimising parameters in 2011 for example. I'm still for more history, but cautious about optimisation.

Finally, beware permanent changes over time when backtesting. A 0.1 change in the S&P in 2007 is a more significant percentage change than 0.1 in 2013, this can affect stops, targets, certain indicators, and equity. So I'm still with more history, but algos need to specifically allow for permanent changes.
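
One common way to allow for this kind of drift is to express stops and targets as a percentage of price instead of a fixed number of points. The sketch below is plain C with hypothetical function names; the price levels used in the note after it are purely illustrative, not actual index values.

```c
/* Size of a fixed point move, expressed as a percentage of the price level. */
double move_as_percent(double move, double price)
{
    return 100.0 * move / price;
}

/* Stop distance in price units for a stop defined as a percentage of price,
   so the stop scales with the price level over the years. */
double percent_stop_distance(double price, double stop_percent)
{
    return price * stop_percent / 100.0;
}
```

For instance, the same 0.1 move is a larger percentage at an illustrative level of 1000 than at 2000, while a 1% stop is 10 points wide at 1000 and 20 points wide at 2000.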

Edit: was composing this when acid posted, apologies for making similar points. Also I wouldn't expect my comments to be valid on shorter time frames, but I have little experience of this.

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/12/13 10:58

(I will comment more in just a bit on the above, have to shift gears at the moment...)

After much toying around, I think what I'm looking for might be as simple as this:

Code:

info("%d",(NumBars-LookBack)/BarPeriod);

(note: I would prefer it to print in the test log with quit(), but I can't figure out how to make quit() display a variable)

What I want is simply a relationship between the BarPeriod and # of bars used in the simulation. I "think" this gives that. (I still have some more tests to verify.) Basically I want an indicator, for comparison purposes, to know if the ratio of BarPeriod:NumBars is dramatically different between strategies.

Posted By: Anonymous

Re: Is it possible to have "too much history" for backtesting? - 09/12/13 11:04

Originally Posted By: dusktrader

What I mean by this is:
If I have a BarPeriod 120 strategy tested across 3 years (156 weeks) of historical data --
and I want to compare it with a BarPeriod 240 strategy... then I may potentially want to test it across 6 years (312 weeks) of historical data for consistency. It would seem inconsistent to compare the BarPeriod 240 strategy against only 3 years of data.

I forgot to add that I completely agree with the above reasoning. If we can agree that 10 years of data should be a minimum for, say, 4hr bars, then it should also be true that when testing, say, a 15min strategy, backtesting on about 7.5 months should be equivalent (240/15 = 16, and 120 months / 16 = 7.5 months).

Edit: and after a quick look, your formula makes sense, too

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/12/13 13:16

Originally Posted By: acidburn

Here's one example of a terribly simple idea that guarantees great profits if implemented properly. We all know that trend following is profitable. We also know that by design you must return some of the profits when the market becomes choppy. So why not implement a filter that detects when the market is trending and only THEN run your TF strategy? Can you imagine anything simpler? Obviously, such a strategy would be a great winner and you could finally buy your island country. But only when you actually attempt to detect the market regime do you find out that it's not nearly that simple. By the time you detect that you're in a choppy market you have already given up some profits, and mysteriously the market starts trending just when you've decided to stop trading, so you miss some more profits.

I wanted to comment on this and just say -- in my vision of the legion, none of the active bots would be turned on and off, aside from strategy logic already programmed into the bot. For example filters or phantom trading -- but they would already be statistically measured in the backtesting phase, before they got inducted to the legion.

So it is not my position as the human to tinker with bots once they are in the legion - only to measure their ongoing performance stats. As a sidenote, I'm hoping that one day my skills would evolve to a point that I could actually program the bots to measure themselves and deactivate on their own. For example, after the testing/approval/induction phase, I as the human would set acceptable thresholds for each bot, perhaps in a .ini file. Then after each trade, the bot would calculate metrics and determine whether or not they had become "out-of-spec" for those thresholds and then act accordingly (note: I have not yet defined what "out-of-spec" means to me).

Quote:

Why this example? Because I believe the algorithm (the bot manager) that will decide which of your strategies to trade will have exactly the same problem. You might say, I'll just replace the strategy that is losing money, but how much will you let it lose before you decide to replace it? What if it starts losing right away, like the Z's are doing in my demo? Should I replace them already or let them lose even more money? Some tough questions, right?

I think you just made that more complicated than it has to be. My plan was to simply design the bots and then establish acceptable performance boundaries. At some yet-to-be-defined interval, I would test each bot's performance in the field, and if it showed that it was performing out-of-spec (also yet-to-be-defined), then it would simply be removed and the next contender bot added to the legion. "Losing money", in and of itself, is not a reason to remove a bot. But if, for example, the bot loses more than the statistically measured historical Max Adverse Excursion, then that could be a trigger to say it has fallen out-of-spec.
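
As a rough illustration of such a boundary check, here is a plain C sketch. The `BotSpec` struct, its fields and the `tolerance` factor are all hypothetical; "out-of-spec" is still undefined in this thread, so this shows only one possible rule based on the excursion threshold mentioned above.

```c
/* Hypothetical per-bot limits, e.g. loaded from a settings file. */
typedef struct {
    double max_adverse_excursion; /* worst open loss measured in the backtest */
    double tolerance;             /* 1.0 = no slack, 1.2 = allow 20% beyond it */
} BotSpec;

/* Returns 1 if the live adverse excursion exceeds the backtested limit
   (scaled by the tolerance), otherwise 0. */
int is_out_of_spec(const BotSpec *spec, double live_adverse_excursion)
{
    return live_adverse_excursion > spec->max_adverse_excursion * spec->tolerance;
}
```

A bot whose backtest showed a worst excursion of 100 with a tolerance of 1.2 would stay in the legion at a live excursion of 110 and be flagged at 130. Note that a plain loss does not trigger the check, only a loss beyond what the backtest statistics allow.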

I believe the bots need some sort of infrastructure to be part of the legion; let's call it their credentials. Zorro already brings much of this infrastructure to the table (proper trading interface to the broker, error-free order execution, performance measures, etc.), but there will always be other credentials that the trader may individually want in every bot. For example, I may want bots to always close trades before the weekend, or to restrict the number of trades that could be open at once, etc. Steve Hopwood used a similar philosophy that enabled him to churn out new bots at lightning speed, because the only thing he needed to do was add the actual logic programming and then flip a few switches on/off in the infrastructure. So I'm borrowing from his design model in that.

Originally Posted By: swingtraderkk

I agree with you. My only bone of contention with that line of thinking is that "more history" is too vague. And it doesn't provide an apples-to-apples comparison when you change something like the BarPeriod of the logic. So while you may be right in the thinking that max history is best, it could still be misleading if you aren't aware of that relationship between BarPeriod and NumBars, for example. One example I can think of offhand is in the Z strategies... I believe it is the EURCHF one... it doesn't have nearly as much history on the backtest as the others. That fact should be taken into account when comparing those strategies, for example.

Quote:

A completely different problem we have is defining our market mix, what mix of trend, range and chop gave periods of high profitability or low profitability. If we define this, we can look into mechanisms to switch on or off the bot appropriately, be they filters or equity curve trading. Only then would I look into optimising parameters, and I would be wary of where the breaks in my WFO cycles fall. 2008 is the type of year where I would want the bot switched on or off not contributing sample data for optimising parameters in 2011 for example. I'm still for more history, but cautious about optimisation.

I'm still trying to understand the best way to utilize optimization myself. One area I've struggled with recently is being able to identify which parameters are even appropriate to be optimized. I think perhaps only core-logic params should be... for example, only those parameters that would produce the "broad hill" the manual talks about in the optimization profile.

Quote:

Finally, beware permanent changes over time when backtesting. A 0.1 change in the S&P in 2007 is a more significant percentage change than 0.1 in 2013, this can affect stops, targets, certain indicators, and equity. So I'm still with more history, but algos need to specifically allow for permanent changes.

Edit: was composing this when acid posted, apologies for making similar points. Also I wouldn't expect my comments to be valid on shorter time frames, but I have little experience of this.

Agreed, the market changes in permanent ways over time. This is partly why I'm exploring the idea of "less history" rather than "max history". If, over the course of several years, the market personality changes in a significant way (take the evolution to decimalization as an example in equity markets), then perhaps it doesn't make sense to fit the bot logic to those periods. You could force-fit the bot, but then you have just watered down its potential in current markets. That's my theory anyway. I'd rather have a more dynamic way to deal with the permanent market changes.

Btw I'm also a longer-term trader. I've tried the complete range, from 1440 down to 1 minute over the years. I always come back to 240, so that is my preferred (discretionary) timeframe. However, I don't believe it's wise to set limits on automated trading. I intend to test a wide gamut of BarPeriods and see what I can learn from them. It may be that in my legion I trade with both short-term and long-term bots; that seems like a good idea even for diversification purposes.

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/13/13 13:08

Here's some friendlier code that prints in the message window:

Code:

if(is(EXITRUN)) printf("\nHistory ratio: %d",(NumBars-LookBack)/BarPeriod);

I ran some quick tests on a strategy with LookBack 500 from 2003-2012 (a full 10 year period on m1 data):
BarPeriod 15 --> 16091
BarPeriod 60 --> 1000
BarPeriod 240 --> 60

I am working now on defining what I personally think is reasonable. If I said that 10 years was a reasonable period to test a 240 strategy, then I would simply use that figure of 60 to align other strategies to that figure as well, for comparison purposes:
if # is too small, add more history / if too big, reduce history
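
A back-of-the-envelope version of this figure in plain C, for readers who want to see where the numbers come from. The bar-count estimate assumes a 24h x 5-day market and a 52-week year, which is only an approximation of how Zorro actually builds bars; both function names are hypothetical.

```c
/* Assumed market calendar: 5 trading days of 24 hours, 52 weeks per year. */
#define MINUTES_PER_WEEK (5L * 24L * 60L)
#define WEEKS_PER_YEAR   52L

/* Rough number of bars in the given number of years (not Zorro's NumBars). */
long estimate_num_bars(long years, long bar_period)
{
    return years * WEEKS_PER_YEAR * MINUTES_PER_WEEK / bar_period;
}

/* Same expression as the script line above, using integer division. */
long history_ratio(long num_bars, long lookback, long bar_period)
{
    return (num_bars - lookback) / bar_period;
}
```

Under these assumptions, 10 years with LookBack 500 gives 16606, 1031 and 62 for BarPeriod 15, 60 and 240, which is in the same ballpark as the reported figures. Because the bar count itself already shrinks with BarPeriod, the figure falls roughly 16-fold for every 4-fold increase in BarPeriod.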

Posted By: swingtraderkk

Re: Is it possible to have "too much history" for backtesting? - 09/13/13 16:58

dusktrader,

I'm still confused as to what the objective of the "reasonable period" is?

A reasonable period to backtest for positive expectancy, or a reasonable period for optimisation purposes?

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/13/13 18:05

Well I suppose "reasonable period" is subjective, unfortunately. I hate that myself, because I don't like introducing subjectivity. However, my theory is that for COMPARISON purposes, you must ensure consistency.

For example,
I could look at a Performance Report and see that a bot is "good" or maybe "excellent". Those would be subjective evaluations. However, my theory is that I could not look at two bots and know -- in an objective way -- if one was better than the other... at least without having knowledge that they were developed with the same amount of history. One way I think you could answer that question would be knowing the relationship of the BarPeriod with the total amount of data.

Not knowing that relationship would also, I believe, affect the statistical accuracy of the logic (note: I confess I am NOT a statistics or math expert!). Consider this example:
240minute barperiod strategy across 10 years data
is not comparable to 15minute barperiod strategy across 10 years

There would be 16x more opportunities for trades in the 15minute strategy across that dataset. Therefore, it is not a fair comparison with the 240minute strategy across the SAME dataset.

Now don't get me wrong... I'm not necessarily saying that testing on the max history available is bad or should be avoided. I'm only suggesting that perhaps it is not best for COMPARISON purposes.

If you believe that max history is the best philosophy, then one way you could accomplish that is like this:
1) determine the History ratio of your longest BarPeriod strategy across the max history available
2) Adjust testing history on all other BarPeriods to match
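
The two steps can be sketched in plain C by inverting the History ratio expression. Both helpers are hypothetical, and the conversion to weeks assumes 7200 trading minutes per week (24h x 5 days), which is an approximation.

```c
/* Bars needed so that (NumBars - LookBack) / BarPeriod reaches the target. */
long bars_for_ratio(long target_ratio, long lookback, long bar_period)
{
    return target_ratio * bar_period + lookback;
}

/* Approximate calendar length of those bars, assuming 7200 minutes per week. */
double bars_to_weeks(long num_bars, long bar_period)
{
    return (double)(num_bars * bar_period) / 7200.0;
}
```

For example, matching a target figure of 60 with LookBack 500 needs 14900 bars at BarPeriod 240 but only 1400 bars at BarPeriod 15, so it is worth checking that the shorter window still contains enough trades.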

This provides a more similar way to COMPARE different bots without introducing the unfair bias of additional trade opportunity history. You could also "maybe" argue that more history is less relevant and therefore potentially "bad".

Separately, in my design process, I'm comparing first with the limited/equalized dataset but also testing with max history. The max history test is just for my information, but may not be appropriate for comparison purposes.

----
Btw, to answer your question -- I think I mean "reasonable period for optimization purposes" ... because in the stage where I would be limiting history, it is mostly for the purpose of finding the best trading logic.

Posted By: Anonymous

Re: Is it possible to have "too much history" for backtesting? - 09/14/13 11:16

Seems that Daniel decided to contribute to this discussion with the latest post on his blog

http://mechanicalforex.com/2013/09/long-...fitability.html

Edit:

And here's one of the older articles that dealt with the topic:

http://mechanicalforex.com/2010/09/why-are-five-years-statistically.html

Unfortunately, it's missing important graphs.

Posted By: swingtraderkk

Re: Is it possible to have "too much history" for backtesting? - 09/16/13 10:16

dusktrader,

I can agree with almost everything you are saying - and I have my own issues re ranking bots. Where you lose me is when you abstract the bars from the actual time period you are looking at.

1000 240 min bars = 16000 15 min bars = the exact same history = the exact same time period you wish to have a return in so you can live = the exact same time period that the market changes in.

The types of market changes that make strategies/algos less profitable occur over the exact same periods of years whether you measure them in 240 min bars or 15 min bars.

Implicit in your logic is an assumption that the permanent and cyclical changes you wish to include (or avoid) in your backtesting happen 16 times faster on a 15 min chart compared to a 240 min chart. This does not make sense to me.

I can understand your arguments for less history in general, (I'm not convinced though ;-)), but if you decide that x years is the period you will test for your 15 mins algo, then I think the only valid comparison for the 240 min algo is the same x years.

Posted By: dusktrader

Re: Is it possible to have "too much history" for backtesting? - 09/17/13 19:54

First, I should preface that "I don't know" if this is a good idea. I'm just throwing it out there, and testing it myself. It might be a terrible idea, or maybe it has some value? I'm not sure yet.

You make a good argument about the overall market's (permanent) changes. I agree that bots should be robust enough to hopefully go with that flow.

My intent was to find some sort of common ground to compare strategies of different barperiods. On this level, I'm not looking at them from a timeframe perspective as much as just barperiods.

It seems like a fractal problem. Each new bar presented is another opportunity to test the trading logic. On some level, it doesn't matter how many ticks it took to build that bar.

I agree that judging a strategy's worth should not be done necessarily by restricting possible testing history. But for comparing strategy A with strategy B, of different barperiod logics, perhaps there is some value to equalizing them on some level. But it seems like you'd have to keep it within bounds... for example, reduce data but ensure enough trades in both strategies.

Again, I'm not sure if it's a good idea. For example, I just jumped from testing a 287-barperiod strategy to now a 15-barperiod strategy. It seemed pretty much useless to equalize the history ratio in that example.