Slow backtest performance using R

Posted By: chrisr

Slow backtest performance using R - 07/22/15 14:02

Hi, first of all, thanks for the R integration. It works perfectly!

Is there a way to improve the backtesting performance using "Test"/"Train" modes?

Cheers
C.
Posted By: GPEngine

Re: Slow backtest performance using R - 07/22/15 14:49

- Make as few calls from Zorro to R as possible. Consolidate fixed sequences of commands into functions written on the R side.
- For training, library "parallel" offers some parallelization and is supported by some modeling types.
- If desperate and/or ambitious, use Zorro Train mode only to produce the csv files, then produce Rdata for each cycle completely outside of Zorro. Only Test mode uses RBridge and simply expects Rdata files to already exist.
- In that case, you can additionally train on Linux and take advantage of library "multicore" and "doMC", which library "caret" loves.
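
A minimal sketch of the first and third points, in R. Everything named here is an assumption for illustration: the helper names `train_cycle` and `train_all`, a CSV layout with a `target` column plus feature columns, and the `.Rdata` file naming. `glm` only stands in for whatever model you actually train, and nothing here is prescribed by Zorro.

```r
library(parallel)

# Hypothetical helper: one R-side function does the whole per-cycle job,
# so the caller issues a single call instead of a sequence of commands.
train_cycle <- function(csv_file, model_file) {
  data  <- read.csv(csv_file)          # assumed layout: "target" + feature columns
  model <- glm(target ~ ., data = data, family = binomial())
  save(model, file = model_file)       # Rdata file to be loaded later in Test mode
  invisible(model_file)
}

# Offline training outside of Zorro: spread the cycle files over worker processes.
train_all <- function(csv_files, cores = 2) {
  model_files <- sub("\\.csv$", ".Rdata", csv_files)
  cl <- makeCluster(cores)
  on.exit(stopCluster(cl))
  clusterMap(cl, train_cycle, csv_files, model_files)
  model_files
}
```

Called as `train_all(c("cycle1.csv", "cycle2.csv"))` (hypothetical file names), this writes one Rdata file per cycle next to each CSV.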
Posted By: GPEngine

Re: Slow backtest performance using R - 07/22/15 15:12

If it's the actual time to build the models within R that bothers you, what can I say. Machine learning is not a simple calculation and often involves executing dead and inferior branches that are not part of the eventual solution.

Check your data dimensions.
- Typically, modeling time depends on the number of training examples. Use R's "sample" function as a sledgehammer for reducing it.
- Typically, modeling time also depends on the number of features. Detect near-zero-variance features, use the covariance matrix to detect identical features, or use advanced feature selection such as recursive feature elimination, though that has its own cost.
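
Both reductions can be sketched in base R as follows. The toy data, the 20% sample size, and the two thresholds are assumptions chosen for illustration, and correlation (covariance scaled to [-1, 1]) is used in place of raw covariance so that one threshold works for all features. The "caret" package offers `nearZeroVar` and `findCorrelation` for the same purpose.

```r
set.seed(42)  # reproducible subsample

# Toy data standing in for a training set (all names are illustrative).
n <- 1000
train <- data.frame(a = rnorm(n), b = rnorm(n))
train$a_copy   <- train$a              # exact duplicate of "a"
train$constant <- 1                    # zero-variance column
train$target   <- rbinom(n, 1, 0.5)

# 1. Fewer training examples: keep a random 20% of the rows.
rows  <- sample(nrow(train), size = 0.2 * nrow(train))
small <- train[rows, ]

# 2. Fewer features: drop (near-)constant columns ...
features <- setdiff(names(small), "target")
vars     <- sapply(small[features], var)
features <- features[vars > 1e-8]

# ... then drop the later feature of each pair with |correlation| above 0.99.
cm <- abs(cor(small[features]))
cm[upper.tri(cm, diag = TRUE)] <- 0    # compare each feature only with earlier ones
features <- features[apply(cm, 1, max) <= 0.99]

reduced <- small[c(features, "target")]
```

Here `reduced` keeps 200 rows and only the features `a` and `b` plus the target.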
Posted By: GPEngine

Re: Slow backtest performance using R - 07/22/15 15:16

Last thing, regarding the last point: be careful not to introduce future leakage by, say, selecting features for the 1st cycle's model based on performance in the 10th cycle. ;)
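
One way to stay on the safe side is to run the feature selection inside each cycle, on that cycle's training rows only. A rough sketch with made-up data follows. The selector `select_features`, the cycle sizes, and the use of `lm` are all assumptions for illustration, not a recommended method.

```r
set.seed(7)
n <- 600
data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
data$target <- data$x1 + rnorm(n)      # only x1 carries signal in this toy set

# Hypothetical selector: keep the k features most correlated with the target.
select_features <- function(d, k = 2) {
  feats <- setdiff(names(d), "target")
  score <- sapply(feats, function(f) abs(cor(d[[f]], d$target)))
  names(sort(score, decreasing = TRUE))[seq_len(k)]
}

# Three walk-forward cycles: train on 150 rows, test on the following 50.
results <- lapply(0:2, function(i) {
  train_rows <- i * 150 + 1:150
  test_rows  <- i * 150 + 151:200
  # Selection sees this cycle's training rows only, never later data.
  feats <- select_features(data[train_rows, ])
  model <- lm(reformulate(feats, response = "target"), data = data[train_rows, ])
  pred  <- predict(model, newdata = data[test_rows, ])
  list(features = feats, rmse = sqrt(mean((pred - data$target[test_rows])^2)))
})
```

Each cycle may end up with a different feature set, which is the price of not looking ahead.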
Posted By: jcl

Re: Slow backtest performance using R - 07/23/15 08:34

From our experiments so far, about 90% of the training time is spent generating the models in R. This will be faster with the next Zorro update, which can run several R sessions in parallel, using multiple CPU cores.

Using covariance for eliminating features is a good idea.
Posted By: GPEngine

Re: Slow backtest performance using R - 07/24/15 07:33

For feature selection also look at PCA and ICA.
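
A small PCA sketch with base R's `prcomp`, on made-up data. The 95% variance cutoff and all variable names are assumptions. ICA is not part of base R and needs an add-on package such as "fastICA", so it is left out here.

```r
set.seed(3)
n <- 500
latent <- rnorm(n)
x <- data.frame(
  f1 = latent + rnorm(n, sd = 0.1),    # f1 and f2 are nearly the same signal
  f2 = latent + rnorm(n, sd = 0.1),
  f3 = rnorm(n)                        # independent feature
)

# Fit PCA on centered, scaled features.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# Keep the smallest number of components explaining at least 95% of the variance.
explained  <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k          <- which(explained >= 0.95)[1]
components <- pca$x[, seq_len(k), drop = FALSE]
```

In this toy set two components are enough to replace the three features. To avoid the leakage mentioned earlier in the thread, fit `prcomp` on the training rows of a cycle only and project the test rows with `predict(pca, newdata = ...)`.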