OK, thanks for the reply. That's an important distinction.

What conclusions could I draw if I used a relatively large number of WFO cycles, with, say, only 30-40 trades per asset, and got good performance in the WFO test? The obvious concern is that the strategy would be curve-fitted to each training cycle, but in my mind this shouldn't matter, because every test cycle that makes up the WFO test is out of sample.

It therefore seems to make sense to test various numbers of WFO cycles and pick the number that gives good performance and stays stable when the numbers adjacent to it are tested (as we do when optimizing the strategy's parameters). That number would then set the retraining interval in live trading.
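
To make the selection step concrete, here is a rough Python sketch of what I have in mind. It is only an illustration: the function name `pick_stable_cycle_count`, the `radius` parameter, and the performance numbers are all made up, and "stable" is assumed to mean "the worst score among the adjacent cycle counts is still good".

```python
def pick_stable_cycle_count(scores, radius=1):
    """Pick the cycle count whose neighbourhood has the best worst-case score.

    scores: dict mapping number of WFO cycles -> WFO test performance
            (hypothetical metric, higher is better).
    radius: how many adjacent cycle counts on each side must also hold up.
    """
    counts = sorted(scores)
    best_count, best_floor = None, float("-inf")
    for i, n in enumerate(counts):
        lo, hi = i - radius, i + radius
        if lo < 0 or hi >= len(counts):
            continue  # skip edge values that lack a full set of neighbours
        window = [scores[counts[j]] for j in range(lo, hi + 1)]
        floor = min(window)  # worst result in the neighbourhood
        if floor > best_floor:
            best_count, best_floor = n, floor
    return best_count, best_floor


# Made-up results: 9 cycles has the highest score, but its neighbours are poor.
scores = {4: 1.1, 5: 1.3, 6: 1.4, 7: 1.35, 8: 1.0, 9: 2.1, 10: 0.6, 11: 0.9}
chosen, floor = pick_stable_cycle_count(scores)
print(chosen, floor)
```

With these numbers the isolated peak at 9 cycles is rejected and 6 is chosen, because 5, 6 and 7 all perform reasonably. Note that the choice is made from the WFO test results themselves, which is part of what I'm asking about.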

Is this a statistically robust approach to WFO testing?