Loading the options chain with additional data

Posted By: asdvao

Loading the options chain with additional data - 09/18/17 08:49

Hi, new Zorro user here. The software seems quite promising so far, but I'm struggling with how to load the data I want to work with.

I have a few ideas for options trading strategies that I would like to try out. The historical data that I have already includes some of the greeks and other parameters calculated by the source, and I've added some calculations of my own. In total this amounts to over a dozen potential data columns (not all of which will necessarily be useful).

I realize that Zorro's T8 (CONTRACT) format supports 2 extra fields besides the price data. Would it be possible to load data consisting of "richer" records?
Could the system be extended with a custom data type used instead of the CONTRACT, so that the classical Zorro backtesting functionality could be used - but with more data available for the decisions (and machine learning)? If so, what would need to be extended?
Posted By: jcl

Re: Loading the options chain with additional data - 09/18/17 11:07

The CONTRACT struct has a fixed structure. For more data, add datasets and store the data in them. But I would not store greeks. Better calculate the greeks directly.
Posted By: asdvao

Re: Loading the options chain with additional data - 09/19/17 10:33

If using datasets, are you suggesting simply loading the dataset, and (for the backtest) looping through it, and then retrieving individual values via dataVar/dataInt functions?

As for the greeks, my thought was that it may be beneficial to store them to speed up the backtest. Furthermore, I already have a dataset containing them, as well as some other pieces of information obtained from the broker, so I was considering testing a few things using those. Can you explain why you would prefer to calculate the greeks on the fly?
Posted By: asdvao

Re: Loading the options chain with additional data - 09/27/17 15:53

What would be the appropriate way to backtest while using additional data contained in a data set?

Is it somehow possible to run the backtest directly off the dataset? That is, is it possible to use the contractUpdate on the dataset containing more data columns? In that case, is it correct to presume that the first few fields have to be in the same order as in the T8/CONTRACT struct definition?

If not, would the most appropriate way be to load the full dataset, and save the subset of fields that belong to the CONTRACT struct to a T8 file? Then, on each invocation of the run function during a backtest, one would need to look up the additional information in the full dataset?
Posted By: Zheka

Re: Loading the options chain with additional data - 09/28/17 00:02

I have the exact same problem and questions! IB streams pre-calculated greeks, and I would guesstimate that using those is more efficient than calculating them locally for a number of strikes.

I too wanted to suggest adding 2-3 more fields for the greeks to the T8/CONTRACT struct definition (or making it extendable by the user) and verifying that the IB plug-in can also populate them. Backtesting would be more efficient, and RT trading too.
Posted By: jcl

Re: Loading the options chain with additional data - 09/28/17 06:05

You have no limit for the data. When your options data is in a dataset options.t8, put the greeks and other data in a parallel dataset greeks.t99. To retrieve all greeks that belong to the current options chain, use dataFind() with the current date. It returns the row with the additional data for the first contract of the chain. The other contracts are in the next rows.

We will make this easier in a future version by storing the dataset row number of a found contract.
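
The lookup described above can be pictured with a small plain-C sketch. It does not call Zorro's dataFind() and does not reproduce its exact matching rule; it only illustrates the idea of finding the first row of a chain by date and then walking the following rows. The GreeksRow struct and both function names are hypothetical, and the rows are assumed to be grouped by date, one group per options chain.

```c
/* Hypothetical record of the parallel greeks dataset:
   a date stamp plus a few extra fields. */
typedef struct {
    double date;   /* day number of the chain snapshot */
    float  delta;
    float  theta;
} GreeksRow;

/* Return the index of the first row whose date equals `date`,
   or -1 if no row has that date. */
int find_chain_start(const GreeksRow *rows, int count, double date)
{
    for (int i = 0; i < count; i++)
        if (rows[i].date == date)
            return i;
    return -1;
}

/* Count the rows of the chain that starts at `start`,
   i.e. all consecutive rows sharing its date. */
int chain_length(const GreeksRow *rows, int count, int start)
{
    if (start < 0 || start >= count)
        return 0;
    int n = 0;
    while (start + n < count && rows[start + n].date == rows[start].date)
        n++;
    return n;
}
```

With the start row and the chain length known, the extra data of the k-th contract of the chain sits at row start + k, provided the greeks dataset is stored in the same order as the options dataset.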
Posted By: asdvao

Re: Loading the options chain with additional data - 09/28/17 11:45

Thanks jcl, this clears it up somewhat.

Suppose I want to try a simple strategy using the additional data. Suppose I have options.t8 and options_with_additional.t99, where the latter is simply a superset of the former (i.e. the T99 also includes, besides the additional data, the data that's already in the T8).

I'd first load both datasets, and update the contract chain with contractUpdate from the T8 dataset.

After that, I need to find the date for which I'm wanting to trade, and get the appropriate rows from the dataset using dataFind and the bar date (wdate?). Then, I'd loop thru the dataset, and figure out which contract(s) I want to trade. With this information, for each contract I want to trade, I'd create the CONTRACT struct, and try to find a matching contract within the chain (contract() function), and enter the appropriate positions.

Would this be a sensible approach?

Alternatively, from what I can gather from your answer, you were suggesting going through the contracts first, and then using dataFind to get the appropriate greeks. In this approach, it's not clear how I would loop thru all the possible contracts to identify the ones that I want to trade.
Posted By: Zheka

Re: Loading the options chain with additional data - 09/28/17 14:05

I somehow missed the power of free-form datasets! I went into the details, and here are my questions:

1. Would "99" in suggested "greeks.t99" typically be a Handle?
2.
Quote:
In [Test] or [Train] mode the chain is copied from a dataset containing CONTRACT records, which is either automatically obtained from a Name.t8 file, or copied from a previously loaded dataset with the given Handle

What should be the format of the records in a dataset with a list of CONTRACT structs to be used in contractUpdate()?

3. You seem to favor using big single files to store all of the options/ greeks data for an asset. Why? Would storing data in many files e.g. 'by expiry' be less efficient in practice?
Or is it because of the file name convention used?

4. Do I understand correctly that *.t8 files would update in real time but a "dataset" would not?

IB streams greeks and IV together with prices in RT:
https://interactivebrokers.github.io/tws-api/option_computations.html#gsc.tab=0

So, why not just add 4-5 fields to the *.t8 struct (and avoid having to use several datasets, locating matching records, etc.)?
Posted By: Zheka

Re: Loading the options chain with additional data - 09/28/17 14:16

I also would like to clarify the example on the "Datasets" page in the manual:
Quote:
// parse iVolatility historical option chain data and store the resulting array
void main()
{
  string Format = "+,,%m/%d/%y,,,i,f,s,s,f,f,f,,f";
  int records = dataParse(1,Format,"iVolatility_SPY_2014_1.csv");
  records += dataParse(1,Format,"iVolatility_SPY_2014_2.csv");
  records += dataParse(1,Format,"iVolatility_SPY_2015_1.csv");
  records += dataParse(1,Format,"iVolatility_SPY_2015_2.csv");
  records += dataParse(1,Format,"iVolatility_SPY_2016_1.csv");
  printf("\n%d records parsed",records);
  dataSave(1,"SPY_Options.t8");
}

The Format is different from the *.t8 struct. How would it map to it?
Posted By: Zheka

Re: Loading the options chain with additional data - 10/02/17 09:36

Hi, Jcl,

Would greatly appreciate your answers!

Best
Z.
Posted By: jcl

Re: Loading the options chain with additional data - 10/02/17 09:55

The ".t8" just means a dataset file with 8 fields plus the date. It is not always the CONTRACT format. For avoiding confusion, we'll modify that example.

To the other questions: Yes, splitting options data over many datasets by their expiration dates would be less efficient, I suppose, if not to say crazy. Historical files are either not split, or split by years. Files or datasets are not updated unless you update them. We cannot add 4-5 fields to the *.t8 file because it would then be a *.t12 or *.t13 file.
Posted By: asdvao

Re: Loading the options chain with additional data - 10/02/17 10:44

When I have an additional dataset loaded containing the greeks, how would you suggest finding the row related to the contract that I would like to trade?

Suppose I have an options chain for that date loaded (from T8 data), and I have a parallel dataset containing additional data.

1. How can I loop thru the various available contracts in the current chain loaded from T8 data? The examples that I found just suggest an option to select a given contract in the chain, given its parameters (strike, CALL/PUT, etc).
2. How can I relate each of those to the appropriate row in my additional dataset?
Posted By: jcl

Re: Loading the options chain with additional data - 10/02/17 10:57

The simplest way would be using the ContractRow variable that contains the dataset row of the selected contract. It is available in the latest Zorro beta version.
Posted By: Zheka

Re: Loading the options chain with additional data - 10/02/17 12:26

Jcl,
sorry for insisting, but I do want to get a better understanding of things.

1. IB is Zorro's only integration for trading options (at the moment anyway), and IB streams greeks via its API.

- Why would you not want to receive and store them in RT together with prices in a CONTRACT structure?
There does not seem to be extra overhead related to receiving them alongside prices... Is it because you tried them and judged them to be inaccurate?
Is it because calculating them locally would still be faster? (It would definitely not be so for numerous backtests during development.)
Please share your thinking.

2. Why would we prefer to store 4-5 fields (option contract specs) TWICE, once in a CONTRACT struct and once in a related dataset with greeks, AND waste time looping through these fields to locate the needed line in a dataset (with a ContractRow)?

What's the tradeoff you are facing that necessitates sticking to an 8-field contract structure rather than a 12-field one?

Thanks for your patience and clarifications.

Best
Z.
Posted By: AndrewAMD

Re: Loading the options chain with additional data - 10/02/17 12:47

For the record, the Ally Invest plugin also trades options. I wrote it as a community contribution.
Posted By: jcl

Re: Loading the options chain with additional data - 10/02/17 14:46

Historical data is very large, so the size of the structs matters. We had to be careful to include only fields that are really needed and used. No user so far wanted greek fields in historical data. Some vendors do offer history with greeks, but at twice the price of normal options history. I hope this answers the question.
Posted By: Zheka

Re: Loading the options chain with additional data - 10/02/17 17:20

If someone has reached the point of backtesting an algorithmic options strategy and has obtained the data, they would "normally" use the greeks sooner rather than later.
Having to store and process the same data twice just magnifies the "size" problem for a typical user.

Some more questions:
1. What should the sequence of fields in a dataset be for ContractRow to work?
2.
Quote:
contract (int Type, int Days, var Strike): CONTRACT*

Will ContractRow get set to the contract found with this function? If not, what is the suggested way of setting it when calling this function?

3. Using contractUpdate():
- to improve efficiency, it would be great to be able to limit retrieved chains to "max N-days" from now and "X-stds" strikes from current price, or do this manually - as per manual.
Quote:
2.
In [Test] or [Train] mode the chain is copied from a dataset containing CONTRACT records, which is either automatically obtained from a Name.t8 file, or copied from a previously loaded dataset with the given Handle

What should be the sequence of fields in such dataset?

4. What's the principle of sequencing records in a chain - from 1..to NumContracts?

5. Which CONTRACT struct fields are populated by contractPrice() in RT?
Will the price of the underlying get updated?
I am trading spot FX, but would like to implement a strategy with options on futures.

Best
Z.
Posted By: jcl

Re: Loading the options chain with additional data - 10/02/17 17:37

We've written all sorts of options systems for clients, but never had to "store and process some data twice" - and I would not hire a programmer who does that without need. Yes, contract() sets ContractRow. The sequence of fields in a CONTRACT struct can be found in trading.h. The sequence of options in an options chain is determined by the broker or the historical data. contractPrice() sets the contract ask and bid price. Whether the underlying is also updated depends on the broker; IB does it, other brokers might not.
Posted By: Zheka

Re: Loading the options chain with additional data - 10/02/17 20:18

Originally Posted By: jcl
We've written all sorts of options systems for clients, but never had to "store and process some data twice"

I truly seek to understand and would really appreciate your thinking, experience, best practice, and considerations that seem obvious to you, but might not to us.

Quote:
Yes, contract() sets ContractRow.

It is documented for (type,Expiry,strike) but not for (type,Days,strike).

Quote:
The sequence of fields in a CONTRACT struct can be found in trading.h.

In what sequence do the fields need to be in the 'greeks' dataset (for ContractRow to work)? The same as in CONTRACT?

Quote:
If the underlying is also updated depends on the broker; IB does it

Ok. IB also streams greeks, and your opinion of or experience with this, if any, would be very helpful and save time.
Is this something you think is worth considering?
Posted By: jcl

Re: Loading the options chain with additional data - 10/03/17 08:09

From your questions it seems that something was still confused or misunderstood. Your greeks dataset is created and used by you. So I do not know its sequence of fields, which is up to you. How to create and use a dataset is described in the manual. The only number you need is ContractRow, for getting the right row.

I have no experience with the greeks from IB, so I cannot comment on them. They are used for manual trading, so I assume that they are ok.
Posted By: Zheka

Re: Loading the options chain with additional data - 10/03/17 13:11

My understanding is as follows:

- each row in a 'greeks' dataset must contain contract identifier fields for expiry, strike, and type
- having found the correct contract in a list of *.t8 records, Zorro has to search through these fields in the 'greeks' dataset to get the correct ContractRow.

Both the *.t8 records and the 'greeks' dataset would most probably derive from the same file with historical options data, and have similar sequences of fields, but that's not a given.

So, where am I wrong and how does it actually happen?
Posted By: jcl

Re: Loading the options chain with additional data - 10/03/17 14:59

I do not really understand what you want to do with contract identifiers in your greeks dataset. The contract identifiers must be stored in a .t8 file of CONTRACT structs, not in your greeks file. You store both datasets in the same order and look up the greeks of a selected contract with the ContractRow variable that gives you the dataset row. What exactly is the problem?

Posted By: Zheka

Re: Loading the options chain with additional data - 10/03/17 21:15

I've re-read the manual one more time, but still cannot grasp how ContractRow can magically find the correct row in a separate, user-defined/freehand dataset, unless both the *.t8 and the *greeks* datasets have certain fields in common (besides the timestamp). How?

An example of how a 'greeks' dataset can typically be structured would really help.
Posted By: jcl

Re: Loading the options chain with additional data - 10/04/17 07:07

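// when building the greeks dataset (handle 2): write each contract's greeks
// to the same row that the contract occupies in the options dataset
dataSet(2,CurrentRow,0,Delta);
dataSet(2,CurrentRow,1,Theta);

...

// in the strategy: after selecting a contract, read its greeks via ContractRow
Delta = dataVar(2,ContractRow,0);
Theta = dataVar(2,ContractRow,1);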
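
The point of the two snippets above is that the row index is the only link between the two datasets. A minimal plain-C sketch of that idea follows, using two in-memory tables instead of Zorro datasets. All names here (ContractKey, add_record, select_contract, greek_at) are hypothetical stand-ins, and the type codes passed to them are arbitrary numbers, not Zorro's contract type flags.

```c
#define MAX_ROWS     64
#define GREEK_FIELDS 2    /* field 0 = delta, field 1 = theta in this sketch */

/* Reduced stand-in for the identifying fields of a CONTRACT record. */
typedef struct {
    long  type;
    long  expiry;
    float strike;
} ContractKey;

static ContractKey contracts[MAX_ROWS];        /* plays the role of the .t8 dataset */
static float greeks[MAX_ROWS][GREEK_FIELDS];   /* parallel table, same row order    */
static int   num_rows = 0;

/* Append a contract and its greeks at the same row in both tables.
   Returns the row index, or -1 if the tables are full. */
int add_record(long type, long expiry, float strike, float delta, float theta)
{
    if (num_rows >= MAX_ROWS)
        return -1;
    int row = num_rows++;
    contracts[row].type   = type;
    contracts[row].expiry = expiry;
    contracts[row].strike = strike;
    greeks[row][0] = delta;
    greeks[row][1] = theta;
    return row;
}

/* Find a contract by its identifiers and return its row, or -1.
   The returned row plays the role that ContractRow plays in the thread. */
int select_contract(long type, long expiry, float strike)
{
    for (int row = 0; row < num_rows; row++)
        if (contracts[row].type == type &&
            contracts[row].expiry == expiry &&
            contracts[row].strike == strike)
            return row;
    return -1;
}

/* Read one greek of the contract stored at `row`. */
float greek_at(int row, int field)
{
    return greeks[row][field];
}
```

Note that the greeks table holds no contract identifiers at all. The search runs only over the contract table, and the resulting row is reused as-is for the second table.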
Posted By: Zheka

Re: Loading the options chain with additional data - 10/05/17 15:13

Ah! First get a ContractRow and then use it to save data to the dataset!

Thanks!

I believe this should be documented, with an example of how to use it when parsing a big options data file into price/*.t8 and greeks/etc files.
Posted By: asdvao

Re: Loading the options chain with additional data - 10/18/17 09:20

Thanks for all the clarifications JCL, I was able to nicely connect my additional data with the T8 data, and retrieve as desired.

I'm able to access the appropriate entries with the dataVar and similar functions, but it seems that it would be useful (mainly to enhance code manageability), to retrieve the whole record within the dataset as a struct.

Say I have a struct CONTRACT_EXT, containing the definition of my record. Is it possible to retrieve the address of the dataset? If it were possible, I could do something like

CONTRACT_EXT * c = baseDatasetAddress + ContractRow;

And then access the data more intuitively, such as c->impl_vol, as opposed to using something like dataVar(2, ContractRow, 7);

Furthermore, is it somehow possible to loop thru all the contracts in a chain for the current bar date?
Posted By: jcl

Re: Loading the options chain with additional data - 10/18/17 10:31

Yes, dataStr delivers the pointer to the record.

CONTRACT_EXT * c = dataStr(Handle,ContractRow,0);
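
Viewing a record through a struct pointer only works if the struct has exactly the same size and field order as the stored record. A plain-C sketch of that requirement follows, assuming a record made of one 8-byte timestamp followed by four 4-byte float fields. The field names, write_record and record_at are hypothetical; the sketch uses a raw byte buffer instead of a Zorro dataset and does not call dataStr.

```c
#include <string.h>

#define EXT_FIELDS 4   /* float fields per record in this sketch */

/* Hypothetical record layout: timestamp first, then the float fields. */
typedef struct {
    double time;
    float  delta;
    float  theta;
    float  vega;
    float  impl_vol;
} CONTRACT_EXT;

/* The struct must match the record size exactly, with no padding. */
_Static_assert(sizeof(CONTRACT_EXT) == sizeof(double) + EXT_FIELDS * sizeof(float),
               "CONTRACT_EXT does not match the record size");

/* Store one record field by field, the way a generic dataset would.
   `base` must be suitably aligned for CONTRACT_EXT. */
void write_record(unsigned char *base, int row, double time, const float *fields)
{
    unsigned char *rec = base + (size_t)row * sizeof(CONTRACT_EXT);
    memcpy(rec, &time, sizeof time);
    memcpy(rec + sizeof time, fields, EXT_FIELDS * sizeof(float));
}

/* View the record at `row` as a struct instead of reading single fields. */
const CONTRACT_EXT *record_at(const unsigned char *base, int row)
{
    return (const CONTRACT_EXT *)(base + (size_t)row * sizeof(CONTRACT_EXT));
}
```

If the record had, say, three float fields instead of four, many compilers would pad the struct to 24 bytes while the record stays at 20, and every row after the first would be read at the wrong offset. The size check guards against that.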