In this article we are going to revisit the concept of building a trading strategy backtest based on mean reverting, co-integrated pairs of stocks. So to restate the theory, stocks that are statistically co-integrated move in a way that means when their prices start to diverge by a certain amount (i.e. the spread between the 2 stocks prices increases), we would expect that divergence to

eventually revert back to the mean. In this instance we would look to sell the outperforming stock,and buy the under performing stock in our expectance that the under performing stock would eventually “catch up” with the overpeforming stock and rise in price, or vice versa the overperforming stock would in time suffer from the same downward pressure of the underperforming stock and fall in relative value.

Hence, pairs trading is a market neutral trading strategy enabling traders to profit from virtually any market conditions: uptrend, downtrend, or sideways movement.

So in our search for co-integrated stocks, economic theory would suggest that we are more likley to find pairs of stocks that are driven by the same factors, if we search for pairs that are drawn from similar/the same industry. After all, it is logical to expect

2 stocks in the technology sector that produce similar products, to be at the mercy of the same general ups and downs of the industry environment. So for this particular backtest I will be scraping a load of tech stock tickers from the web and then using Pandas data-reader to download daily data for those stocks. (n.b. Pandas data-reader has been facing some serious issues recently and in effect the Yahoo and Google APIs are no longer fit for purpose and are no longer working. Instead I shall use “iex” provider, which offers daily data for a maximum of a 5 year historical period. I see 5 years as being more than long enough for our purposes.)

I started this blog a few years ago, and one of my very first blog series was on this exact subject matter – mean reversion based pairs trading. So why approach it again and repeat myself? Well this time I am going to add a few more elements that were not present in the initial blog series.

I am going to

- Create a heatmap of co-integrated pairs so we can visually see the level of cointegration between any and all pairs that we are concerning ourselves with.
- Introduce the concept of a “Kalman Filter” when considering the spread series which will give us our trading signal.
- Add the concept of a “training set” of data, and a “test set” of data – seperating the two. What this helps us avoid is “look back” bias, whereby we would incorrectly test co-integration over the full set of data that we have, and also run our backtest of trading over the same data. Afetr all, how would we be able to both

run a co-integration test over a time period to select our trading pairs, and then travel back in time and carry out trades over the same time period. That’sjust not possible, however it is one subtle pitfall that many, many systematic traders fall into.

So what is a Kalman Filter? Well I this site (click here) explains the concept and shows examples in the clearest manner that I have yet to find while searching online. It states the following:

You can use a Kalman filter in any place where you have uncertain information about some dynamic system, and you can make an educated guess about what the system is going to do next. Even if messy reality comes along and interferes with the clean motion you guessed about, the Kalman filter will often do a very good job of figuring out what actually happened. And it can take advantage of correlations between crazy phenomena that you maybe wouldn’t have thought to exploit!

Kalman filters are ideal for systems which are continuously changing. They have the advantage that they are light on memory (they don’t need to keep any history other than the previous state), and they are very fast, making them well suited for real time problems and embedded systems.

So lets start to import the relevant modules we will need for our strategy backtest:

##### import the necessary modules and set chart style#### import numpy as np import pandas as pd import seaborn as sns import matplotlib as mpl mpl.style.use('bmh') import pandas_datareader.data as web import matplotlib.pylab as plt from datetime import datetime import statsmodels.api as sm from pykalman import KalmanFilter from math import sqrt

And lets use the Pandas and the data-reader module to scrape the relevant tech stock tickers from the www.marketwatch.com website.

#scrape html from website and store 3rd DataFrame as our stock tickers - this is dictated to us by the structure of the html stock_list = pd.read_html("https://www.marketwatch.com/tools/industry/stocklist.asp?bcind_ind=9535&bcind_period=3mo")[3] #convert the DataFrame of stocks into a list so we can easily iterate over it stocks = stock_list[1].dropna()[1:].tolist() #set empty list o hold the stock price DataFrames that we can later concatenate into a master frame df_list = [] #not all stocks will return data so set up an empty list to store the stock tickers that actually successfully returns data used_stocks = [] #iterate over stock tickers in list and download relevant data, storing said data and successfully downloaded tickers along the way for stock in stocks: try: data = pd.DataFrame(web.DataReader(stock,data_source='iex',start='01/01/2013')['close']) data.columns = [stock] df_list.append(data) used_stocks.append(stock) except: pass #concatenate list of individual tciker price DataFrames into one master DataFrame df = pd.concat(df_list,axis=1)

Lets plot the resulting DataFrame of price data just to make sure we have what we need and as a quick sanity check:

df.plot(figsize=(20,10))

Ok so it looks from the chart as if we have around price data downloaded for around 25-30 stocks; this should be more than enough to find at least a couple of co-integrated pairs to run our backtest over.

We will now define a quick function that will run our stocks, combining them into pairs one by one and running co-integration tests on each pair. That result will then be stored in a matrix that we initialise,

and then we will be able to plot that matrix as a heatmap. Also, if the co-integration test meets our threshold statistical significance (in our case 5%), then that pair of stock tciokers will be stored in a list for later retrieval.

#NOTE CRITICAL LEVEL HAS BEEN SET TO 5% FOR COINTEGRATION TEST def find_cointegrated_pairs(dataframe, critial_level = 0.05): n = dataframe.shape[1] # the length of dateframe pvalue_matrix = np.ones((n, n)) # initialize the matrix of p keys = dataframe.columns # get the column names pairs = [] # initilize the list for cointegration for i in range(n): for j in range(i+1, n): # for j bigger than i stock1 = dataframe[keys[i]] # obtain the price of "stock1" stock2 = dataframe[keys[j]]# obtain the price of "stock2" result = sm.tsa.stattools.coint(stock1, stock2) # get conintegration pvalue = result[1] # get the pvalue pvalue_matrix[i, j] = pvalue if pvalue < critial_level: # if p-value less than the critical level pairs.append((keys[i], keys[j], pvalue)) # record the contract with that p-value return pvalue_matrix, pairs

Let’s now run our data through our function, save the results and plot the heatmap:

#set up the split point for our "training data" on which to perform the co-integration test (the remaining dat awill be fed to our backtest function) split = int(len(df) * .4) #run our dataframe (up to the split point) of ticker price data through our co-integration function and store results pvalue_matrix,pairs = find_cointegrated_pairs(df[:split]) #convert our matrix of stored results into a DataFrame pvalue_matrix_df = pd.DataFrame(pvalue_matrix) #use Seaborn to plot a heatmap of our results matrix fig, ax = plt.subplots(figsize=(15,10)) sns.heatmap(pvalue_matrix_df,xticklabels=used_stocks,yticklabels=used_stocks,ax=ax)

So we can see from the very dark red squares that it looks as though there are indeed a few pairs of stocks who’s co-integration score is below the 5% threshold

hardcoded into the function we defined. To see more explicitly which pairs these are, let’s print out our list of stored pairs that was part of the fucntion results we stored:

for pair in pairs: print("Stock {} and stock {} has a co-integration score of {}".format(pair[0],pair[1],round(pair[2],4)))

Stock AKAM and stock TCX has a co-integration score of 0.027 Stock AKAM and stock YNDX has a co-integration score of 0.0484 Stock BIDU and stock WEB has a co-integration score of 0.0377 Stock WIFI and stock JCOM has a co-integration score of 0.0039 Stock WIFI and stock LLNW has a co-integration score of 0.0187 Stock WIFI and stock NTES has a co-integration score of 0.0215 Stock WIFI and stock SIFY has a co-integration score of 0.0026 Stock WIFI and stock YNDX has a co-integration score of 0.0092 Stock ENV and stock SINA has a co-integration score of 0.0294 Stock IPAS and stock LLNW has a co-integration score of 0.0199 Stock IPAS and stock EGOV has a co-integration score of 0.0405 Stock JCOM and stock SIFY has a co-integration score of 0.0388 Stock LLNW and stock NTES has a co-integration score of 0.0109 Stock LLNW and stock TCX has a co-integration score of 0.032

We will now use the “pykalman” module to set up a couple of functions that will allow us to generate Kalman filters which we will apply to our data and in turn our regression that is fed the said data.

I will also define a function for “Halflife” which just recycles some tof the code from my mean reversion pairs trading blog post from a couple of years ago, which can be found

here.

def KalmanFilterAverage(x): # Construct a Kalman filter kf = KalmanFilter(transition_matrices = [1], observation_matrices = [1], initial_state_mean = 0, initial_state_covariance = 1, observation_covariance=1, transition_covariance=.01) # Use the observed values of the price to get a rolling mean state_means, _ = kf.filter(x.values) state_means = pd.Series(state_means.flatten(), index=x.index) return state_means # Kalman filter regression def KalmanFilterRegression(x,y): delta = 1e-3 trans_cov = delta / (1 - delta) * np.eye(2) # How much random walk wiggles obs_mat = np.expand_dims(np.vstack([[x], [np.ones(len(x))]]).T, axis=1) kf = KalmanFilter(n_dim_obs=1, n_dim_state=2, # y is 1-dimensional, (alpha, beta) is 2-dimensional initial_state_mean=[0,0], initial_state_covariance=np.ones((2, 2)), transition_matrices=np.eye(2), observation_matrices=obs_mat, observation_covariance=2, transition_covariance=trans_cov) # Use the observations y to get running estimates and errors for the state parameters state_means, state_covs = kf.filter(y.values) return state_means def half_life(spread): spread_lag = spread.shift(1) spread_lag.iloc[0] = spread_lag.iloc[1] spread_ret = spread - spread_lag spread_ret.iloc[0] = spread_ret.iloc[1] spread_lag2 = sm.add_constant(spread_lag) model = sm.OLS(spread_ret,spread_lag2) res = model.fit() halflife = int(round(-np.log(2) / res.params[1],0)) if halflife <= 0: halflife = 1 return halflife

Now let us define our main “Backtest” function that we will run our data through. The fucntion takes one pair of tickers at a time, and then returns several outputs, namely the DataFrame of cumulative returns,

the Sharpe Ratio and the Compound Annual Growth Rate (CAGR). Once we have defined our function, we can iterate over our list of pairs and feed the relevant data, pair by pair, into the function, storing the outputs for each pair for

later use and retrieval.

def backtest(df,s1, s2): ############################################################# # INPUT: # DataFrame of prices # s1: the symbol of contract one # s2: the symbol of contract two # x: the price series of contract one # y: the price series of contract two # OUTPUT: # df1['cum rets']: cumulative returns in pandas data frame # sharpe: Sharpe ratio # CAGR: Compound Annual Growth Rate x = df[s1] y = df[s2] # run regression (including Kalman Filter) to find hedge ratio and then create spread series df1 = pd.DataFrame({'y':y,'x':x}) df1.index = pd.to_datetime(df1.index) state_means = KalmanFilterRegression(KalmanFilterAverage(x),KalmanFilterAverage(y)) df1['hr'] = - state_means[:,0] df1['spread'] = df1.y + (df1.x * df1.hr) # calculate half life halflife = half_life(df1['spread']) # calculate z-score with window = half life period meanSpread = df1.spread.rolling(window=halflife).mean() stdSpread = df1.spread.rolling(window=halflife).std() df1['zScore'] = (df1.spread-meanSpread)/stdSpread ############################################################## # trading logic entryZscore = 2 exitZscore = 0 #set up num units long df1['long entry'] = ((df1.zScore < - entryZscore) & ( df1.zScore.shift(1) > - entryZscore)) df1['long exit'] = ((df1.zScore > - exitZscore) & (df1.zScore.shift(1) < - exitZscore)) df1['num units long'] = np.nan df1.loc[df1['long entry'],'num units long'] = 1 df1.loc[df1['long exit'],'num units long'] = 0 df1['num units long'][0] = 0 df1['num units long'] = df1['num units long'].fillna(method='pad') #set up num units short df1['short entry'] = ((df1.zScore > entryZscore) & ( df1.zScore.shift(1) < entryZscore)) df1['short exit'] = ((df1.zScore < exitZscore) & (df1.zScore.shift(1) > exitZscore)) df1.loc[df1['short entry'],'num units short'] = -1 df1.loc[df1['short exit'],'num units short'] = 0 df1['num units short'][0] = 0 df1['num units short'] = df1['num units short'].fillna(method='pad') df1['numUnits'] = df1['num units long'] + df1['num units short'] df1['spread pct ch'] = (df1['spread'] - df1['spread'].shift(1)) / ((df1['x'] * abs(df1['hr'])) + df1['y']) df1['port rets'] = df1['spread pct ch'] * df1['numUnits'].shift(1) df1['cum rets'] = df1['port rets'].cumsum() df1['cum rets'] = df1['cum rets'] + 1 ############################################################## try: sharpe = ((df1['port rets'].mean() / df1['port rets'].std()) * sqrt(252)) except ZeroDivisionError: sharpe = 0.0 ############################################################## start_val = 1 end_val = df1['cum rets'].iat[-1] start_date = df1.iloc[0].name end_date = df1.iloc[-1].name days = (end_date - start_date).days CAGR = round(((float(end_val) / float(start_val)) ** (252.0/days)) - 1,4) df1[s1+ " "+s2] = df1['cum rets'] return df1[s1+" "+s2], sharpe, CAGR

So now let’s run our full list of pairs through our Backtest function, and print out some results along the way, and finally after storing the equity curve for each pair,

produce a chart that plots out each curve.

results = [] for pair in pairs: rets, sharpe, CAGR = backtest(df[split:],pair[0],pair[1]) results.append(rets) print("The pair {} and {} produced a Sharpe Ratio of {} and a CAGR of {}".format(pair[0],pair[1],round(sharpe,2),round(CAGR,4))) rets.plot(figsize=(20,15),legend=True)

The pair AKAM and TCX produced a Sharpe Ratio of 1.69 and a CAGR of 0.0701 The pair AKAM and YNDX produced a Sharpe Ratio of 1.95 and a CAGR of 0.0974 The pair BIDU and WEB produced a Sharpe Ratio of 0.48 and a CAGR of 0.0163 The pair WIFI and JCOM produced a Sharpe Ratio of 1.31 and a CAGR of 0.0866 The pair WIFI and LLNW produced a Sharpe Ratio of 1.22 and a CAGR of 0.13 The pair WIFI and NTES produced a Sharpe Ratio of 0.36 and a CAGR of 0.0385 The pair WIFI and SIFY produced a Sharpe Ratio of 1.7 and a CAGR of 0.0942 The pair WIFI and YNDX produced a Sharpe Ratio of 1.05 and a CAGR of 0.065 The pair ENV and SINA produced a Sharpe Ratio of 0.83 and a CAGR of 0.0163 The pair IPAS and LLNW produced a Sharpe Ratio of -0.72 and a CAGR of -0.1943 The pair IPAS and EGOV produced a Sharpe Ratio of 1.25 and a CAGR of 0.1504 The pair JCOM and SIFY produced a Sharpe Ratio of 0.0 and a CAGR of 0.0 The pair LLNW and NTES produced a Sharpe Ratio of 0.24 and a CAGR of 0.0338 The pair LLNW and TCX produced a Sharpe Ratio of 0.95 and a CAGR of 0.1073

Now we run a few extra lines of code to combine, equally weight, and print our our final equity curve:

#concatenate together the individual equity curves into a single DataFrame results_df = pd.concat(results,axis=1).dropna() #equally weight each equity curve by dividing each by the number of pairs held in the DataFrame results_df /= len(results_df.columns) #sum up the equally weighted equity curves to get our final equity curve final_res = results_df.sum(axis=1) #plot the chart of our final equity curve final_res.plot(figsize=(20,15))

#calculate and print our some final stats for our combined equity curve sharpe = (final_res.pct_change().mean() / final_res.pct_change().std()) * (sqrt(252)) start_val = 1 end_val = final_res.iloc[-1] start_date = final_res.index[0] end_date = final_res.index[-1] days = (end_date - start_date).days CAGR = round(((float(end_val) / float(start_val)) ** (252.0/days)) - 1,4) print("Sharpe Ratio is {} and CAGR is {}".format(round(sharpe,2),round(CAGR,4)))

Sharpe Ratio is 2.14 and CAGR is 0.06

## 19 comments

Hi, nice post!

PS: the link to Kalman filter does not work unfortunately.

That’s strange, it works for me…make sure you click the word “here” rather than “click”.

Great article!

I have found one issue: The first (halflife -1) entries in the meanSpread to be nan’s. This causes the first entries of df1.zScore to be nan’s and therefore the comparison with the entryZscore fails. How can I make this work?

Thanks and keep up the good work!

JC

I dont understand why you define and use 2 kalman fileter functions? I apreciatte your answer!

nice!

Great article! But the hedge ratio is changing every day, and in real situation, the hedge ratio is fixed while executing buy and sell trading, until long or short exit.

The hedge ratio should be online(should change every day)

Hello S666,

Firstly I would like to thank you for your very interesting posts on pair trading. I have two questions regarding your implementation:

1. You calculate the daily return when in position as: (spread – spread.lag(1)) / (x * hr + y). Can you please explain where it comes from and which position sizing you are assuming for each leg of the pair?

2. Your implementation of the Kalman Filter is to first filter x and y through a Kalman average (works like some sort of a moving average) and then feed the result to the main Kalman filter that calculates the hedge ratio and intercept. Could you please explain why is the hedge ration calculated on the smoothed prices rather than the true prices?

Thank you,

Nathan

Apologies for the delay – I shall get to this question and reply shortly!

Hi

I’m trying to build the spread slightly differently by adding the intercept as well. (i.e spread = stock1 – beta*stock2 -alpha). Just to confirm my alpha would be contained in state_means[:,1] is it? I’d assume so but wanted to double check

There is an error in the backtest function related to calculation of hedge ratio. It recalculates at each timestamp, i.e. it is assumed that position sizes are added/reduced every day (if it is a daily data). Though when you open the trades you fix the hedge ratio until you close them. It gives you an extra income. This error presents also in the source of your code (QI) as well. The true backtesting will not like the current one at all, unforunately. Don’t fall into that trap.

Hello, I am trying to replicate the portfolio as a way to improve my programming. However the download of the prices from yhaoo I think has been desabled.

What tools are your using to download the data now?

I have come across a module called https://github.com/JECSand/yahoofinancials but the way it downloads the data has a very complicated formatting and I am struggling to get the adjusted closes.

Thank you very much.

Hi Nick,

I think the Pandas Datareader Yahoo download has been “fixed” somewhat. Please refer to this post (http://pythonforfinance.net//2019/05/30/python-monte-carlo-vs-bootstrapping/) which is the latest on the blog – it uses Datareader to pull down the Adjusted Close for a number of tickers in one go.

Hopefully that gets you what you want. If you are still experiencing issues, let me know.

Hi, thanks for getting back to me.

I am using a list of tickers for all the technology stocks from the nasdaq. and I am using the formula

asset_universe = pd.DataFrame([web.DataReader(ticker, ‘yahoo’, start, end).loc[:, ‘Adj Close’] for ticker in clean_names],index=clean_names).T.fillna(method=’ffill’)

I get

Keyerror Data

Have you ever had the same error?

Hi Nick,

It’s a bit difficult to debug without having the full list of tickers you are using (so I can try to recreate the problem), or having the full error message. Do you have a ticker in your list named “Data” by any chance?

If you could post the full error message and also perhaps paste your list of tickers I can take a closer look.

I found this link on Google: https://github.com/pydata/pandas-datareader/issues/487

It suggests using the “fix_yahoo_finance” package to solves the problem – although the official fix should have been integrated into pandas_datareader.

You could either try updating your pandas_datareader with the following command in the command prompt:

Or you could follow the advice on the above link and add the below lines and your script should work.

Make sure you have pip installed fix_yahoo_finance already.

Hi and thank you for your post, it is very interesting approach!

Have you tried working with Futures? I would like to apply a similar logic to oil futures. For example you have the prices for September and December as pair AND you get the data for the Sep-Dec 2018,2017,2016 contracts and so on.

I would like to use for example the 2013-2017 historical timeseries as training set and then the 2018 timeseries as a test set. The problem is this is not a continuous timeseries, ie the 2013 might close with oil at Sep, Dec= 60, 55 and the 2014 might start at Sep, Dec= 80,75.

How would you merge and normalize these series together before feeding them into your model?

Thanks!

Hi Pete, thanks for your comment and thanks for the kind words – its nice to hear you find it of interest.

There are a number of ways to deal with creating a “continuous” futures contract but they all have their pros and cons – with one of the methods perhaps being seen as the “best” way forward (that would be the “perpetual” method).

There is a great blog post doing just what you are looking for – it runs through an example of combining oil futures contracts together – take a look here https://www.quantstart.com/articles/Continuous-Futures-Contracts-for-Backtesting-Purposes and if you have any follow up questions I would be more than happy to assist.

Thanks very much for your article, great material !

There is however one line I don’t understand:

where does this come from ?

Why not: (df1[‘spread’] – df1[‘spread’].shift(1)) / (df1[‘spread’].shift(1)) ?

Many thanks !