Introduction
Recently, I saw BuOu's Quantitative Diary mentioning that you can use negatively correlated currencies to select currencies, and open positions to make profits based on price difference breakthroughs. Digital currencies are basically positively correlated, and only a few currencies are negatively correlated, often with special market conditions, such as the independent market conditions of MEME coins, which are completely different from the market trend. These currencies can be selected and go long after the breakthrough. This method can make profits under specific market conditions. However, the most common method in the field of quantitative trading is to use positive correlation for paired trading. This article will introduce this strategy briefly.
Digital currency pair trading is a trading strategy based on statistical arbitrage, which simultaneously buys and sells two highly correlated cryptocurrencies to obtain profits from price deviations. This article will introduce the principles of this strategy, profit mechanism, methods of selecting currencies, potential risks and ways to improve them, and provide some practical Python code examples.
Strategy Principle
Pair trading strategies rely on the historical correlation between the prices of two digital currencies. When the prices of two currencies show a strong correlation, their price trends are generally in sync. If the price ratio between the two deviates significantly at a certain moment, it can be considered a temporary abnormality and the price will tend to return to normal levels. The digital currency market is highly interconnected. When a major digital currency (such as Bitcoin) fluctuates significantly, it will usually trigger a coordinated reaction in other digital currencies. Some currencies may have a very obvious positive correlation that can last due to the same investment institutions, the same market makers, and the same track. Some currencies are negatively correlated, but there are fewer negatively correlated currencies, and since they are all affected by the market trend, they will often have consistent market trends.
Assume that currency A and currency B have a high price correlation. At a certain moment, the average value of the A/B price ratio is 1. If at a certain moment, the A/B price ratio deviates by more than 0.001, that is, more than 1.001, then you can trade in the following ways: Open a long position on B and open a short position on A. On the contrary, when the A/B price ratio is lower than 0.999: Open a long position on A and open a short position on B.
The key to profitability lies in the spread gains when prices deviate from the mean and return to normal. Since price deviations are usually short-lived, traders can close their positions when prices return to the mean and profit from the spread.
Prepare the Data
Import the corresponding library
These codes can be used directly. It is best to download Anancoda and debug it in Jupyer notebook. It includes packages for commonly used data analysis directly.
import requests from datetime import date,datetime import time import pandas as pd import numpy as np import matplotlib.pyplotas plt import requests, zipfile, io %matplotlib inline
Get all trading pairs being traded
Info = requests.get('https://fapi.binance.com/fapi/v1/exchangeInfo') b_symbols = [s['symbol'] for s in Info.json()['symbols'] if s['contractType'] == 'PERPETUAL'and s['status'] == 'TRADING'and s['quoteAsset'] == 'USDT'] b_symbols = list(filter(lambda x: x[-4:] == 'USDT', [s.split('_')[0] for s in b_symbols])) b_symbols = [x[:-4] for x in b_symbols] print(b_symbols) # Get all trading pairs being traded
Download K-line function
The main function of the GetKlines function is to obtain the historical K-line data of the specified trading pair perpetual contract from the Binance exchange and store the data in a Pandas DataFrame. The K-line data includes information such as opening price, highest price, lowest price, closing price, and trading volume. This time we mainly use the closing price data.
defGetKlines(symbol='BTCUSDT',start='2020-8-10',end='2024-7-01',period='1h',base='fapi',v = 'v1'): Klines = [] start_time = int(time.mktime(datetime.strptime(start, "%Y-%m-%d").timetuple()))*1000 + 8*60*60*1000 end_time = min(int(time.mktime(datetime.strptime(end, "%Y-%m-%d").timetuple()))*1000 + 8*60*60*1000,time.time()*1000) intervel_map = {'m':60*1000,'h':60*60*1000,'d':24*60*60*1000} while start_time < end_time: time.sleep(0.3) mid_time = start_time+1000*int(period[:-1])*intervel_map[period[-1]] url = 'https://'+base+'.binance.com/'+base+'/'+v+'/klines?symbol=%s&interval=%s&startTime=%s&endTime=%s&limit=1000'%(symbol,period,start_time,mid_time) res = requests.get(url) res_list = res.json() iftype(res_list) == listandlen(res_list) > 0: start_time = res_list[-1][0]+int(period[:-1])*intervel_map[period[-1]] Klines += res_list iftype(res_list) == listandlen(res_list) == 0: start_time = start_time+1000*int(period[:-1])*intervel_map[period[-1]] if mid_time >= end_time: break df = pd.DataFrame(Klines,columns=['time','open','high','low','close','amount','end_time','volume','count','buy_amount','buy_volume','null']).astype('float') df.index = pd.to_datetime(df.time,unit='ms') return df
Download data
The data volume is relatively large. In order to download faster, only the hourly K-line data of the last three months is obtained. df_close contains the closing price data of all currencies.
start_date = '2024-04-01' end_date = '2024-07-05' period = '1h' df_dict = {} for symbol in b_symbols: print(symbol) if symbol in df_dict.keys(): continue df_s = GetKlines(symbol=symbol+'USDT',start=start_date,end=end_date,period=period) if not df_s.empty: df_dict[symbol] = df_s df_close = pd.DataFrame(index=pd.date_range(start=start_date, end=end_date, freq=period),columns=df_dict.keys()) for symbol in symbols: df_close[symbol] = df_dict[symbol].close df_close = df_close.dropna(how='all')
Backtesting Engine
We define an exchange object for the following backtest.
classExchange: def__init__(self, trade_symbols, fee=0.0002, initial_balance=10000): self.initial_balance = initial_balance #Initial assets self.fee = fee self.trade_symbols = trade_symbols self.account = {'USDT':{'realised_profit':0, 'unrealised_profit':0, 'total':initial_balance, 'fee':0, 'leverage':0, 'hold':0, 'long':0, 'short':0}} for symbol in trade_symbols: self.account[symbol] = {'amount':0, 'hold_price':0, 'value':0, 'price':0, 'realised_profit':0,'unrealised_profit':0,'fee':0} defTrade(self, symbol, direction, price, amount): cover_amount = 0if direction*self.account[symbol]['amount'] >=0elsemin(abs(self.account[symbol]['amount']), amount) open_amount = amount - cover_amount self.account['USDT']['realised_profit'] -= price*amount*self.fee #Deduction fee self.account['USDT']['fee'] += price*amount*self.fee self.account[symbol]['fee'] += price*amount*self.fee if cover_amount > 0: #Close the position first self.account['USDT']['realised_profit'] += -direction*(price - self.account[symbol]['hold_price'])*cover_amount #profit self.account[symbol]['realised_profit'] += -direction*(price - self.account[symbol]['hold_price'])*cover_amount self.account[symbol]['amount'] -= -direction*cover_amount self.account[symbol]['hold_price'] = 0if self.account[symbol]['amount'] == 0else self.account[symbol]['hold_price'] if open_amount > 0: total_cost = self.account[symbol]['hold_price']*direction*self.account[symbol]['amount'] + price*open_amount total_amount = direction*self.account[symbol]['amount']+open_amount self.account[symbol]['hold_price'] = total_cost/total_amount self.account[symbol]['amount'] += direction*open_amount defBuy(self, symbol, price, amount): self.Trade(symbol, 1, price, amount) defSell(self, symbol, price, amount): self.Trade(symbol, -1, price, amount) defUpdate(self, close_price): #Update the assets self.account['USDT']['unrealised_profit'] = 0 self.account['USDT']['hold'] = 0 self.account['USDT']['long'] = 0 self.account['USDT']['short'] = 0for symbol in self.trade_symbols: ifnot np.isnan(close_price[symbol]): self.account[symbol]['unrealised_profit'] = (close_price[symbol] - self.account[symbol]['hold_price'])*self.account[symbol]['amount'] self.account[symbol]['price'] = close_price[symbol] self.account[symbol]['value'] = self.account[symbol]['amount']*close_price[symbol] if self.account[symbol]['amount'] > 0: self.account['USDT']['long'] += self.account[symbol]['value'] if self.account[symbol]['amount'] < 0: self.account['USDT']['short'] += self.account[symbol]['value'] self.account['USDT']['hold'] += abs(self.account[symbol]['value']) self.account['USDT']['unrealised_profit'] += self.account[symbol]['unrealised_profit'] self.account['USDT']['total'] = round(self.account['USDT']['realised_profit'] + self.initial_balance + self.account['USDT']['unrealised_profit'],6) self.account['USDT']['leverage'] = round(self.account['USDT']['hold']/self.account['USDT']['total'],3)
Correlation Analysis to Filter Currencies
Correlation calculation is a method in statistics used to measure the linear relationship between two variables. The most commonly used correlation calculation method is the Pearson correlation coefficient. The following is the principle, formula and implementation method of correlation calculation. The Pearson correlation coefficient is used to measure the linear relationship between two variables, and its value range is between -1 and 1:
- 1 indicates a perfect positive correlation, where the two variables always change in sync. When one variable increases, the other also increases proportionally. The closer it is to 1, the stronger the correlation.
- -1 indicates a perfect negative correlation, where the two variables always change in opposite directions. The closer it is to -1, the stronger the negative correlation.
- 0 means no linear correlation, there is no straight line relationship between the two variables.
The Pearson correlation coefficient determines the correlation between two variables by calculating their covariance and standard deviation. The formula is as follows:
in which:
- is the Pearson correlation coefficient between variables X and Y.
- is the covariance of X and Y.
- and are the standard deviations of X and Y respectively.
Of course, you don't need to worry too much about how it is calculated. You can use 1 line of code in Python to calculate the correlation of all currencies. The figure shows a correlation heat map. Red represents positive correlation, blue represents negative correlation, and the darker the color, the stronger the correlation. You can see that most of the area is dark red, so the positive correlation of digital currencies is very strong.
import seaborn as sns corr = df_close.corr() plt.figure(figsize=(20, 20)) sns.heatmap(corr, annot=False, cmap='coolwarm', vmin=-1, vmax=1) plt.title('Correlation Heatmap of Cryptocurrency Closing Prices', fontsize=20);
Based on the correlation, the top 20 most correlated currency pairs are selected. The results are as follows. Their correlations are very strong, all above 0.99.
MANA SAND 0.996562 ICX ZIL 0.996000 STORJ FLOW 0.994193 FLOW SXP 0.993861 STORJ SXP 0.993822 IOTA ZIL 0.993204 SAND 0.993095 KAVA SAND 0.992303 ZIL SXP 0.992285 SAND 0.992103 DYDX ZIL 0.992053 DENT REEF 0.991789 RDNT MANTA 0.991690 STMX STORJ 0.991222 BIGTIME ACE 0.990987 RDNT HOOK 0.990718 IOST GAS 0.990643 ZIL HOOK 0.990576 MATIC FLOW 0.990564 MANTA HOOK 0.990563
The corresponding code is as follows:
corr_pairs = corr.unstack() # Remove self-correlation (i.e. values on the diagonal) corr_pairs = corr_pairs[corr_pairs != 1] sorted_corr_pairs = corr_pairs.sort_values(kind="quicksort") # Extract the top 20 most and least correlated currency pairs most_correlated = sorted_corr_pairs.tail(40)[::-2] print("The top 20 most correlated currency pairs are:") print(most_correlated)
Backtesting Verification
The specific backtest code is as follows. The demonstration strategy mainly observes the price ratio of two cryptocurrencies (IOTA and ZIL) and trades according to the changes in this ratio. The specific steps are as follows:
- Initialization:
- Define trading pairs (pair_a = 'IOTA', pair_b = 'ZIL').
- Create an exchange object
e
with an initial balance of $10,000 and a transaction fee of 0.02%. - Calculate the initial average price ratio
avg
. - Set an initial transaction value
value = 1000
.
- Iterate over price data:
- Traverse the price data at each time point
df_close
. - Calculates the deviation of the current price ratio from the mean
diff
. - The target transaction value is calculated based on the deviation
aim_value
, and one value is traded for every 0.01 deviation. Buying and selling operations are determined based on the current account position and price situation. - If the deviation is too large, execute sell
pair_a
and buypair_b
operations. - If the deviation is too small, buy
pair_a
and sellpair_b
operations are performed.
- Adjust the mean:
- Updates the average price ratio
avg
to reflect the latest price ratios.
- Update accounts and records :
- Update the position and balance information of the exchange account.
- Record the account status at each step (total assets, held assets, transaction fees, long and short positions) to
res_list
.
- Result output:
- Convert
res_list
to dataframeres
for further analysis and presentation.
pair_a = 'IOTA' pair_b = "ZIL" e = Exchange([pair_a,pair_b], fee=0.0002, initial_balance=10000) #Exchange definition is placed in the comments section res_list = [] index_list = [] avg = df_close[pair_a][0] / df_close[pair_b][0] value = 1000for idx, row in df_close.iterrows(): diff = (row[pair_a] / row[pair_b] - avg)/avg aim_value = -value * diff / 0.01if -aim_value + e.account[pair_a]['amount']*row[pair_a] > 0.5*value: e.Sell(pair_a,row[pair_a],(-aim_value + e.account[pair_a]['amount']*row[pair_a])/row[pair_a]) e.Buy(pair_b,row[pair_b],(-aim_value - e.account[pair_b]['amount']*row[pair_b])/row[pair_b]) if -aim_value + e.account[pair_a]['amount']*row[pair_a] < -0.5*value: e.Buy(pair_a, row[pair_a],(aim_value - e.account[pair_a]['amount']*row[pair_a])/row[pair_a]) e.Sell(pair_b, row[pair_b],(aim_value + e.account[pair_b]['amount']*row[pair_b])/row[pair_b]) avg = 0.99*avg + 0.01*row[pair_a] / row[pair_b] index_list.append(idx) e.Update(row) res_list.append([e.account['USDT']['total'],e.account['USDT']['hold'], e.account['USDT']['fee'],e.account['USDT']['long'],e.account['USDT']['short']]) res = pd.DataFrame(data=res_list, columns=['total','hold', 'fee', 'long', 'short'],index = index_list) res['total'].plot(grid=True);
A total of 4 groups of currencies were backtested, and the results were ideal. The current correlation calculation uses future data, so it is not very accurate. This article also divides the data into two parts, based on the previous calculation of correlation and the subsequent backtest trading. The results are a little different but not bad. We leave it to the user to practice and verify.
Potential Risks and Ways to Improve
Although the pair trading strategy can be profitable in theory, there are still some risks in actual operation: the correlation between currencies may change over time, causing the strategy to fail; under extreme market conditions, price deviations may increase, resulting in larger losses; the low liquidity of certain currencies may make transactions difficult to execute or increase costs; and the fees generated by frequent transactions may erode profits.
To reduce risks and improve the stability of strategies, the following improvement measures can be considered: regularly recalculate the correlation between currencies and adjust trading pairs in a timely manner; set stop loss and take profit points to control the maximum loss of a single transaction; trade multiple currency pairs at the same time to diversify risks.
Conclusion
The digital currency pair trading strategy achieves profit by taking advantage of the correlation of currency prices and performing arbitrage operations when prices deviate. This strategy has high theoretical feasibility. A simple live trading strategy source code based on this strategy will be released later. If you have more questions or need further discussion, please feel free to communicate.
From: Detailed Explanation of Digital Currency Pair Trading Strategy (fmz.com)