Foreword
A few days ago I noticed that the profit and loss curve in the FMZ strategy backtest results is fairly basic, so I considered exporting the return data and processing it myself to produce a more detailed evaluation report of the capital curve, displayed graphically. Once I started sketching out the idea, it turned out to be harder than expected, so I wondered whether someone had already built a tool for this. A search showed that such tools do exist. After looking at several projects on GitHub, I chose pyfolio.
What is pyfolio?
pyfolio is a Python library for performance and risk analysis of financial portfolios, developed by Quantopian Inc. It works well with the open-source backtesting library Zipline. Quantopian also provides an integrated hosted service for professionals that includes Zipline, Alphalens, pyfolio, FactSet data, and more.
The core of pyfolio is the so-called "tear sheet", a collection of independent plots that together give a comprehensive picture of a trading algorithm's performance.
GitHub address: https://github.com/quantopian/pyfolio
Learn to use pyfolio
Because there is little learning material for this tool online, it took me a long time to get comfortable with it.
pyfolio API reference:
https://www.quantopian.com/docs/api-reference/pyfolio-api-reference#pyfolio-api-reference
This page gives a more detailed introduction to pyfolio's API. The platform can backtest US stocks, and the backtest results can be displayed directly through pyfolio. I have only studied it briefly, but its other features look quite powerful.
Install pyfolio
Installing pyfolio is relatively simple: just follow the instructions on GitHub.
FMZ backtest results displayed by pyfolio
With the introduction out of the way, let's get to the main topic. First, get the backtest capital curve data from the FMZ platform.
In the floating profit and loss chart of the backtest results, click the button next to the full-screen button (shown in the figure above), then select ‘Download CSV’.
The format of the resulting CSV data is as follows (the file name can be changed as needed):
If you want a benchmark to compare the results against, you also need daily K-line data for the traded asset. Without K-line data, the return data can still be analyzed on its own, but with a benchmark the results include several additional indicators, such as Alpha and Beta. The rest of this article assumes that benchmark K-line data is available.
We can obtain K-line data directly from the platform through the FMZ research environment:
# Use the API provided by the FMZ research environment to obtain K-line data covering the same period as the return data
dfh = get_bars('bitfinex.btc_usd', '1d', start=str(startd), end=str(endd))
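The benchmark that pyfolio works with is a series of daily percentage changes rather than raw closing prices. Here is a minimal sketch of that conversion, assuming the K-line data arrives as a DataFrame with a `close` column indexed by date. The prices are made up for illustration, and `closes_to_returns` is a hypothetical helper, not part of FMZ or pyfolio.

```python
import pandas as pd


def closes_to_returns(bars: pd.DataFrame) -> pd.Series:
    """Turn daily closing prices into simple daily returns."""
    # pct_change() computes (close_t - close_{t-1}) / close_{t-1}.
    returns = bars["close"].pct_change()
    # The first day has no previous close, so treat its return as 0.
    return returns.fillna(0.0).round(6)


# Made-up prices, used only to show the shape of the data.
bars = pd.DataFrame(
    {"close": [100.0, 110.0, 99.0, 99.0]},
    index=pd.date_range("2020-01-01", periods=4, freq="D", tz="UTC"),
)
benchmark_rets = closes_to_returns(bars)
print(benchmark_rets)
```

`pct_change()` does in one call what a manual shift, subtract and divide would do, which leaves less room for column-name slips.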
With the data prepared, we can start coding. The acquired data must be processed into the data structure that pyfolio requires and then passed to pyfolio's create_returns_tear_sheet interface, which calculates and outputs the results. We mainly need to pass in three parameters: returns, benchmark_rets=None and live_start_date=None.
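To make the target data structure concrete: `returns` should be a pandas Series of daily, non-cumulative returns indexed by date. The sketch below converts a cumulative profit and loss column into that form. The numbers, the initial capital of 100 and the helper name `pnl_to_returns` are assumptions for illustration only.

```python
import pandas as pd


def pnl_to_returns(cum_pnl: pd.Series, initial_capital: float) -> pd.Series:
    """Convert cumulative profit and loss into daily simple returns."""
    # Account equity at each day's close = starting capital + cumulative P&L.
    equity = initial_capital + cum_pnl
    # Daily return = today's change in equity / yesterday's equity.
    returns = equity.diff() / equity.shift(1)
    # The first row has no previous day, so its return is set to 0.
    return returns.fillna(0.0)


dates = pd.date_range("2020-01-01", periods=4, freq="D", tz="UTC")
cum_pnl = pd.Series([0.0, 10.0, 21.0, 9.9], index=dates)  # made-up values
returns = pnl_to_returns(cum_pnl, initial_capital=100.0)
print(returns)
```

A quick sanity check on such a series is that compounding the returns from the initial capital reproduces the final equity.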
The returns parameter is the return data and is required. benchmark_rets is the benchmark data and is optional. live_start_date is also optional.
live_start_date answers the question: when did your returns start coming from the live market? For example, suppose that in the returns above, live trading began on 2019-12-01 and everything before that came from simulated trading or a backtest. Then we can set live_start_date = '2019-12-01'.
With this parameter set, we can in principle check whether the strategy has been overfitted. If in-sample and out-of-sample performance differ greatly, the strategy is very likely overfitted.
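The in-sample versus out-of-sample comparison can also be done by hand. The following is a rough sketch of the idea, not pyfolio's own code: it splits a return series at the live start date and compares an annualized Sharpe ratio on each side. The data is synthetic, the helper names are hypothetical, and the 252 periods per year and zero risk-free rate are assumptions.

```python
import numpy as np
import pandas as pd


def annualized_sharpe(returns: pd.Series, periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio with a zero risk-free rate."""
    return float(np.sqrt(periods_per_year) * returns.mean() / returns.std(ddof=1))


def split_by_live_date(returns: pd.Series, live_start_date: str):
    """Split returns into backtest (before) and live (on or after) parts."""
    cutoff = pd.Timestamp(live_start_date, tz=returns.index.tz)
    return returns[returns.index < cutoff], returns[returns.index >= cutoff]


# Synthetic daily returns: a strong backtest followed by a flat live period.
rng = np.random.default_rng(0)
dates = pd.date_range("2019-10-01", "2020-01-31", freq="D", tz="UTC")
values = np.where(dates < pd.Timestamp("2019-12-01", tz="UTC"),
                  rng.normal(0.003, 0.01, len(dates)),
                  rng.normal(0.000, 0.01, len(dates)))
returns = pd.Series(values, index=dates)

backtest, live = split_by_live_date(returns, "2019-12-01")
print("backtest Sharpe:", round(annualized_sharpe(backtest), 2))
print("live Sharpe:    ", round(annualized_sharpe(live), 2))
```

A large gap between the two figures is the warning sign described above, although with short live periods the difference can also be plain noise.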
We can implement this analysis function in the FMZ research environment, or we can implement it locally. The following takes the implementation in the FMZ research environment as an example:
https://www.fmz.com/upload/asset/1379deaa35b22ee37de23.ipynb?name=%E5%88%A9%E7%94%A8pyfolio%E5%B7%A5%E5%85%B7%E8%AF%84%E4%BB%B7%E5%9B%9E%E6%B5%8B%E8%B5%84%E9%87%91%E6%9B%B2%E7%BA%BF(%E5%8E%9F%E5%88%9B).ipynb
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# First, create a new Python file locally named "csv to py code.py" and copy the following code into it.
# It generates Python code that embeds the capital curve CSV file downloaded from FMZ.
# Running the new file locally produces the "chart_hex.py" file.
import binascii

# The file name can be customized as needed; this example uses the default file name
filename = 'chart.csv'
with open(filename, 'rb') as f:
    content = f.read()

# csv to py
wFile = open(filename.split('.')[0] + '_hex.py', "w")
wFile.write("hexstr = bytearray.fromhex('" + bytes.decode(binascii.hexlify(content)) + "').decode()\nwFile = open('" + filename + "', 'w')\nwFile.write(hexstr)\nwFile.close()")
wFile.close()
# Open the "chart_hex.py" file generated above, copy its entire contents over the code block below, then run the cells one by one to recreate the chart.csv file
hexstr = bytearray.fromhex('').decode()
wFile = open('chart.csv', 'w')
wFile.write(hexstr)
wFile.close()

!ls -la
cat chart.csv
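The trick in the two cells above is simply a hex round trip: the CSV bytes are turned into plain text that can be pasted into a notebook and then decoded back. Here is a small in-memory sketch of that round trip, with made-up CSV content (the column names are placeholders, not the real FMZ header):

```python
import binascii

# Made-up CSV content standing in for the downloaded file.
csv_bytes = b"DateTime,Value\n2020-01-01 00:00:00,0.5\n"

# Encode: the bytes become a plain hex string that can be pasted as text.
hex_text = binascii.hexlify(csv_bytes).decode("ascii")

# Decode: the same call used in the generated script.
restored = bytearray.fromhex(hex_text).decode()

print(hex_text[:16])
print(restored == csv_bytes.decode())
```

Each byte becomes two hex characters, so the pasted text is twice the size of the original file.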
# Install the pyfolio library in the research environment
!pip3 install --user pyfolio
import pandas as pd
import sys
sys.path.append('/home/quant/.local/lib/python3.6/site-packages')
import pyfolio as pf
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
from fmz import *  # import all FMZ functions

# Read the capital curve data downloaded from the FMZ platform (cumulative profit and loss)
df = pd.read_csv(filepath_or_buffer='chart.csv')
# Convert to date format
df['Date'] = pd.to_datetime(df['DateTime'], format='%Y-%m-%d %H:%M:%S')
# Get start and end time
startd = df.at[0, 'Date']
endd = df.at[df.shape[0] - 1, 'Date']

# Read the daily K-line data of the traded asset and use it as the benchmark return data
# Use the API provided by the FMZ research environment to obtain K-line data covering the same period as the return data
dfh = get_bars('bitfinex.btc_usd', '1d', start=str(startd), end=str(endd))
dfh = dfh[['close']].copy()

# Calculate the daily percentage change from the K-line closing prices
dfh['close_shift'] = dfh['close'].shift(1)
dfh = dfh.bfill()  # "backward fill": fill each missing value with the next non-null value below it
dfh['changeval'] = dfh['close'] - dfh['close_shift']
dfh['change'] = dfh['changeval'] / dfh['close_shift']
# Keep 6 decimal places for the rate of change
dfh = dfh.round({'change': 6})

# Return data processing: FMZ exports cumulative profit and loss, which is converted to a daily rate of return
# Note: the column name used here must match the header in your CSV file
df['return_shift'] = df['Floating Profit and Loss'].shift(1)
df['dayly'] = df['Floating Profit and Loss'] - df['return_shift']
chushizichan = 3  # Initial asset value in the FMZ backtest
df['returns'] = df['dayly'] / (df['return_shift'] + chushizichan)
df = df[['Date', 'Floating Profit and Loss', 'return_shift', 'dayly', 'returns']]
df = df.fillna(value=0.0)
df = df.round({'dayly': 3})  # keep three decimal places
df = df.round({'returns': 6})

# Convert the pd.DataFrame to the pd.Series that pyfolio requires for returns
df['Date'] = pd.to_datetime(df['Date'])
df = df[['Date', 'returns']]
df.set_index('Date', inplace=True)
# Processed return data
returns = df['returns'].tz_localize('UTC')

# Convert the pd.DataFrame to the pd.Series that pyfolio requires for benchmark returns
dfh = dfh[['change']]
dfh = pd.Series(dfh['change'].values, index=dfh.index)
# Processed benchmark data
benchmark_rets = dfh

# The point in time at which live trading begins after the strategy's backtest period
live_start_date = '2020-02-01'

# Call pyfolio's API to calculate and output the capital curve analysis charts
# "returns" is required; the remaining parameters are optional
pf.create_returns_tear_sheet(returns, benchmark_rets=benchmark_rets, live_start_date=live_start_date)
The output analysis result:
Interpretation of results
The output contains a lot of data, so we need to slow down and learn what these indicators mean. I will introduce a few of them here. Once we understand what the indicators mean, we can interpret the state of our trading strategy.
- Annual return
The annualized rate of return is calculated by converting the current rate of return (daily, weekly, monthly, etc.) into a yearly rate. It is a theoretical figure, not a return that has actually been achieved. It should be distinguished from the annual rate of return, which is the actual return from running the strategy for one year.
- Cumulative returns
The easiest concept to understand: the total return of the strategy, i.e. the rate of change in total assets from the start of the strategy to the end.
- Annual volatility
The annualized volatility measures the volatility risk of the investment.
- Sharpe ratio
Describes the excess return the strategy earns per unit of total risk.
- Max drawdown
Describes the largest peak-to-trough loss of the strategy. The smaller the maximum drawdown, the better.
- Omega ratio
Another risk-reward performance indicator. Its biggest advantage over the Sharpe ratio is that, by construction, it takes all statistical moments into account, whereas the Sharpe ratio only considers the first two.
- Sortino ratio
Describes the excess return the strategy earns per unit of downside risk.
- Daily Value-at-Risk
Daily Value at Risk is another very popular risk indicator. In this example, it means that in 95% of cases, holding the position (portfolio) for one more day will not lose more than 1.8%.
- Tail ratio
Take the 95th and 5th percentiles of the daily return distribution and divide the former by the latter, using absolute values. In essence, it shows how many times larger the gains in the right tail are than the losses in the left tail.
- Stability
Stability is simple in practice: it measures how much of the cumulative net value is explained by the passage of time, i.e. the R-squared of a linear regression of the cumulative returns on time. A value close to 1 means the capital curve rises almost along a straight line, while a low value means the curve wanders with no steady trend.
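Because stability is the most abstract of these indicators, a small numerical illustration may help. This is a sketch of the concept, assuming the regression is run on cumulative log returns against a simple time index. The helper `stability` and both sample series are made up for illustration.

```python
import numpy as np


def stability(returns: np.ndarray) -> float:
    """R-squared of a straight-line fit to cumulative log returns."""
    cum_log_returns = np.cumsum(np.log1p(returns))
    time_index = np.arange(len(cum_log_returns))
    # Pearson correlation between time and the curve; squaring it gives
    # R-squared for a one-variable linear regression.
    r = np.corrcoef(time_index, cum_log_returns)[0, 1]
    return float(r ** 2)


steady = np.full(100, 0.001)          # same small gain every day
choppy = np.tile([0.05, -0.048], 50)  # large swings, little net drift

print("steady:", round(stability(steady), 4))
print("choppy:", round(stability(choppy), 4))
```

The steady series scores close to 1 because its curve is a straight line, while the choppy series scores far lower even though both end up near where a slow trend would take them.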
Reference: https://blog.csdn.net/qtlyx/article/details/88724236
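Several of the indicators above can be reproduced in a few lines, which is a useful way to check one's understanding of them. The following is a rough sketch under stated assumptions: daily simple returns, a zero risk-free rate and 252 periods per year. `summarize` and the sample data are hypothetical, the Omega ratio and Value at Risk are left out, and pyfolio's own figures may differ in detail.

```python
import numpy as np
import pandas as pd

PERIODS_PER_YEAR = 252  # assumption: trading-day convention for daily data


def summarize(returns: pd.Series) -> dict:
    """Compute a few tear-sheet style statistics from daily simple returns."""
    equity = (1.0 + returns).cumprod()  # growth of 1 unit of capital
    years = len(returns) / PERIODS_PER_YEAR
    # Running peak, floored at the starting capital of 1.
    peak = equity.cummax().clip(lower=1.0)
    # Drawdown: how far equity sits below its running peak (always <= 0).
    drawdown = equity / peak - 1.0
    # Downside deviation: root mean square of the negative returns only.
    downside = np.sqrt((returns.clip(upper=0.0) ** 2).mean())
    scale = np.sqrt(PERIODS_PER_YEAR)
    return {
        "cumulative_return": equity.iloc[-1] - 1.0,
        "annual_return": equity.iloc[-1] ** (1.0 / years) - 1.0,
        "annual_volatility": returns.std(ddof=1) * scale,
        "sharpe": returns.mean() / returns.std(ddof=1) * scale,
        "sortino": returns.mean() / downside * scale,
        "max_drawdown": drawdown.min(),
        "tail_ratio": abs(returns.quantile(0.95)) / abs(returns.quantile(0.05)),
    }


# Made-up daily returns, only to exercise the function.
returns = pd.Series([0.10, -0.50, 0.20, 0.05, -0.10, 0.02])
stats = summarize(returns)
for name, value in stats.items():
    print(f"{name:>18}: {value: .4f}")
```

With only six data points the annualized figures are meaningless as estimates. The sample is there to show the mechanics, for example that a single 50% loss after a peak produces a maximum drawdown of -0.5.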
Small suggestions
I hope FMZ can add richer capital curve evaluation features and the ability to store historical backtest results, so that backtest results can be displayed more conveniently and professionally and help everyone create better strategies.