I just finished writing my latest book, Algorithmic Trading with Python. While writing the chapter on performance metrics, I was consistently surprised by the simplicity of the pandas code. If you, as a developer, resolve to work only with datetime-indexed pd.Series objects, the resulting code is really clean and easy.
For those unfamiliar with pandas, the term datetime-indexed means that each floating-point value in the series is labeled by a timestamp in an ordered pd.DatetimeIndex, whose elements are pd.Timestamp objects. These timestamps effectively become the row labels of any pd.Series or pd.DataFrame you end up working with.
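As a small illustration of what a datetime index looks like in practice, the sketch below builds a tiny series and selects a date range by label. The name `example` and its five values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical five-day series; the values are made up for illustration
example = pd.Series(
    [100.0, 101.5, 99.8, 102.3, 103.1],
    index=pd.date_range(start="2010-01-01", periods=5, freq="D"),
)

# The index is a DatetimeIndex whose elements are pd.Timestamp objects
print(type(example.index).__name__)
print(type(example.index[0]).__name__)

# Date strings select by label, and label slices include both endpoints
window = example.loc["2010-01-02":"2010-01-04"]
print(window)
```

Because the index carries the dates, the same series can be sliced, shifted, and differenced without tracking integer positions by hand.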
If you want some simulated data to work with for this article, try the following.
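import datetime
from datetime import timedelta

import numpy as np
import pandas as pd

# Ten years of consecutive calendar days starting on 2010-01-01
start_date = datetime.date(2010, 1, 1)
date_index = [start_date + timedelta(days=i) for i in range(3650)]

# Build a price path from small, normally distributed daily returns
price = initial_price = 100
prices = []
for i in range(3650):
    price *= (1 + np.random.normal(loc=0.0001, scale=0.005))
    prices.append(price)

# Convert the dates to a DatetimeIndex so the series is datetime-indexed
series = pd.Series(prices, index=pd.to_datetime(date_index))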
Calculating CAGR
CAGR (compound annual growth rate) is the constant annual rate of return that, compounded over the specified time frame, would produce the observed total return.
def calculate_percent_return(series: pd.Series):
    # Total return from the first value to the last value
    return series.iloc[-1] / series.iloc[0] - 1

def get_years_past(series: pd.Series):
    # Elapsed time between the first and last index entries, in years
    start_date = series.index[0]
    end_date = series.index[-1]
    return (end_date - start_date).days / 365.25

def calculate_cagr(series: pd.Series):
    # Annualize the total growth factor over the elapsed years
    start_price = series.iloc[0]
    end_price = series.iloc[-1]
    value_factor = end_price / start_price
    year_past = get_years_past(series)
    return (value_factor ** (1 / year_past)) - 1

print(calculate_cagr(series))
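As a quick sanity check on the CAGR arithmetic, consider a hypothetical two-point series that doubles over ten years. Doubling in ten years corresponds to roughly 7.2% a year, and compounding the computed rate over the elapsed years should recover the final value. The name `toy` and its values are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical two-point series: 100 grows to 200 over ten years
toy = pd.Series(
    [100.0, 200.0],
    index=pd.to_datetime(["2010-01-01", "2020-01-01"]),
)

# Same arithmetic as the CAGR calculation: elapsed years, then the
# annualized growth factor
years = (toy.index[-1] - toy.index[0]).days / 365.25
cagr = (toy.iloc[-1] / toy.iloc[0]) ** (1 / years) - 1

# Compounding the rate over the elapsed years should give back the end value
recovered = toy.iloc[0] * (1 + cagr) ** years

print(round(years, 4))
print(round(cagr, 4))
print(round(recovered, 4))
```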
Calculating Annualized Volatility
Volatility in finance is typically assumed to be the annualized standard deviation of log returns. It is computed as follows.
def calculate_log_return_series(series: pd.Series):
    # Log return between consecutive observations; the first entry is NaN
    shifted_series = series.shift(1, axis=0)
    return pd.Series(np.log(series / shifted_series))

def calculate_annualized_volatility(return_series: pd.Series):
    # Scale the per-period standard deviation by sqrt(periods per year)
    years_past = get_years_past(return_series)
    entries_per_year = return_series.shape[0] / years_past
    return return_series.std() * np.sqrt(entries_per_year)

return_series = calculate_log_return_series(series)
print(calculate_annualized_volatility(return_series))
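One way to check the annualization is to simulate data whose daily standard deviation is known and see whether the estimate lands near the value that scale implies. The sketch below is self-contained and assumes calendar-day data, like the simulated series in this article. The seed and variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is repeatable (the seed is arbitrary)
rng = np.random.default_rng(0)

# Simulate daily simple returns with a known standard deviation of 0.005
n = 3650
daily_returns = rng.normal(loc=0.0001, scale=0.005, size=n)
prices = pd.Series(
    100 * np.cumprod(1 + daily_returns),
    index=pd.date_range(start="2010-01-01", periods=n, freq="D"),
)

# Log returns, dropping the leading NaN produced by the shift
log_returns = np.log(prices / prices.shift(1)).dropna()

# Annualize using the observed number of entries per year
years = (prices.index[-1] - prices.index[0]).days / 365.25
entries_per_year = len(prices) / years
estimated = log_returns.std() * np.sqrt(entries_per_year)

# What the known daily scale implies for calendar-day data
expected = 0.005 * np.sqrt(365.25)

print(estimated, expected)
```

Note that the scaling factor is inferred from the data's own frequency. With exchange data that only has entries on trading days, the same calculation would find roughly 252 entries per year instead of about 365.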
Calculating the MACD Oscillator
The MACD oscillator is a popular indicator based on the difference between two moving averages of different lengths.
def calculate_simple_moving_average(series: pd.Series, n: int=20):
    # Mean of the trailing n observations; the first n - 1 entries are NaN
    return series.rolling(n).mean()

def calculate_macd_oscillator(series: pd.Series, n1: int=5, n2: int=34):
    # Short moving average minus long moving average
    return calculate_simple_moving_average(series, n1) - \
        calculate_simple_moving_average(series, n2)

print(calculate_macd_oscillator(series))
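Two properties of this oscillator are easy to see on a toy input. The sketch below uses a hypothetical steadily rising series, named `ramp` here for illustration. The output is NaN until the longer window fills, and in a steady uptrend the short average sits above the long one, so the oscillator is positive (a constant 14.5 on this particular ramp).

```python
import numpy as np
import pandas as pd

# Hypothetical steadily rising series: 1, 2, ..., 60
ramp = pd.Series(
    np.arange(1.0, 61.0),
    index=pd.date_range(start="2010-01-01", periods=60, freq="D"),
)

# Difference of a 5-period and a 34-period simple moving average
oscillator = ramp.rolling(5).mean() - ramp.rolling(34).mean()

# Entries are NaN until the 34-period window has enough observations
warmup = int(oscillator.isna().sum())
print(warmup)

# After the warm-up, the short average leads the long one in an uptrend
valid = oscillator.dropna()
print(valid.min(), valid.max())
```

In practice this means the first `n2 - 1` values should be dropped or ignored before the oscillator is used in any downstream calculation.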
Calculating Bollinger Bands
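The Bollinger Bands are another popular indicator that involves computing an upper, middle, and lower band. The middle band is a simple moving average, and the upper and lower bands sit two rolling standard deviations above and below it.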
def calculate_simple_moving_sample_stdev(series: pd.Series, n: int=20):
    # Sample standard deviation of the trailing n observations
    return series.rolling(n).std()

def calculate_bollinger_bands(series: pd.Series, n: int=20):
    sma = calculate_simple_moving_average(series, n)
    stdev = calculate_simple_moving_sample_stdev(series, n)
    # The bands sit two standard deviations below and above the moving average
    lower = sma - 2 * stdev
    middle = sma
    upper = sma + 2 * stdev
    return lower, middle, upper

print(calculate_bollinger_bands(series))
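Since all three bands share the price series' datetime index, they line up naturally as columns of one pd.DataFrame. The following self-contained sketch uses a seeded random walk rather than real prices, and the `outside` column is a hypothetical addition for illustration. It collects the bands next to the price and flags the days on which the price falls outside them.

```python
import numpy as np
import pandas as pd

# Seeded random walk so the run is repeatable (seed and scale are arbitrary)
rng = np.random.default_rng(1)
n = 250
prices = pd.Series(
    100 + np.cumsum(rng.normal(loc=0.0, scale=1.0, size=n)),
    index=pd.date_range(start="2010-01-01", periods=n, freq="D"),
)

# 20-period bands: moving average plus and minus two rolling sample
# standard deviations
window = 20
middle = prices.rolling(window).mean()
stdev = prices.rolling(window).std()
lower = middle - 2 * stdev
upper = middle + 2 * stdev

# Flag days where the price is outside the bands; comparisons against the
# NaN warm-up values evaluate to False
outside = (prices > upper) | (prices < lower)

# Align everything on the shared datetime index and drop the warm-up rows
bands = pd.DataFrame({
    "price": prices,
    "lower": lower,
    "middle": middle,
    "upper": upper,
    "outside": outside,
}).dropna()

print(bands.tail())
print(bands["outside"].mean())
```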
I hope this post provides some inspiration. In recent years, I have been impressed by how pandas increasingly caters directly to quants and financial analysts. If you want more detail, check out my latest book, Algorithmic Trading with Python.