
Risk Management

Downside Risk Measures — Python Implementation

Implementing Semideviation, VaR and CVaR risk estimation strategies in Python

Jul 24, 2020

Risk management is the key to making smart investing decisions that lead to profitable outcomes. When doing technical analysis, investors often focus on an asset's returns and pay little attention to the risk involved, which can lead to unexpected outcomes. The most commonly used risk measure is volatility, usually calculated as the standard deviation of returns. New investors often rely on this method, and if you're one of them, I have some bad news: the standard deviation treats upside and downside deviations alike, and that leads to a rather unintuitive problem. If our returns come in higher than expected, that is hardly a problem; it's a good thing! What we are really concerned about are the returns that come in lower than expected. These are known as downside risks, and in this article I will discuss measures to estimate them using Python. I learned many of these implementation techniques from a Coursera course offered by the EDHEC Risk Institute, which covers these and many other risk management techniques.

A good risk analyst focuses on minimizing the downside risks involved in an investment.

Introduction

The downside risk of an asset is an estimate of a security's potential to suffer a decline in value if market conditions change, or of the amount of loss that could be sustained as a result of that decline. It is used to understand the worst-case scenario for an investment in an asset.

In one of my previous articles, I discussed visualising these downside risks over time using the Maximum Drawdown strategy, with some pretty neat plots. That method helps you see where you lost the most money relative to a previous peak; you can read about it here. However, to compare different assets we also need to quantify the downside volatility with a single number, and that is what I will discuss in this article, using the following measures:

  1. Semideviation
  2. Value at Risk
  3. Conditional Value at Risk

Data and Code Implementation

The data that I will be using for this exercise is the EDHEC Hedge Fund Index data from the EDHEC Institute website. It tracks the monthly returns of 13 different hedge fund strategies from 1997 to 2018. You can find the dataset here. To follow the code implementation, you can refer to this notebook. Start by loading the data and preprocessing it.

# Import libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Load the data
hfi_data = pd.read_csv('../input/python/edhec-hedgefundindices.csv',
                       header=0, index_col=0, parse_dates=True)

# Look at the first few rows
hfi_data.head()
Data Preview

We need to convert the data into percentage returns and fix the date index. I have explained the reasons here but they are pretty self-explanatory.

# Convert to percentages
hfi_data = hfi_data/100

# Fix the index format (monthly periods)
hfi_data.index = hfi_data.index.to_period('M')

Semideviation

Semi-deviation is an alternative to standard deviation or variance: it is the standard deviation of only those values that fall below the mean, rather than of every value. (In the implementation below, following a common convention for returns data, we condition on negative returns, i.e. those below zero, instead of those below the mean.)

Formula for semideviation
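Since the formula was shown as an image in the original, here is a reconstruction of its standard below-mean form:

```latex
\sigma_{\text{semi}} = \sqrt{\frac{1}{N}\sum_{t:\; r_t < \bar{r}} \left(r_t - \bar{r}\right)^2}
```

where the sum runs only over the returns that fall below the mean return.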

Let’s calculate the standard deviation first and save it for comparison later.

# Calculate the standard deviation 
std = hfi_data.std(ddof=0)
# Calculate the standard deviation for returns which have negative values
semi_std = hfi_data[hfi_data<0].std(ddof=0)

After printing the sorted values, you’d see how the list of riskiest assets changes.

Riskiest Assets according to Standard Deviation
Riskiest Assets according to semi deviation

You can also plot the two for comparison.

comparison = pd.concat([std, semi_std], axis=1)
comparison.columns = ["Standard Deviation", "Semi-Deviation"]
comparison.plot.bar(title="Standard Deviation vs Semideviation")
The plot of the two risk measures

You can see that there is a substantial difference between the downside risk measure and the conventional risk measure for these indices. A ranking of the riskiest assets based on downside risk would therefore have looked quite different.

Value at Risk (VaR)

The next measure is one of the most commonly used by investment and commercial banks. It is a statistic that estimates the extreme downside of an investment: the maximum "expected" loss over a given period of time at a particular confidence level. The confidence levels are defined a little differently here. At a 99% confidence level (the "99% VaR"), we are looking at the worst possible outcome after excluding the worst 1% of the losses. It can be understood as keeping the best 99% of the outcomes and discarding the worst 1% for the VaR calculation. The return at that cutoff, i.e. the 1st percentile of the return distribution, is considered the worst possible outcome: it is the maximum loss you would suffer in 99% of cases.

Now, you might argue that leaving out the worst 1% is actually not a good idea, because it might contain useful information, and that we should make it part of our risk measure. You'd be right. Hence, we also use another risk measure, called Beyond Value at Risk or, more popularly, Conditional Value at Risk (CVaR), which I will talk about in the next section.

Code Implementation

Let's look at the different ways of calculating VaR. We will consider three methods:

Method 1: Historic VaR

It is as simple as the name sounds. We use historic data to identify the worst possible loss at a particular confidence level. It’s very simple to implement using the percentile function.

# Historic VaR for each index at the 95% confidence level
np.percentile(hfi_data, 5, axis=0)
Output of the above command

The output looks something like this. These are the 5th-percentile returns, i.e. the 95% VaR for each hedge fund strategy. While this is one way to do it, I don't quite like this output format, so we can change that using a simple trick that will come in handy later.
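The trick was shown as an image in the original; presumably it wraps the percentile call in a function that negates the result and returns a labelled Series. A minimal sketch (the name `var_historic` matches the function called in the comparison section later):

```python
import numpy as np
import pandas as pd

def var_historic(r, level=5):
    """Historic VaR at the given percentile level.

    Reported as a positive number, per the convention discussed below:
    a VaR of 0.04 means "you lose 4%", not "-4%".
    """
    if isinstance(r, pd.DataFrame):
        # Apply column-wise so the result is a labelled Series
        return r.aggregate(var_historic, level=level)
    elif isinstance(r, pd.Series):
        return -np.percentile(r, level)
    else:
        raise TypeError("Expected a pandas Series or DataFrame")
```

Calling `var_historic(hfi_data, level=5)` then gives one VaR value per strategy, labelled by column name.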

There is a negative sign before the percentile because it is common practice to report VaR as a positive number: you lose 4% of your money, not -4%. The output looks a lot cleaner, and you can check that the values match the array above.

VaR values for each hedging strategy at 95% confidence level

Method 2: Parametric VaR or the Gaussian method

The parametric method looks at the price movements of investments over a look-back period and uses probability theory to compute a portfolio's maximum loss. It is one of the most common forms of VaR calculation in practice, popular among hedge fund managers because the only inputs you need are the mean and standard deviation of the portfolio. The method assumes that the returns are normally distributed, which lets us use the calculated standard deviation to compute a standard normal z score and determine our risk level at a given confidence very easily.

z score expression
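The expression shown above is the usual standardisation; together with the sign convention discussed earlier it gives the parametric VaR:

```latex
z = \frac{x - \mu}{\sigma},
\qquad
\text{VaR}_{\alpha} = -\left(\mu + z_{\alpha}\,\sigma\right),
\qquad
z_{\alpha} = \Phi^{-1}(\alpha)
```

where \(\Phi^{-1}\) is the inverse of the standard normal CDF, which is exactly what scipy's percent point function computes.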

We will use the percent point function (ppf) of the norm class from the scipy.stats library. It returns the z score, which tells us how many standard deviations an observation lies from the mean.

from scipy.stats import norm

# Compute the z score assuming the data is Gaussian
# (percent point function = inverse CDF)
z = norm.ppf(0.05)
print(z)
Output: -1.6448536269514729

Let’s compute the gaussian VaR. The negative sign is due to the convention discussed above.

# Compute the gaussian VaR
var_gauss = -(hfi_data.mean() + z*hfi_data.std(ddof=0))
print(var_gauss)
Gaussian VaR

Method 3: Modified Cornish-Fisher VaR

One of the biggest issues with the above method is the assumption of normality. We assume that stock returns are normally distributed when, in reality, this is far from true, and this assumption has led to devastating results in the past, such as the collapse of LTCM.

In reality, return distributions are often skewed and kurtotic. I might write a separate article to explain these deviations from normality, but for now you can refer to this notebook; this dataset itself is a good example.

Start by making skewness and kurtosis functions. You can use the built-in skewness and kurtosis functions but writing our own is fun.

Skewness and Kurtosis calculation functions
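The functions shown above as an image can be reconstructed roughly as follows. This is a sketch using population moments (ddof=0, matching the std calls used elsewhere in the article); note that it returns raw kurtosis (3 for a normal distribution), which matches the k - 3 term in the Cornish-Fisher code later:

```python
import pandas as pd

def skewness(r):
    """Third central moment scaled by the cubed population std (ddof=0)."""
    demeaned_r = r - r.mean()
    sigma_r = r.std(ddof=0)
    return (demeaned_r**3).mean() / sigma_r**3

def kurtosis(r):
    """Fourth central moment scaled by the population std to the fourth
    power. This is raw kurtosis (normal = 3), not excess kurtosis."""
    demeaned_r = r - r.mean()
    sigma_r = r.std(ddof=0)
    return (demeaned_r**4).mean() / sigma_r**4
```

Applied to a DataFrame, these broadcast column-wise and return one value per strategy.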

For normally distributed data, the skewness is 0 and the kurtosis is 3. Here are the values for this data.

# Calculate skewness and kurtosis
k = kurtosis(hfi_data)
s = skewness(hfi_data)
stats = pd.concat([s, k], axis=1)
stats.columns = ["Skewness", "Kurtosis"]
stats
Skewness and Kurtosis for each hedge fund strategy

Enough deviation! (pun intended) Let’s calculate the Cornish-Fisher VaR. We basically need an updated z value at that confidence level.

# Calculate kurtosis and skew
k = kurtosis(hfi_data)
s = skewness(hfi_data)
z = norm.ppf(0.05)
# Update z
z = (z + (z**2 - 1)*s/6 + (z**3 - 3*z)*(k-3)/24 - (2 * z**3 - 5*z)*(s**2)/36)
# Calculate the VaR with modified z
mcf_var = -(hfi_data.mean() + z*hfi_data.std(ddof=0))
mcf_var
Modified VaR values

Comparison of all three methods

It’s important to compare all three methods.

# Compare all three by making a bar plot 
results = [var_gauss, mcf_var, var_historic(hfi_data, level=5)]
comparison=pd.concat(results, axis=1)
comparison.columns = ["Gaussian", "Cornish-Fisher", "Historic"]
comparison
VaR values from all three methods
# Plot the comparison DataFrame
ax = comparison.plot.bar(title="Hedge Fund Indices: VaR")
ax.set_xlabel("Indices")
ax.set_ylabel("Value at Risk")

You can clearly see stark differences between the three VaR estimates, so it is important to understand the assumptions behind each method before relying on one of them.

Conditional Value at Risk (CVaR)

As we discussed earlier, we also need a measure that looks at the worst returns themselves, since they can provide valuable insight into the risk analysis of an investment. CVaR quantifies the amount of tail risk an investment portfolio has. If an investment has shown stability over time, then the value at risk may be sufficient for risk management in a portfolio containing that investment. However, the less stable the investment, the greater the chance that VaR will not give a full picture of the risks, as it is indifferent to anything beyond its own threshold. [1]

CVaR Expression
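The expression shown above can be written as the expected return conditional on being in the tail beyond the VaR, with the same positive-loss sign convention as before:

```latex
\text{CVaR}_{\alpha} = -\,\mathbb{E}\!\left[\, R \;\middle|\; R \le -\text{VaR}_{\alpha} \,\right]
```

By construction, CVaR is always at least as large as the VaR at the same confidence level.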

Code Implementation

Here's how we will build a historic CVaR function: we calculate the mean of all those returns that fall "beyond" the VaR threshold, i.e. the returns that are worse than the VaR.

CVaR code implementation
CVaR values output
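The implementation shown above as an image can be sketched along these lines (a hedged reconstruction, with the hypothetical name `cvar_historic`, following the same conventions as the historic VaR: a percentile threshold and losses reported as positive numbers):

```python
import numpy as np
import pandas as pd

def cvar_historic(r, level=5):
    """Historic CVaR: the mean of the returns that are at or below the
    historic VaR threshold, reported as a positive loss."""
    if isinstance(r, pd.DataFrame):
        # Apply column-wise so the result is a labelled Series
        return r.aggregate(cvar_historic, level=level)
    elif isinstance(r, pd.Series):
        is_beyond = r <= np.percentile(r, level)
        return -r[is_beyond].mean()
    else:
        raise TypeError("Expected a pandas Series or DataFrame")
```

Calling `cvar_historic(hfi_data, level=5)` gives one CVaR value per strategy, each at least as large as the corresponding VaR.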

Conclusion

Risk assessment is one of the most important aspects of asset management, and it pays to know the commonly used methods as well as their pros and cons. The best strategy is to calculate both VaR and CVaR to get a more complete picture of the volatility of your desired asset. I hope this tutorial helped you figure out how to analyse downside risk using these strategies, and I look forward to hearing your thoughts.
