# Correlation Coefficient

Correlation coefficient measures the degree to which two variables move together. Its value ranges between -1 and 1. -1 indicates perfectly negative relationship, 1 shows a perfectly positive relationship and zero means there is no linear relationship between the variables. Correlation doesn’t necessarily mean causation.

Correlation coefficient is a very important number in finance because it helps tell whether there is a relationship between say population growth and GDP growth, crude oil price and stock price of oil and gas companies, a mutual fund and the broad market index, etc. However, correlation coefficient must be used with a caveat: it doesn’t infer causation. Two variables might have a very high correlation, but it might not necessarily mean that one causes the other.

## Formula

The most common measure of correlation is called the Pearson correlation which can be calculated using the following formula:

$$\text{r}=\frac{\text{n} \sum{\text{xy}} - \sum{\text{x}} \sum{\text{y}}}{\sqrt{\left[ \text{n} \sum{\text{x}^\text{2}} - \left( \sum{\text{x}} \right)^\text{2} \right] \times \left[ \text{n} \sum{\text{y}^\text{2}} - \left( \sum{\text{y}} \right)^\text{2} \right]}}$$

If you already know the covariance between two investments, you can find correlation coefficient using the following formula:

$$\text{Correlation Coefficient}\ (\text{r})=\frac{\text{Covariance}(\text{x} \text{,} \text{y})}{\sigma _ \text{x}\sigma _ \text{y}}$$

Where σx and σy represent the standard deviation of variable x and y respectively.

Correlation coefficient can also be calculated using Excel CORREL function. It’s syntax is CORREL(array1, array2). array1 and array2 represent the series of x and y values.

## Example

The following table shows the monthly close data for SPDR S&P Oil & Gas Explore & Prod. (ETF) (XOP) (designated as y variable) and Brent Crude Oil Price (designated as x variable) from 1 January 2014 to 31 December 2017:

Date x y
1/1/2014 109.95 65.75
2/1/2014 108.16 69.69
3/1/2014 108.98 71.83
4/1/2014 105.7 77.61
5/1/2014 108.63 77.04
6/1/2014 109.21 82.28
7/1/2014 110.84 75.29
8/1/2014 103.45 79.05
9/1/2014 101.12 68.83
10/1/2014 94.57 60.87
11/1/2014 84.17 51.08
12/1/2014 70.87 47.86
1/1/2015 55.27 46.18
2/1/2015 47.52 50.81
3/1/2015 61.89 51.66
4/1/2015 55.73 55.09
5/1/2015 64.13 49.53
6/1/2015 64.88 46.66
7/1/2015 62.01 38.35
8/1/2015 52.21 36.00
9/1/2015 49.56 32.84
10/1/2015 47.69 36.61
11/1/2015 49.56 37.13
12/1/2015 44.44 30.22
1/1/2016 37.28 28.49
2/1/2016 34.24 24.60
3/1/2016 36.81 30.35
4/1/2016 38.67 35.74
5/1/2016 48.13 35.52
6/1/2016 49.72 34.81
7/1/2016 50.35 34.25
8/1/2016 42.14 36.79
9/1/2016 45.45 38.46
10/1/2016 49.06 35.35
11/1/2016 48.14 41.93
12/1/2016 53.94 43.18
1/1/2017 56.82 40.08
2/1/2017 56.8 37.86
3/1/2017 56.36 37.44
4/1/2017 52.83 34.95
5/1/2017 51.52 32.57
6/1/2017 50.63 31.92
7/1/2017 47.92 32.52
8/1/2017 51.78 30.16
9/1/2017 52.75 34.09
10/1/2017 57.54 34.28
11/1/2017 60.49 35.72
12/1/2017 63.73 37.18

After working out the necessary intermediate numbers, our Pearson correlation equation looks like this:

$$\text{Correlation Coefficient}\ (\text{r})\\=\frac{\text{48} \times \text{155,384} - \text{3,064} \times \text{2,177}}{\sqrt{\left[ \text{48} \times \text{221,774} - \text{3,065}^\text{2} \right] \times \left[ \text{48} \times \text{110,526} - \text{2,177}^\text{2} \right]}}\\=\text{0.93}$$

We get the same result using Excel CORREL function as illustrated in the attached Excel Worksheet.