Correlation coefficients indicate the strength of the linear relationship between two variables, x and y. A linear correlation coefficient that is greater than zero indicates a positive relationship. A value that is less than zero signifies a negative relationship. Finally, a value of zero indicates no linear relationship between the two variables.
This article explains the significance of linear correlation coefficients for investors, how to calculate covariance for stocks, and how investors can use correlation to predict the market.
Key Takeaways:
- Correlation coefficients are used to measure the strength of the linear relationship between two variables.
- A correlation coefficient greater than zero indicates a positive relationship while a value less than zero signifies a negative relationship.
- A value close to zero indicates a weak relationship between the two variables being compared.
- A negative correlation, or inverse correlation, is a key concept in the creation of diversified portfolios that can better withstand portfolio volatility.
- Calculating the correlation coefficient is time-consuming, so data is often plugged into a calculator, computer, or statistics program to find the coefficient.
Understanding Correlation
The correlation coefficient (ρ) is a measure that determines the degree to which the movement of two different variables is associated. The most common correlation coefficient, generated by the Pearson product-moment correlation, is used to measure the linear relationship between two variables. However, in a non-linear relationship, this correlation coefficient may not always be a suitable measure of dependence.
The possible range of values for the correlation coefficient is -1.0 to 1.0. In other words, the values cannot exceed 1.0 or be less than -1.0. A correlation of -1.0 indicates a perfect negative correlation and a correlation of 1.0 indicates a perfect positive correlation. If the correlation coefficient is greater than zero, it is a positive relationship. Conversely, if the value is less than zero, it is a negative relationship. A value of zero indicates that there is no linear relationship between the two variables.
When interpreting correlation, it's important to remember that just because two variables are correlated, it does not mean that one causes the other.
Correlation and the Financial Markets
In the financial markets, the correlation coefficient is used to measure the correlation between two securities. For example, when two stocks move in the same direction, the correlation coefficient is positive. Conversely, when two stocks move in opposite directions, the correlation coefficient is negative.
If the correlation coefficient of two variables is zero, there is no linear relationship between the variables. However, this is only for a linear relationship. Two variables can have a strong relationship but a weak correlation coefficient if the relationship between them is nonlinear. When the value of ρ is close to zero, generally between -0.1 and +0.1, the variables are said to have no linear relationship (or a very weak linear relationship).
For example, suppose that the prices of coffee and computers are observed and found to have a correlation of +0.0008. This means that there is only a very weak linear relationship, if any, between the two prices.
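The point that two variables can be strongly related yet show a correlation near zero can be illustrated with a short sketch. The data here are made up for illustration: y is set to x squared over a range symmetric around zero, so y is fully determined by x but the relationship is not linear. The helper name pearson is an assumption, not part of any library.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance over the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Illustrative data only: x values symmetric around zero
xs = [-3, -2, -1, 0, 1, 2, 3]

# y depends entirely on x, but the dependence is quadratic, not linear
r_nonlinear = pearson(xs, [x ** 2 for x in xs])

# For comparison, an exactly linear relationship y = 2x + 1
r_linear = pearson(xs, [2 * x + 1 for x in xs])

print(f"quadratic relationship: r = {r_nonlinear:.2f}")
print(f"linear relationship:    r = {r_linear:.2f}")
```

In this sketch the quadratic case yields a coefficient of zero even though knowing x tells you y exactly, which is why a low coefficient should be read as "no linear relationship" rather than "no relationship."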
Calculating ρ
The covariance of the two variables in question must be calculated before the correlation can be determined. Next, each variable's standard deviation is required. The correlation coefficient is determined by dividing the covariance by the product of the two variables' standard deviations.
Standard deviation is a measure of the dispersion of data from its average. Covariance is a measure of how two variables change together. However, its magnitude is unbounded, so it is difficult to interpret. The normalized version of the statistic is calculated by dividing covariance by the product of the two standard deviations. This is the correlation coefficient.
$$\text{Correlation} = \rho = \frac{\text{cov}(X, Y)}{\sigma_X \sigma_Y}$$
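As a minimal sketch of this calculation, the following Python computes the covariance and the two standard deviations and then divides. The two return series are hypothetical numbers chosen for illustration, and the function names are assumptions. The population form (dividing by n) is used throughout; using the sample form (dividing by n - 1) for both covariance and variance would give the same coefficient, because the factor cancels in the ratio.

```python
import math

# Hypothetical monthly returns (in percent) for two securities; illustrative only
returns_x = [1.0, 2.0, 3.0, 4.0, 5.0]
returns_y = [2.0, 1.0, 4.0, 3.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average product of deviations from the means."""
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

def std_dev(values):
    """Population standard deviation: square root of the variance."""
    return math.sqrt(covariance(values, values))

cov_xy = covariance(returns_x, returns_y)
rho = cov_xy / (std_dev(returns_x) * std_dev(returns_y))

print(f"covariance = {cov_xy:.2f}, correlation = {rho:.2f}")
```

The covariance alone depends on the units of the data, while the coefficient obtained after dividing always falls between -1 and +1.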
Positive Correlation
A positive correlation—when the correlation coefficient is greater than 0—signifies that both variables tend to move in the same direction. When ρ is +1, it signifies that the two variables being compared have a perfect positive relationship; when one variable moves higher or lower, the other variable moves in the same direction with the same magnitude.
The closer the value of ρ is to +1, the stronger the linear relationship. For example, suppose the value of oil prices is directly related to the prices of airplane tickets, with a correlation coefficient of +0.95. The relationship between oil prices and airfares has a very strong positive correlation since the value is close to +1. So, if the price of oil decreases, airfares also decrease, and if the price of oil increases, so do the prices of airplane tickets.
In the chart below, we compare one of the largest U.S. banks, JPMorgan Chase & Co. (JPM), with the Financial Select SPDR Exchange Traded Fund (ETF) (XLF). As you can imagine, JPMorgan Chase & Co. should have a positive correlation to the banking industry as a whole. From Oct. 2022 to Oct. 2023, we can see the correlation coefficient was +0.34, which signals a positive correlation, as expected; however, it is a weak correlation, due to JPM's approximate 13% increase in the past year and XLF's approximate 2.8% decrease.
Understanding the correlation between two stocks (or a single stock) and their industry can help investors gauge how the stock is trading relative to its peers. All types of securities, including bonds, sectors, and ETFs, can be compared with the correlation coefficient.
Negative Correlation
A negative (inverse) correlation occurs when the correlation coefficient is less than 0. This indicates that the two variables tend to move in opposite directions. In short, any reading between 0 and -1 means that the two securities tend to move in opposite directions. When ρ is -1, the two variables are said to be perfectly negatively correlated.
When the correlation is perfectly negative, if one variable increases, the other variable decreases with the same magnitude (and vice versa). However, the degree to which two securities are negatively correlated might vary over time (and they are almost never exactly correlated all the time).
Examples of Negative Correlation
For example, suppose a study is conducted to assess the relationship between the outside temperature and heating bills. The study concludes that there is a negative correlation between the prices of heating bills and the outdoor temperature. The correlation coefficient is calculated to be -0.96. This strong negative correlation signifies that as the temperature decreases outside, the prices of heating bills increase (and vice versa).
When it comes to investing, a negative correlation does not necessarily mean that the securities should be avoided. The correlation coefficient can help investors diversify their portfolios by including a mix of investments that have a negative, or low, correlation to the stock market. In short, when reducing volatility risk in a portfolio, sometimes opposites do attract.
For example, assume you have a $100,000 balanced portfolio that is invested 60% in stocks and 40% in bonds. In a year of strong economic performance, the stock component of your portfolio might generate a return of 12% while the bond component may return -2% because interest rates are rising (which means that bond prices are falling).
Thus, the overall return on your portfolio would be 6.4%, or (12% x 0.6) + (-2% x 0.4). The following year, as the economy slows markedly and interest rates are lowered, your stock portfolio might generate -5% while your bond portfolio may return 8%, giving you an overall portfolio return of 0.2%, or (-5% x 0.6) + (8% x 0.4).
What if, instead of a balanced portfolio, your portfolio were 100% equities? Using the same return assumptions, your all-equity portfolio would have a return of 12% in the first year and -5% in the second year. These figures are clearly more volatile than the balanced portfolio's returns of 6.4% and 0.2%.
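The arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the weighted-average calculation using the return assumptions stated above; the function name and the "spread" measure (best year minus worst year) are illustrative choices, not a formal volatility statistic.

```python
def portfolio_return(weights, returns):
    """Weighted average of component returns (weights are assumed to sum to 1)."""
    return sum(w * r for w, r in zip(weights, returns))

weights = [0.60, 0.40]    # 60% stocks, 40% bonds
year_1 = [0.12, -0.02]    # stocks +12%, bonds -2%
year_2 = [-0.05, 0.08]    # stocks -5%, bonds +8%

balanced = [portfolio_return(weights, year_1), portfolio_return(weights, year_2)]
all_equity = [year_1[0], year_2[0]]

# Gap between the better and worse year, as a crude gauge of how much returns swing
balanced_spread = max(balanced) - min(balanced)
equity_spread = max(all_equity) - min(all_equity)

print(f"balanced:   {balanced[0]:.1%} then {balanced[1]:.1%} (spread {balanced_spread:.1%})")
print(f"all-equity: {all_equity[0]:.1%} then {all_equity[1]:.1%} (spread {equity_spread:.1%})")
```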
Linear Correlation Coefficient
The linear correlation coefficient is a number calculated from given data that measures the strength of the linear relationship between two variables: x and y. The sign of the linear correlation coefficient indicates the direction of the linear relationship between x and y. When r (the correlation coefficient) is near 1 or −1, the linear relationship is strong; when it is near 0, the linear relationship is weak.
Even for small datasets, the computations for the linear correlation coefficient can be too long to do manually. Thus, data is often plugged into a calculator or, more likely, a computer or statistics program to find the coefficient.
The Pearson Coefficient
Both the Pearson coefficient calculation and basic linear regression are ways to determine how statistical variables are linearly related. However, the two methods do differ. The Pearson coefficient is a measure of the strength and direction of the linear association between two variables with no assumption of causality.
The Pearson coefficient shows correlation, not causation. Pearson coefficients range from +1 to -1, with +1 representing a perfect positive correlation, -1 representing a perfect negative correlation, and 0 representing no linear relationship.
Simple linear regression describes the linear relationship between a response variable (denoted by y) and an explanatory variable (denoted by x) using a statistical model. Statistical models are used to make predictions.
Simplify linear regression by calculating correlation with software such as Excel.
In finance, for example, correlation is used in several analyses, including the calculation of portfolio standard deviation. Because it is so time-consuming, correlation is best calculated using software like Excel.
How to Calculate the Correlation Coefficient
Correlation combines several important and related statistical concepts, namely, covariance, variance, and standard deviation. Variance is the dispersion of a variable around the mean, and standard deviation is the square root of variance.
The formula is:
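$$r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
The computation is too long to do by hand for most data sets, so software such as Excel or a statistics program is typically used to calculate the coefficient.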
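For readers who want to see the sums spelled out, here is a minimal Python sketch of the sum-based formula: the numerator is n times the sum of xy minus the product of the two sums, and the denominator is the square root of the product of the two bracketed terms. The data set is hypothetical and the function name is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from raw sums, following the computational formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Hypothetical paired observations; illustrative only
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
print(f"r = {r:.2f}")
```

This form gives the same result as dividing the covariance by the product of the standard deviations; it simply avoids computing the means first.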
Finding Correlation Using Excel
There are several methods to calculate correlation in Excel. The simplest is to place two data sets side by side and use the built-in correlation function, CORREL, which takes the two data ranges as its arguments: =CORREL(array1, array2).
If you want to create a correlation matrix across a range of data sets, Excel has a Data Analysis plugin that is found on the Data tab, under Analyze.
Select the table of returns. In this case, our columns are titled, so we want to check the box "Labels in first row," so Excel knows to treat these as titles. Then you can choose to output on the same sheet or on a new sheet.
Once you hit enter, the data is automatically created. You can add some text and conditional formatting to clean up the result.
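Outside of Excel, the same kind of correlation matrix can be sketched in a few lines of Python. The version below uses only the standard library; the three return series and their labels are hypothetical, and the pearson helper is an illustrative name rather than a library function.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical periodic returns, one list per column of the table
returns = {
    "Stock A": [0.02, -0.01, 0.03, 0.01, -0.02],
    "Stock B": [0.01, -0.02, 0.04, 0.00, -0.01],
    "Bond Fund": [-0.01, 0.01, -0.02, 0.00, 0.01],
}

labels = list(returns)

# matrix[a][b] holds the correlation between series a and series b
matrix = {
    a: {b: pearson(returns[a], returns[b]) for b in labels}
    for a in labels
}

# Print the matrix with one row per series
print(" " * 10 + "  ".join(label.rjust(9) for label in labels))
for a in labels:
    print(a.ljust(10) + "  ".join(f"{matrix[a][b]:+9.2f}" for b in labels))
```

Each series has a correlation of 1 with itself, so the diagonal is all ones, and the matrix is symmetric because the correlation of A with B equals that of B with A.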
Finding Correlation on a Graphing Calculator
A graphing calculator, such as a TI-84, can also be used to calculate the correlation coefficient. The following instructions are provided by Statology.
Step 1: Turn on Diagnostics
You will only need to do this step once on your calculator. After that, you can always start at step 2 below. If you don’t do this, r (the correlation coefficient) will not show up when you run the linear regression function.
Press [2nd] and then [0] to enter your calculator’s catalog. Scroll until you see “diagnosticsOn”.
Press enter until the calculator screen says “Done”.
This is important to repeat: You never have to do this again unless you reset your calculator.
Step 2: Enter Data
Enter your data into the calculator by pressing [STAT] and then selecting 1:Edit. To make things easier, you should enter all of your “x data” into L1 and all of your “y data” into L2.
Step 3: Calculate!
Once you have your data in, you will now go to [STAT] and then the CALC menu up top. Finally, select 4:LinReg(ax+b) and press enter.
That’s it! You’re done! Now you can simply read off the correlation coefficient right from the screen (it’s r). Remember, if r doesn’t show on your calculator, then diagnostics need to be turned on. This is also the same place on the calculator where you will find the linear regression equation and the coefficient of determination.
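To see how the quantities on that screen relate to one another, the sketch below computes a least-squares line y = ax + b together with r and the coefficient of determination (r squared) for a small hypothetical data set. It is not a reproduction of the calculator's internal routine, only the standard simple linear regression formulas; the function and variable names are assumptions.

```python
import math

def linear_regression(xs, ys):
    """Least-squares fit of y = a*x + b, plus r and r squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)

    a = s_xy / s_xx                    # slope
    b = mean_y - a * mean_x            # intercept
    r = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient
    return a, b, r, r ** 2

# Hypothetical data, as might be entered into lists L1 (x) and L2 (y)
l1 = [1, 2, 3, 4, 5]
l2 = [2, 1, 4, 3, 5]

a, b, r, r_squared = linear_regression(l1, l2)
print(f"y = {a:.2f}x + {b:.2f}, r = {r:.2f}, r^2 = {r_squared:.2f}")
```

In simple linear regression the coefficient of determination is the square of r, and the slope always has the same sign as r.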
What Is the Linear Correlation Coefficient?
The linear correlation coefficient is a number calculated from given data that measures the strength of the linear relationship between two variables.
What Is Meant by Linear Correlation?
The correlation coefficient is a value between -1 and +1. A correlation coefficient of +1 indicates a perfect positive correlation. As variable x increases, variable y increases. As variable x decreases, variable y decreases. A correlation coefficient of -1 indicates a perfect negative correlation. As variable x increases, variable y decreases. As variable x decreases, variable y increases.
What Is Considered a Strong Correlation Coefficient?
Generally, the closer a correlation coefficient is to 1.0 (or -1.0) the stronger the relationship between the two variables is said to be. While there is no clear boundary to what makes a "strong" correlation, a coefficient above 0.75 (or below -0.75) is considered a high degree of correlation, while one between -0.3 and 0.3 is a sign of weak or no correlation. In experimental science, researchers will sometimes repeat the same study to see if a high degree of correlation can be reproduced.
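These rules of thumb can be written down as a small helper. The sketch below uses the 0.75 and 0.3 cutoffs mentioned above; the label "moderate" for values in between is an assumption added here for completeness, and the function name is hypothetical. As noted, there is no universally agreed boundary, so the cutoffs should be treated as adjustable.

```python
def describe_strength(r, strong_cutoff=0.75, weak_cutoff=0.3):
    """Label a correlation coefficient using rule-of-thumb cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    magnitude = abs(r)  # strength depends on distance from zero, not on sign
    if magnitude > strong_cutoff:
        return "strong"
    if magnitude <= weak_cutoff:
        return "weak or none"
    return "moderate"  # assumed label for the in-between range

# Coefficients quoted in the examples in this article
for label, r in [
    ("oil prices vs. airfares", 0.95),
    ("temperature vs. heating bills", -0.96),
    ("JPM vs. XLF", 0.34),
    ("coffee vs. computers", 0.0008),
]:
    print(f"{label}: r = {r:+.4f} -> {describe_strength(r)}")
```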
The Bottom Line
The linear correlation coefficient can be helpful in determining the relationship between an investment and the overall market or other securities. It is often used to predict stock market returns. This statistical measurement is useful in many ways, particularly in the finance industry.
For example, it can be helpful in determining how well a mutual fund is behaving compared to its benchmark index, or it can be used to determine how a mutual fund behaves in relation to another fund or asset class. By adding a low, or negatively correlated, mutual fund to an existing portfolio, diversification benefits are gained.