Line of Best Fit: Definition, How It Works, and Calculation

What Is the Line of Best Fit?

Line of best fit refers to a line through a scatter plot of data points that best expresses the relationship between those points. Statisticians typically use the least squares method (sometimes known as ordinary least squares, or OLS) to arrive at the geometric equation for the line, either through manual calculations or by using software.

A straight line will result from a simple linear regression analysis of two or more independent variables. A multiple regression involving several related variables can produce a curved line in some cases.

Key Takeaways

  • A line of best fit is a straight line that minimizes the distance between it and some data.
  • The line of best fit is used to express a relationship in a scatter plot of different data points.
  • It is an output of regression analysis and can be used as a prediction tool for indicators and price movements.
  • In finance, the line of best fit is used to identify trends or correlations in market returns between assets or over time.
Line of Best Fit: A line through a scatter plot of data points that best expresses the relationship between those points.

investopedia / Eliana Rodgers

Understanding the Line of Best Fit

The line of best fit estimates a straight line that minimizes the distance between itself and where observations fall in some data set. The line of best fit is used to show a trend or correlation between the dependent variable and independent variable(s). It can be depicted visually, or as a mathematical expression.

Line of best fit is one of the most important concepts in regression analysis. Regression refers to a quantitative measure of the relationship between one or more independent variables and a resulting dependent variable. Regression is of use to professionals in a wide range of fields from science and public service to financial analysis.

Line of Best Fit
Line of Best Fit.

Line of Best Fit and Regression Analysis

To perform a regression analysis, a statistician collects a set of data points, each including a complete set of dependent and independent variables. For example, the dependent variable could be a firm’s stock price and the independent variables could be the Standard and Poor’s 500 index and the national unemployment rate, assuming that the stock is not listed in the S&P 500. The sample set could be each of these three data sets for the past 20 years.

On a chart, these data points would appear as a scatter plot, a set of points that may or may not appear to be organized along any line. If a linear pattern is apparent, it may be possible to sketch a line of best fit that minimizes the distance of those points from that line. If no organizing axis is visually apparent, regression analysis can generate a line based on the least squares method. This method builds the line which minimizes the squared distance of each point from the line of best fit.

To determine the formula for this line, the statistician enters these three results for the past 20 years into a regression software application. The software produces a linear formula that expresses the causal relationship between the S&P 500, the unemployment rate, and the stock price of the company in question. This equation is the formula for the line of best fit. It is a predictive tool, providing analysts and traders with a mechanism to project the firm’s future stock price based on those two independent variables.

How to Calculate the Line of Best Fit

A regression with two independent variables such as the example discussed above will produce a formula with this basic structure:

y= c + b1(x1) + b2(x2)

In this equation, y is the dependent variable, c is a constant, b1 is the first regression coefficient and x1 is the first independent variable. The second coefficient and second independent variable are b2 and x2, respectively. Drawing from the above example, the stock price would be y, the S&P 500 would be x1 and the unemployment rate would be x2. The coefficient of each independent variable represents the degree of change in y for each additional unit in that variable.

If the S&P 500 increases by one, the resulting y or share price will go up by the amount of the coefficient. The same is true for the second independent variable, the unemployment rate. In a simple regression with one independent variable, that coefficient is the slope of the line of best fit. In this example or any regression with two independent variables, the slope is a mix of the two coefficients. The constant c is the y-intercept of the line of best fit.

How Do You Find the Line of Best Fit?

There are several approaches to estimating a line of best fit to some data. The simplest, and crudest, involves visually estimating such a line on a scatter plot and drawing it in to your best ability.

The more precise method involves the least squares method. This is a statistical procedure to find the best fit for a set of data points by minimizing the sum of the offsets or residuals of points from the plotted curve. This is the primary technique used in regression analysis.

Is a Line of Best Fit Always Straight?

By definition a line is always straight, so a best fit line is linear. However, a curve may also be used to describe the best fit in a set of data. Indeed, a best fit curve may be squared (x2), cubic (x3), quadratic (x4), logarithmic (ln), a square root (√), or anything else that can be described mathematically with an equation. Note, however, that simpler explanations of fit are often preferred.

How Is a Line of Best Fit Used in Finance?

For financial analysts, the method of estimating a line of best fit can help to quantify the relationship between two or more variables—such as a stock’s share price and its earnings per share (EPS). By performing this type of analysis investors often try to predict the future behavior of stock prices or other factors by extrapolating that line out in time.

The Bottom Line

A line of best fit estimates the one line that minimizes the distance between it and observed data. Estimating a line of best fit is a key component of regression analysis in statistics in order to infer the relationships between some dependent variable and one or more explanatory variables. In finance, the line of best fit is utilized in this way to conduct econometric studies and in certain tools used in technical analysis.

Take the Next Step to Invest
×
The offers that appear in this table are from partnerships from which Investopedia receives compensation. This compensation may impact how and where listings appear. Investopedia does not include all offers available in the marketplace.
Take the Next Step to Invest
×
The offers that appear in this table are from partnerships from which Investopedia receives compensation. This compensation may impact how and where listings appear. Investopedia does not include all offers available in the marketplace.