Regression Calculator

Enter X and Y arrays, separated by commas

X values: comma-separated numbers
Y values: comma-separated numbers; must have the same count as the X values

What is Linear Regression?

Simple linear regression models the relationship between two variables by fitting a straight line to the data. One variable is the independent variable (X) and the other is the dependent variable (Y).

The goal is to find the line that minimizes the sum of squared residuals, where a residual is the difference between an observed Y value and the Y value the line predicts. This is called the Ordinary Least Squares (OLS) method.

ŷ = b₀ + b₁x

Where:

  • ŷ = predicted Y value
  • b₀ = y-intercept (predicted value of Y when X = 0)
  • b₁ = slope (change in Y for each 1-unit increase in X)
  • x = independent variable value
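
To make the least-squares criterion concrete, here is a minimal Python sketch of the prediction formula and the quantity OLS minimizes. The function names and the data are illustrative assumptions, not part of the calculator.

```python
def predict(b0, b1, x):
    """Return the predicted value y-hat = b0 + b1 * x."""
    return b0 + b1 * x


def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (observed - predicted)^2 over all pairs; OLS minimizes this."""
    return sum((y - predict(b0, b1, x)) ** 2 for x, y in zip(xs, ys))


# Illustrative data (an assumption): points scattered around y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1.1, 2.9, 5.2, 6.8]

close_fit = sum_squared_residuals(1.0, 2.0, xs, ys)  # candidate line y = 1 + 2x
poor_fit = sum_squared_residuals(0.0, 3.0, xs, ys)   # candidate line y = 3x

print(close_fit < poor_fit)  # True: the first line leaves smaller residuals
```

A smaller sum means the candidate line sits closer to the data; the OLS line is the one for which no other choice of b₀ and b₁ gives a smaller sum.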

How to Calculate Slope & Intercept

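b₁ = [nΣxy − ΣxΣy] / [nΣx² − (Σx)²]
b₀ = ȳ − b₁x̄

Here n is the number of (x, y) pairs, and x̄ and ȳ are the means of the X and Y values. The slope is undefined when all X values are identical, because the denominator is then zero.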
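
The two formulas translate almost directly into code. The sketch below is one possible implementation in plain Python; `fit_line` is a hypothetical name and the sample data is made up so that the answer is easy to check by hand.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line through the (x, y) pairs."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Fewer than two distinct X values: the slope is undefined.
        raise ValueError("need at least two distinct x values")

    b1 = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    b0 = sum_y / n - b1 * (sum_x / n)                # intercept = y-bar - b1 * x-bar
    return b0, b1


# Illustrative data lying exactly on y = 1 + 2x.
b0, b1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```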

Interpreting the Results

Output | Symbol | What It Means
Slope | b₁ | For each 1-unit increase in X, Y changes by this amount. Positive = upward trend.
Intercept | b₀ | Predicted value of Y when X = 0. May not always be meaningful in context.
R-squared | R² | Proportion of variance in Y explained by X. 0.85 = 85% of Y's variation is explained by X.
Pearson's r | r | Correlation coefficient (−1 to 1). Values near ±1 indicate strong linear relationships.
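
As a rough illustration of how r and R² relate, the sketch below computes Pearson's r from deviations about the means and squares it. The function name and the data are assumptions for illustration. It also assumes that neither variable is constant, since the denominator would otherwise be zero.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values (neither list constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data with a downward trend.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 1]

r = pearson_r(xs, ys)
r_squared = r ** 2  # in simple linear regression, R² equals r squared

print(round(r, 3), round(r_squared, 3))  # -0.984 0.968
```

The sign of r carries the direction of the trend, which R² discards: here r is negative even though R² is close to 1.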

R² Interpretation Guide:

0.00–0.29: Weak | 0.30–0.49: Moderate | 0.50–0.69: Moderate–Strong | 0.70–0.89: Strong | 0.90–1.00: Very Strong
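
The guide can be written as a small lookup. This is only a sketch: the function name is hypothetical, and because the guide does not say how to treat values that fall between bands (such as 0.295), the code assumes each band runs from its lower bound up to the next band's lower bound.

```python
def describe_r_squared(r_squared):
    """Map an R² value to the guide's label (band edges are an assumption)."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R² must be between 0 and 1")
    if r_squared < 0.30:
        return "Weak"
    if r_squared < 0.50:
        return "Moderate"
    if r_squared < 0.70:
        return "Moderate–Strong"
    if r_squared < 0.90:
        return "Strong"
    return "Very Strong"


print(describe_r_squared(0.85))  # Strong
```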

Example: Advertising vs. Sales

A company tracked weekly advertising spend ($1000s) and resulting sales ($1000s):

Week | Ad Spend (X) | Sales (Y)
1 | 1 | 14
2 | 2 | 17
3 | 3 | 21
4 | 5 | 27
5 | 8 | 36

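Result: ŷ ≈ 11.03 + 3.15x, R² ≈ 0.998, a near-perfect linear relationship. Here n = 5, Σx = 19, Σy = 115, Σxy = 534 and Σx² = 103, so b₁ = (5·534 − 19·115) / (5·103 − 19²) = 485/154 ≈ 3.15 and b₀ = 23 − 3.15 × 3.8 ≈ 11.03. For every additional $1,000 in ad spend, predicted sales increase by approximately $3,150.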
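
The fit can be reproduced from the five pairs in the table. The following sketch uses the deviation-from-mean form of the least-squares formulas, which is algebraically equivalent to the sum form, and then uses the fitted line for a prediction. The variable names and the choice of x = 6 are illustrative assumptions.

```python
# Weekly ad spend and sales from the table, both in $1000s.
ad_spend = [1, 2, 3, 5, 8]
sales = [14, 17, 21, 27, 36]

n = len(ad_spend)
mean_x = sum(ad_spend) / n
mean_y = sum(sales) / n

# Deviation form of the least-squares slope and intercept.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ad_spend, sales))
s_xx = sum((x - mean_x) ** 2 for x in ad_spend)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# R² = 1 - (residual sum of squares / total sum of squares).
ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(ad_spend, sales))
ss_tot = sum((y - mean_y) ** 2 for y in sales)
r_squared = 1 - ss_res / ss_tot

# Predicted sales for a spend of $6,000, which lies inside the observed range.
predicted = intercept + slope * 6

print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r_squared:.3f}")
print(f"predicted sales at x=6: {predicted:.2f}")
```

Predictions are most trustworthy inside the range of observed X values (here 1 to 8); extrapolating far beyond it assumes the linear trend continues.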

FAQ

What's the difference between R and R²?
R (Pearson's r) is the correlation coefficient ranging from −1 to +1. It measures the strength and direction of the linear relationship. R² is simply r squared, and ranges from 0 to 1. R² represents the proportion of variability in Y that is explained by X. For example, r = 0.9 gives R² = 0.81, meaning 81% of Y's variability is explained by X.
What does a negative slope mean?
A negative slope indicates an inverse relationship — as X increases, Y decreases. For example, if you're modeling the relationship between hours of TV watched and GPA, you might find a negative slope.
Do X and Y need to be the same length?
Yes — X and Y arrays must have the same number of values. Each X value is paired with its corresponding Y value. The calculator will alert you if the arrays have different lengths.
What's the minimum dataset size for regression?
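Technically, regression requires at least 2 data points, and this calculator needs at least 2 pairs to function. With exactly 2 points that have different X values, the fitted line passes through both, so the fit looks perfect regardless of the true relationship. For meaningful analysis, use at least 5–10 data points; with very small samples the estimates vary widely from one sample to the next.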
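
The input rules described in this FAQ (comma-separated numbers, equal counts, at least 2 pairs) could be checked along the following lines. This is a sketch of one way to do it, not the calculator's actual code; `parse_numbers`, `parse_pairs` and `min_pairs` are hypothetical names.

```python
def parse_numbers(text):
    """Parse a comma-separated string such as "1, 2, 3.5" into floats."""
    # Empty entries (for example after a trailing comma) are skipped;
    # float() raises ValueError for entries that are not numbers.
    return [float(part) for part in text.split(",") if part.strip()]


def parse_pairs(x_text, y_text, min_pairs=2):
    """Return (xs, ys) after checking the length rules described above."""
    xs = parse_numbers(x_text)
    ys = parse_numbers(y_text)
    if len(xs) != len(ys):
        raise ValueError(f"X has {len(xs)} values but Y has {len(ys)}")
    if len(xs) < min_pairs:
        raise ValueError(f"need at least {min_pairs} pairs, got {len(xs)}")
    return xs, ys


xs, ys = parse_pairs("1, 2, 3, 5, 8", "14, 17, 21, 27, 36")
print(list(zip(xs, ys)))
```

Calling `parse_pairs("1, 2, 3", "4, 5")` raises a `ValueError` that reports the mismatched counts.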