Correlation Coefficient Calculator
Calculate the Pearson r correlation coefficient for paired X,Y data. Shows r, r², and interprets the strength and direction of the relationship.
Pearson Correlation Coefficient
The Pearson r measures the strength and direction of the linear relationship between two variables X and Y. It ranges from −1 (perfect negative linear) to +1 (perfect positive linear), with 0 indicating no linear relationship.
The Formula
r = Σ[(xᵢ − x̄)(yᵢ − ȳ)] / √[Σ(xᵢ − x̄)² × Σ(yᵢ − ȳ)²]
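This is the covariance of X and Y divided by the product of their standard deviations. Because the units cancel, r is dimensionless: rescaling or shifting either variable (for example, converting centimeters to inches) leaves r unchanged, and negating one variable only flips its sign.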
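The formula can be sketched directly in Python. This is a minimal illustration, not the calculator's own implementation: the function name `pearson_r` and the temperature/response numbers are hypothetical, chosen only to show that a change of units leaves r unchanged.

```python
import math

def pearson_r(xs, ys):
    """Pearson r computed directly from the deviation-sum formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data (not from the page): temperature vs. a response.
celsius = [10.0, 15.0, 20.0, 25.0, 30.0]
response = [3.1, 4.0, 5.2, 5.9, 7.3]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]  # same temperatures, different unit

r_c = pearson_r(celsius, response)
r_f = pearson_r(fahrenheit, response)
print(r_c, r_f)  # the two values agree up to floating-point rounding
```

The guard against a constant variable matters in practice: with zero variance the denominator is zero and r is undefined.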
Interpreting r²
r² (coefficient of determination) represents the proportion of variance in Y explained by the linear relationship with X. If r = 0.9, then r² = 0.81, meaning 81% of Y's variance is explained by X. The remaining 19% is unexplained.
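One way to see the "variance explained" reading is to fit a least-squares line and compare r² with 1 − SSE/SST. The sketch below reuses the hours and scores data from the worked example on this page; the variable names are illustrative.

```python
import math

hours = [1, 2, 3, 4, 5]
scores = [60, 65, 72, 78, 85]
n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(scores) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
sxx = sum((x - x_bar) ** 2 for x in hours)
sst = sum((y - y_bar) ** 2 for y in scores)  # total variation in Y

r = sxy / math.sqrt(sxx * sst)

# Least-squares line y_hat = intercept + slope * x
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Variation in Y left over after the linear fit
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(hours, scores))
explained = 1 - sse / sst

print(r ** 2, explained)  # the two agree up to floating-point rounding
```

This identity holds for a simple linear regression with an intercept; it is not a general property of arbitrary models.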
Limitations
Pearson r only measures linear relationships. Two variables can have r = 0 but still have a strong non-linear relationship (e.g., a U-shape). Always visualize data in a scatter plot alongside the correlation coefficient.
Worked Example
Hours studied (X): 1, 2, 3, 4, 5 and exam scores (Y): 60, 65, 72, 78, 85.
Means: x̄ = 3, ȳ = 72.
Σ(xᵢ−x̄)(yᵢ−ȳ) = (−2)(−12)+(−1)(−7)+(0)(0)+(1)(6)+(2)(13) = 24+7+0+6+26 = 63.
Σ(xᵢ−x̄)² = 4+1+0+1+4 = 10.
Σ(yᵢ−ȳ)² = 144+49+0+36+169 = 398.
r = 63 / √(10×398) = 63 / √3980 ≈ 63 / 63.09 ≈ 0.999.
This near-perfect positive correlation indicates that, in this sample, study hours and exam scores are strongly linearly related.
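The arithmetic in this example can be checked step by step. The following is a small sketch with illustrative variable names that prints each intermediate sum next to the final r.

```python
import math

hours = [1, 2, 3, 4, 5]
scores = [60, 65, 72, 78, 85]

x_bar = sum(hours) / len(hours)     # 3.0
y_bar = sum(scores) / len(scores)   # 72.0

# Per-point products of deviations: 24, 7, 0, 6, 26
products = [(x - x_bar) * (y - y_bar) for x, y in zip(hours, scores)]
sum_xy = sum(products)                          # 63
sum_xx = sum((x - x_bar) ** 2 for x in hours)   # 10
sum_yy = sum((y - y_bar) ** 2 for y in scores)  # 398

r = sum_xy / math.sqrt(sum_xx * sum_yy)
print(products, sum_xy, sum_xx, sum_yy, round(r, 3))
```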
Correlation Strength Guidelines
| \|r\| Range | Interpretation | Example Context |
|---|---|---|
| 0.90 – 1.00 | Very strong | Calibrated instruments |
| 0.70 – 0.89 | Strong | Height and weight |
| 0.50 – 0.69 | Moderate | Income and spending |
| 0.30 – 0.49 | Weak | Education and happiness |
| 0.00 – 0.29 | Very weak / none | Shoe size and IQ |
Frequently Asked Questions
r = 0 means no linear correlation. The variables may still be related non-linearly. For example, (x,y) pairs (−2,4),(−1,1),(0,0),(1,1),(2,4) have r=0 but a clear quadratic relationship.
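The five pairs above can be checked directly. In this short sketch (names are illustrative), the positive and negative deviation products cancel exactly even though every y equals x².

```python
import math

pairs = [(-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]

x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)

r = sxy / math.sqrt(sxx * syy)
is_exact_quadratic = all(y == x * x for x, y in pairs)

print(r, is_exact_quadratic)  # r is zero, yet y is fully determined by x
```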
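A coarser rule of thumb than the table above uses three bands: |r| < 0.3 is weak, 0.3 to 0.7 is moderate, and |r| > 0.7 is strong. Cutoffs like these are conventions, not fixed rules, and context matters: in physics, r=0.99 may be expected; in social science, r=0.4 might be considered a strong result.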
Pearson r measures linear correlation on raw values; Spearman rho measures monotonic correlation on ranked values (it is Pearson r applied to the ranks). Use Spearman when the data are ordinal, are far from normally distributed or contain outliers, or when the relationship is monotonic but not necessarily linear.
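The difference shows up on data that are monotonic but curved. The sketch below computes Spearman rho as Pearson r on average ranks. The helper names `average_ranks` and `pearson_r` and the cubic example data are assumptions made for illustration, not part of the calculator.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical data: y = x cubed, strictly increasing but not linear
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]

r = pearson_r(xs, ys)
rho = pearson_r(average_ranks(xs), average_ranks(ys))
print(round(r, 3), rho)  # Pearson falls short of 1; Spearman is exactly 1
```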
Correlation does not imply causation: r measures association only. Ice cream sales and drowning deaths are positively correlated because both rise in summer, but neither causes the other; hot weather is the confounding variable.
The regression slope m = r × (sY/sX), where sY and sX are the standard deviations of Y and X. When variables are standardized (z-scores), the slope equals r exactly. A correlation of r=0.8 doesn't mean Y increases by 0.8 units for every 1-unit increase in X — that depends on the scales of both variables.
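The relationship between slope and r can be verified numerically. This sketch reuses the hours and scores data from the worked example and Python's standard `statistics.stdev` (sample standard deviation). The variable names are illustrative.

```python
import math
import statistics

hours = [1, 2, 3, 4, 5]
scores = [60, 65, 72, 78, 85]

x_bar = statistics.mean(hours)
y_bar = statistics.mean(scores)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
sxx = sum((x - x_bar) ** 2 for x in hours)
syy = sum((y - y_bar) ** 2 for y in scores)

r = sxy / math.sqrt(sxx * syy)

# Slope of the least-squares line, computed directly
slope_least_squares = sxy / sxx

# The same slope recovered from r and the two standard deviations
s_x = statistics.stdev(hours)
s_y = statistics.stdev(scores)
slope_from_r = r * s_y / s_x

# After converting both variables to z-scores, the slope equals r
z_x = [(x - x_bar) / s_x for x in hours]
z_y = [(y - y_bar) / s_y for y in scores]
slope_standardized = sum(a * b for a, b in zip(z_x, z_y)) / sum(a * a for a in z_x)

print(slope_least_squares, slope_from_r, slope_standardized, r)
```

Here r is close to 1 while the slope is about 6.3 points per hour, which illustrates that the two quantities answer different questions.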
To test whether r differs significantly from zero, use the t-test for correlation: t = r × √(n−2) / √(1−r²) with df = n−2. For r=0.7 with n=20: t = 0.7 × √18 / √0.51 = 0.7 × 4.243 / 0.714 ≈ 4.16. With df=18, the two-tailed p-value is below 0.001, so the correlation is highly significant. Smaller samples need larger |r| values to reach significance.
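The test statistic is simple to compute by hand or in code. In the sketch below, `correlation_t` is a hypothetical helper name, and the critical values are the standard two-tailed 5% entries from a t table, hard-coded here as an assumption instead of being computed. An exact p-value would need a t-distribution CDF from a statistics library.

```python
import math

def correlation_t(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if n < 3:
        raise ValueError("need at least 3 pairs")
    if not -1 < r < 1:
        raise ValueError("t is undefined for |r| = 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Two-tailed 5% critical values taken from a standard t table
CRITICAL_T_05 = {3: 3.182, 18: 2.101}

t_large = correlation_t(0.7, 20)  # df = 18
t_small = correlation_t(0.7, 5)   # df = 3, same r but far fewer pairs

print(round(t_large, 2), t_large > CRITICAL_T_05[18])
print(round(t_small, 2), t_small > CRITICAL_T_05[3])
```

With the same r = 0.7, the 20-pair sample clears its 5% threshold while the 5-pair sample does not.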