Correlation Calculator
Calculate Pearson correlation coefficient, covariance, and R-squared between two variables.
Data Pairs
| # | X | Y |
|---|---|---|
| 1 | 1 | 2.3 |
| 2 | 2 | 4.1 |
| 3 | 3 | 5.8 |
| 4 | 4 | 8.2 |
| 5 | 5 | 10.1 |
| 6 | 6 | 11.9 |
| 7 | 7 | 14.2 |
| 8 | 8 | 16 |
| 9 | 9 | 18.1 |
| 10 | 10 | 20 |
What Is Correlation?
Correlation measures the strength and direction of the linear relationship between two variables. It's one of the most important concepts in statistics, used everywhere from scientific research to finance to machine learning. The most common measure is the Pearson correlation coefficient (r), which ranges from -1 to +1.
| Correlation (r) | Strength | Interpretation | Example |
|---|---|---|---|
| r = +1 | Perfect positive | As X increases, Y increases proportionally | Celsius and Fahrenheit |
| 0.7 ≤ r < 1 | Strong positive | Clear upward trend | Height and weight |
| 0.3 ≤ r < 0.7 | Moderate positive | Noticeable upward trend | Study time and grades |
| 0 < r < 0.3 | Weak positive | Slight upward tendency | Shoe size and IQ |
| r = 0 | No correlation | No linear relationship | Coin flips and die rolls |
| -0.3 < r < 0 | Weak negative | Slight downward tendency | Various |
| -0.7 < r ≤ -0.3 | Moderate negative | Noticeable downward trend | Absences and grades |
| -1 ≤ r < -0.7 | Strong negative | Clear downward trend | Price and demand |
| r = -1 | Perfect negative | As X increases, Y decreases proportionally | Distance and fuel remaining |
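The strength scale above can be encoded as a small lookup function. This is a sketch in Python (the function name is illustrative); thresholds follow the table, and values at exact boundaries such as r = 0.7 are assigned to the stronger band, matching "0.7 ≤ r < 1":

```python
def describe_correlation(r):
    """Map a Pearson r to the descriptive labels in the scale above."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude == 1.0:
        strength = "perfect"
    elif magnitude >= 0.7:
        strength = "strong"
    elif magnitude >= 0.3:
        strength = "moderate"
    elif magnitude > 0.0:
        strength = "weak"
    else:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction}"

print(describe_correlation(0.85))   # strong positive
print(describe_correlation(-0.45))  # moderate negative
```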
Pearson Correlation Coefficient

r = Σ(xᵢ - x̄)(yᵢ - ȳ) / √[Σ(xᵢ - x̄)² · Σ(yᵢ - ȳ)²]

Where:
- r = Pearson correlation coefficient (-1 to +1)
- xᵢ, yᵢ = individual data points
- x̄, ȳ = means of X and Y
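The formula translates directly into code. Below is a minimal pure-Python sketch (the function name is illustrative), applied to the ten sample data pairs listed at the top of the page:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: sum of cross-deviations over the
    square root of the product of the two sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# The sample data pairs from the table above
x = list(range(1, 11))
y = [2.3, 4.1, 5.8, 8.2, 10.1, 11.9, 14.2, 16.0, 18.1, 20.0]
print(f"r = {pearson_r(x, y):.4f}")  # very close to +1 for this near-linear data
```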
Correlation Does Not Imply Causation
The most important rule in statistics: correlation does not imply causation. Just because two variables are correlated doesn't mean one causes the other. There could be a third variable (confounder) causing both, or the correlation could be spurious (coincidence).
| Correlation Observed | Possible Explanation | Why Not Causation |
|---|---|---|
| Ice cream sales ↔ Drownings | Both increase in summer (temperature confound) | Ice cream doesn't cause drowning |
| Shoe size ↔ Reading ability | Both increase with age (age confound) | Bigger feet don't cause better reading |
| Pirates ↔ Global temperature | Coincidence (spurious) | Fewer pirates didn't cause warming |
| Smoking ↔ Lung cancer | Actual causation (established experimentally) | This one IS causal (proven) |
To establish causation: Use controlled experiments, randomized trials, or advanced causal inference methods—not just observational correlation.
R-Squared: Coefficient of Determination
R-squared (r²) is the square of the correlation coefficient. It represents the proportion of variance in one variable that's explained by the other. R² is easier to interpret as a percentage and is crucial in regression analysis.
| Correlation (r) | R-Squared (r²) | Interpretation |
|---|---|---|
| r = 0.9 | r² = 0.81 = 81% | 81% of Y's variance explained by X |
| r = 0.7 | r² = 0.49 = 49% | 49% of Y's variance explained by X |
| r = 0.5 | r² = 0.25 = 25% | 25% of Y's variance explained by X |
| r = 0.3 | r² = 0.09 = 9% | Only 9% of Y's variance explained |
| r = 0.1 | r² = 0.01 = 1% | Virtually no explanatory power |
Key insight: A "moderate" correlation of r = 0.5 only explains 25% of the variance. Even a "strong" r = 0.7 leaves 51% unexplained. This shows why multiple factors usually matter.
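The r → r² mapping in the table is a one-liner to reproduce:

```python
for r in (0.9, 0.7, 0.5, 0.3, 0.1):
    r_squared = r ** 2
    unexplained = 1 - r_squared
    print(f"r = {r:.1f}  ->  r² = {r_squared:.2f}  "
          f"({r_squared:.0%} explained, {unexplained:.0%} unexplained)")
```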
Types of Correlation Coefficients
Different situations call for different correlation measures. Pearson assumes linear relationships and continuous data; alternatives exist for other scenarios.
| Type | Use When | Range | Assumptions |
|---|---|---|---|
| Pearson (r) | Linear relationship, continuous data | -1 to +1 | Normality, linearity |
| Spearman (ρ) | Monotonic relationship, ordinal data | -1 to +1 | None (rank-based) |
| Kendall (τ) | Ordinal data, small samples | -1 to +1 | None (rank-based) |
| Point-Biserial | One binary, one continuous variable | -1 to +1 | Normality of continuous |
| Phi (φ) | Two binary variables | -1 to +1 | 2×2 table |
When to use Spearman: For non-linear monotonic relationships, ordinal data (rankings), or when outliers are present. It's based on ranks, making it robust to extreme values.
Testing Correlation Significance
A correlation might appear strong but be due to chance, especially with small samples. Statistical testing determines if a correlation is significantly different from zero.
| Sample Size (n) | Critical r (α = 0.05) | Interpretation |
|---|---|---|
| n = 10 | r = ±0.632 | Need strong correlation for significance |
| n = 20 | r = ±0.444 | Moderate correlation can be significant |
| n = 30 | r = ±0.361 | Smaller r can be significant |
| n = 50 | r = ±0.279 | Weak-moderate correlation significant |
| n = 100 | r = ±0.197 | Even weak correlations significant |
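The critical-r column can be reproduced by inverting the t-test below: r* = t* / √(t*² + n − 2), where t* is the two-tailed critical t at df = n − 2. A sketch with critical-t values hardcoded from standard tables (the standard library has no t-distribution inverse; with SciPy you could compute t* directly):

```python
import math

# Two-tailed critical t at alpha = 0.05, keyed by df = n - 2
# (values taken from standard t tables)
T_CRIT_05 = {8: 2.306, 18: 2.101, 28: 2.048, 48: 2.011, 98: 1.984}

def critical_r(n, t_table=T_CRIT_05):
    """Smallest |r| that reaches significance at alpha = 0.05 for sample size n."""
    df = n - 2
    t = t_table[df]
    return t / math.sqrt(t ** 2 + df)

for n in (10, 20, 30, 50, 100):
    print(f"n = {n:3d}: critical r = ±{critical_r(n):.3f}")
```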
T-Test for Correlation Significance

t = r √[(n - 2) / (1 - r²)]

Where:
- t = test statistic
- r = correlation coefficient
- n = sample size
- df = degrees of freedom (n - 2)
Limitations of Correlation
Correlation is powerful but has important limitations. Understanding these prevents misinterpretation of data.
| Limitation | Description | Solution |
|---|---|---|
| Only detects linear relationships | Can miss curved relationships | Plot data, use Spearman for monotonic |
| Sensitive to outliers | One extreme point can dominate r | Use Spearman, remove outliers |
| Doesn't imply causation | Third variables may explain relationship | Use controlled experiments |
| Range restriction | Limited range underestimates true r | Use full range of data |
| Sample size matters | Small samples can show spurious r | Test significance, get larger n |
Always visualize: A scatter plot reveals patterns that r alone cannot. Anscombe's quartet shows four datasets with identical r ≈ 0.82 but completely different relationships.
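A tiny numeric demonstration of the outlier problem: nine points on a perfect line give r = 1, yet a single extreme tenth point can even flip the sign (pure-Python sketch; the helper implements the standard Pearson formula):

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # perfectly linear: r = 1
print(f"without outlier: r = {pearson_r(x, y):+.3f}")

x_out = x + [10]
y_out = y + [-50]                 # one wild point dominates the sums
print(f"with outlier:    r = {pearson_r(x_out, y_out):+.3f}")
```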
Applications of Correlation
Correlation analysis is used across virtually every field that deals with data. Understanding applications helps contextualize what "strong" or "weak" correlations mean.
| Field | Application | Typical r Values |
|---|---|---|
| Psychology | Personality traits, test reliability | r = 0.3-0.5 often meaningful |
| Medicine | Risk factors, treatment outcomes | r = 0.2-0.4 can be clinically important |
| Finance | Stock correlations, portfolio diversification | r < 0.3 for diversification |
| Education | Predictors of academic success | r = 0.3-0.6 for standardized tests |
| Physics | Experimental relationships | r > 0.99 expected for physical laws |
| Social Science | Survey variables | r = 0.2-0.5 common |
Worked Examples
Calculating Pearson Correlation
Problem:
Calculate the correlation between study hours (X) and exam scores (Y) for 5 students: X = [2,3,5,7,8], Y = [65,70,75,85,90]
Solution Steps:
1. Calculate the means: x̄ = 5, ȳ = 77
2. Calculate the deviations: x - x̄ = [-3, -2, 0, 2, 3], y - ȳ = [-12, -7, -2, 8, 13]
3. Calculate the sum of products: Σ(x - x̄)(y - ȳ) = 36 + 14 + 0 + 16 + 39 = 105
4. Calculate the sums of squares: Σ(x - x̄)² = 26, Σ(y - ȳ)² = 144 + 49 + 4 + 64 + 169 = 430
5. Apply the formula: r = 105 / √(26 × 430) = 105 / 105.74 ≈ 0.993
Result:
r ≈ 0.993, a nearly perfect positive correlation. More study hours strongly predict higher scores (though remember: correlation ≠ causation).
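The arithmetic in these steps is easy to check with a short script (pure Python, no dependencies):

```python
import math

x = [2, 3, 5, 7, 8]
y = [65, 70, 75, 85, 90]

n = len(x)
mean_x = sum(x) / n   # 5.0
mean_y = sum(y) / n   # 77.0
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

print(f"sum of products = {sxy}")            # 105.0
print(f"sums of squares = {sxx}, {syy}")     # 26.0, 430.0
print(f"r = {r:.3f}")
```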
Interpreting R-Squared
Problem:
Height and weight have a correlation of r = 0.70 in a population. What percentage of weight variation is explained by height?
Solution Steps:
1. Calculate r-squared: r² = 0.70² = 0.49
2. Convert to a percentage: 49%
3. Interpret: height explains 49% of the variation in weight
4. The remaining 51% is due to other factors
Result:
R² = 49%. Height explains about half the variation in weight. Other factors (diet, exercise, genetics, age) account for the other half.
Testing Correlation Significance
Problem:
A study of 25 people finds r = 0.40 between exercise and happiness. Is this significant at α = 0.05?
Solution Steps:
1. Calculate the t-statistic: t = 0.40 × √[(25 - 2)/(1 - 0.40²)] = 0.40 × √(23/0.84) = 0.40 × 5.23 ≈ 2.09
2. Degrees of freedom: df = 25 - 2 = 23
3. Critical t (α = 0.05, two-tailed, df = 23): t* = 2.069
4. Compare: 2.09 > 2.069
Result:
t = 2.09 > critical value 2.069, so the correlation is statistically significant at the 0.05 level. The relationship is unlikely to be due to chance alone.
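The same test, scripted. This is a sketch: the critical value is hardcoded from a t table rather than computed, since the standard library has no t-distribution inverse.

```python
import math

r, n = 0.40, 25
df = n - 2
t_stat = r * math.sqrt(df / (1 - r ** 2))
t_crit = 2.069  # two-tailed, alpha = 0.05, df = 23 (from a t table)

print(f"t = {t_stat:.3f}, df = {df}")
print("significant" if abs(t_stat) > t_crit else "not significant")
```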
Tips & Best Practices
- ✓ Always create a scatter plot before calculating correlation—it reveals patterns r cannot show.
- ✓ Remember: correlation ≠ causation. Strong correlation may be due to confounding variables.
- ✓ Use Spearman's correlation for ordinal data, non-linear monotonic relationships, or when outliers exist.
- ✓ r² (R-squared) tells you the proportion of variance explained—often more interpretable than r.
- ✓ Larger samples give more reliable correlation estimates; small samples can show spurious correlations.
- ✓ Test statistical significance—a correlation of r = 0.3 may or may not be 'real' depending on sample size.
- ✓ Context matters: r = 0.3 is 'weak' in physics but may be important in psychology or medicine.
Last updated: 2026-01-22