Department of Economics
Ethiraj College for Women
Chennai 600 008
The method of correlation was developed by Francis Galton in 1885.
Correlation is a statistical technique that can reveal whether and how
strongly pairs of variables are associated.
Correlation is a measure of the strength of the linear relationship between
two quantitative variables.
Correlation is used to measure the closeness of the relationship between
variables. Example: price and demand.
Simpson and Kafka
“Correlation analysis deals with the association between two or more variables”.
Ya-Lun Chou
“Correlation analysis attempts to determine the degree of relationship between variables”.
Croxton and Cowden
“When the relationship is of a quantitative nature, the appropriate statistical tool for
discovering and measuring the relationship and expressing it in a brief formula is known as
correlation”.
Correlation measures the degree of relationship existing between the variables. It
measures the strength of the linear relationship.
Correlation analysis contributes to the understanding of economic behaviour.
Correlation analysis helps business executives to estimate costs, prices and other variables.
The effect of correlation is to reduce the range of uncertainty. The prediction based on
correlation analysis is likely to be more reliable and near to reality.
It does not tell us anything about a cause-and-effect relationship.
It establishes only covariation. The correlation may be due to pure chance, especially
in a small sample.
The variables may be mutually influencing each other so that neither can be
designated as the cause and the other the effect.
Whether the correlation between the variables is positive or negative depends on
the direction of change.
Two variables are positively correlated when they move together in the same
direction. Example: the quantity supplied increases as the price increases.
A positive coefficient of correlation lies between 0 and +1.
X 10 12 15 18 20
Y 15 20 22 25 37
A negative correlation is a relationship between two variables in which an increase in one
variable is associated with a decrease in the other. Example: as the price of a product decreases,
the quantity demanded increases.
It is an inverse relation between the variables. A negative coefficient of correlation lies between 0 and -1.
A zero correlation exists when there is no relationship between two variables. Example: there is no
relationship between the amount of tea drunk and the level of intelligence.
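Data for the negative correlation example:
X 100 90 60 40 30
Y 10 20 30 40 50
[Figures: scatter diagrams with fitted trend lines]
Positive correlation: y = 1.8382x - 3.7735, R² = 0.8485
Negative correlation: y = -0.5108x + 62.688, R² = 0.9704
Zero correlation (amount of tea drunk): y = 0.1042x + 72.271, R² = 0.0001
Source: Primary Data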
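The trend-line equations and R² values quoted with the charts can be checked against the two X, Y series given in the text. The following is a minimal sketch, assuming the chart lines are ordinary least-squares fits; the function name fit_trend_line is illustrative and not from the source.

```python
def fit_trend_line(xs, ys):
    """Least-squares line y = slope * x + intercept, together with R squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross products of deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# The two data series from the text
positive = fit_trend_line([10, 12, 15, 18, 20], [15, 20, 22, 25, 37])
negative = fit_trend_line([100, 90, 60, 40, 30], [10, 20, 30, 40, 50])

print("positive: slope=%.4f intercept=%.4f R2=%.4f" % positive)
print("negative: slope=%.4f intercept=%.4f R2=%.4f" % negative)
```

The sign of the slope gives the direction of the relationship, and R² is the square of the correlation coefficient for a two-variable fit of this kind.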
The correlation is said to be simple when only two variables are studied.
The correlation is said to be multiple when three or more variables are studied
simultaneously. Example: the study of the relationship between the yield of wheat per
acre and the amounts of fertilizer and rainfall.
In partial correlation, more than two variables are studied, but only two of
them are considered to be influencing each other, while the effect of the other
influencing variable is kept constant. Example: the study of the relationship between the
yield and the fertilizers used during particular periods, with rainfall held constant, is a case of partial correlation.
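One common way to hold a third variable constant is the first-order partial correlation coefficient, computed from the three simple correlations. The sketch below assumes that standard formula; the three correlation values are hypothetical and chosen only for illustration, since the text gives no data for this example.

```python
import math


def partial_correlation(r12, r13, r23):
    """Correlation between variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))


# Hypothetical simple correlations (not from the source):
r_yield_fertilizer = 0.8      # yield vs fertilizer
r_yield_rainfall = 0.6        # yield vs rainfall
r_fertilizer_rainfall = 0.5   # fertilizer vs rainfall

r_partial = partial_correlation(
    r_yield_fertilizer, r_yield_rainfall, r_fertilizer_rainfall
)
print(round(r_partial, 4))
```

With these assumed inputs the partial coefficient comes out lower than the simple correlation of 0.8, because part of the yield and fertilizer association is shared with rainfall.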
The correlation is linear when the amount of change in one variable tends to bear a
constant ratio to the amount of change in the other variable. That is, the ratio of
change between the variables is the same throughout.
The correlation is called non-linear or curvilinear when the amount of change in
one variable does not bear a constant ratio to the amount of change in the other
variable. Example: if the amount of fertilizer is doubled, the yield of wheat would not
necessarily be doubled.
Example of linear correlation:
X 10 20 30 40 50
Y 20 40 60 80 100
Fitted line: y = 2x, R² = 1
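The constant-ratio idea can be checked directly by dividing each change in Y by the matching change in X. This is a small sketch: the linear series is the one given in the text, while the curvilinear series is hypothetical and included only for contrast.

```python
def change_ratios(xs, ys):
    """Ratio of the change in Y to the change in X between consecutive pairs."""
    pairs = list(zip(xs, ys))
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pairs, pairs[1:])]


# Series from the text: every step in X of 10 goes with a step in Y of 20
linear = change_ratios([10, 20, 30, 40, 50], [20, 40, 60, 80, 100])

# Hypothetical curvilinear series (not from the source): Y rises by shrinking steps
curved = change_ratios([10, 20, 30, 40, 50], [20, 35, 45, 50, 52])

print(linear)
print(curved)
```

A single repeated ratio indicates a linear relationship; ratios that keep changing indicate a curvilinear one.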
Scatter Diagram Method
Karl Pearson’s Coefficient of Correlation Method
Spearman’s Rank Correlation Method
Concurrent Deviation Method
Method of Least Squares
The values of the independent series are plotted on the X axis and those of the dependent
series on the Y axis of the graph paper.
The graph lines of the two series both move in an upward direction -
Positive Correlation.
The graph line of one series moves upward from left to right and that of
the other series moves downward from left to right - Negative Correlation.
When the pairs of values are plotted on graph paper, a graph of dots is obtained. It is called
a scatter diagram or dotogram.
When the dots lie on a straight line that rises from the origin at a 45° angle to the
X axis - Perfect Positive Correlation.
If the dots lie on a straight line that falls from left to right
at a 45° angle to the X axis - Perfect Negative Correlation.
It is a very simple method of studying the correlation between two variables
It shows whether the values of the variables are related or not
A scatter diagram indicates whether the relationship is positive or negative
A scatter diagram does not measure the precise extent of correlation
It gives only an approximate idea of the relationship
It is only a qualitative expression of the quantitative change
Karl Pearson’s Coefficient of Correlation is used to calculate the degree and direction of
the relationship between linearly related variables.
Pearson’s method is known as the Pearson Coefficient of Correlation. It is denoted by “r”.
Pearson’s coefficient of correlation can be transformed into the following formula, where
x = X - mean of X and y = Y - mean of Y are the deviations from the means:
r = Ʃxy / √(Ʃx² × Ʃy²)
Calculate Karl Pearson’s coefficient of correlation from the following data and interpret the result.
Roll No. of Students 1 2 3 4 5
Marks in Accountancy 48 35 17 23 47
Marks in Statistics 45 20 40 25 45
Roll No.   X    x = X - 34   x²    Y    y = Y - 35   y²    xy
1          48   14           196   45   10           100   140
2          35   1            1     20   -15          225   -15
3          17   -17          289   40   5            25    -85
4          23   -11          121   25   -10          100   110
5          47   13           169   45   10           100   130
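The worked table can be reproduced in a few lines. This is a minimal sketch, assuming the deviation form of Pearson's formula, r = Ʃxy / √(Ʃx² × Ʃy²), with deviations taken from the actual means; the function and variable names are illustrative.

```python
import math

accountancy = [48, 35, 17, 23, 47]
statistics_marks = [45, 20, 40, 25, 45]


def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]   # the x column of the table
    dev_y = [y - mean_y for y in ys]   # the y column of the table
    sum_xy = sum(a * b for a, b in zip(dev_x, dev_y))
    sum_x2 = sum(a * a for a in dev_x)
    sum_y2 = sum(b * b for b in dev_y)
    return sum_xy / math.sqrt(sum_x2 * sum_y2)


r = pearson_r(accountancy, statistics_marks)
print(round(r, 4))
```

The column totals work out to Ʃx² = 776, Ʃy² = 550 and Ʃxy = 280, giving r of about 0.43, which can be read as a moderate positive correlation between the marks in the two subjects.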
WHEN DEVIATIONS ARE FROM AN ASSUMED MEAN
Calculate the coefficient of correlation and calculate the probable error.
X     dx = X - 69   dx²    Y     dy = Y - 112   dy²    dxdy
78    9             81     125   13             169    117
89    20            400    137   25             625    500
99    30            900    156   44             1936   1320
60    -9            81     112   0              0      0
59    -10           100    107   -5             25     50
79    10            100    136   24             576    240
68    -1            1      123   11             121    -11
61    -8            64     108   -4             16     32
ƩX = 593   Ʃdx = 41   Ʃdx² = 1727   ƩY = 1004   Ʃdy = 108   Ʃdy² = 3468   Ʃdxdy = 2248
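The totals in the table can be turned into r with the assumed-mean form of Pearson's formula. The sketch below assumes the usual textbook version, r = (N Ʃdxdy - Ʃdx Ʃdy) / √[(N Ʃdx² - (Ʃdx)²)(N Ʃdy² - (Ʃdy)²)], using the assumed means 69 and 112 from the column headings; variable names are illustrative.

```python
import math

x = [78, 89, 99, 60, 59, 79, 68, 61]
y = [125, 137, 156, 112, 107, 136, 123, 108]
assumed_mean_x = 69
assumed_mean_y = 112

# Deviations from the assumed means, as in the dx and dy columns
dx = [v - assumed_mean_x for v in x]
dy = [v - assumed_mean_y for v in y]

n = len(x)
sum_dx = sum(dx)
sum_dy = sum(dy)
sum_dx2 = sum(d * d for d in dx)
sum_dy2 = sum(d * d for d in dy)
sum_dxdy = sum(a * b for a, b in zip(dx, dy))

# The correction terms remove the effect of not using the true means
numerator = n * sum_dxdy - sum_dx * sum_dy
denominator = math.sqrt(
    (n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2)
)
r = numerator / denominator
print(round(r, 4))
```

Because the formula corrects for the choice of assumed mean, any other pair of assumed means would give the same value of r.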
Conditions of Probable Error
The data must approximate to the bell-shaped curve (Normal Frequency Curve).
The statistical measure for which the probable error is computed must have been calculated from
a sample.
The sample items must be selected in an unbiased manner and must be independent of
each other.
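The Probable Error of the Correlation Coefficient helps in determining the accuracy and
reliability of the value of the coefficient in so far as it depends on random
sampling.
Probable Error = 0.6745 × (1 - r²) / √N, where r is the coefficient of correlation and N is the number of pairs of observations.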
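For the eight pairs in the assumed-mean example, the probable error can be sketched as follows. This assumes the standard textbook formula P.E. = 0.6745 × (1 - r²) / √N and recomputes r from the column totals given in the table; the comparison of r with 6 × P.E. is a common textbook rule of thumb, not something stated in the text.

```python
import math


def probable_error(r, n):
    """Probable error of r, assuming P.E. = 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)


# Column totals from the assumed-mean table
n = 8
sum_dx, sum_dy = 41, 108
sum_dx2, sum_dy2, sum_dxdy = 1727, 3468, 2248

r = (n * sum_dxdy - sum_dx * sum_dy) / math.sqrt(
    (n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2)
)
pe = probable_error(r, n)

print(round(r, 4), round(pe, 4))
# Rule of thumb (assumption): r is treated as significant when r > 6 * P.E.
print(r > 6 * pe)
```

Here r is about 0.97 and the probable error about 0.014, so r is far larger than six times its probable error.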
•Spearman’s Rank Correlation Coefficient is a technique which can be used to
summarise the strength and direction (negative or positive) of a relationship between
two variables. The result will always lie between +1 and -1.
•R = 1 - 6ƩD² / (N(N² - 1))
•Where R denotes the Rank Correlation Coefficient
•D refers to the difference in rank between paired items in the two series
•N refers to the number of pairs of observations
•Rank Correlation (when ranks are not given)
•Ranks can be assigned by taking either the highest value as 1 or the lowest value as 1
The value of this coefficient of correlation lies between +1 and -1.
The sum of the differences between the corresponding ranks is zero, i.e. ∑d = 0.
It is independent of the nature of the distribution from which the sample data are collected for
the calculation of the coefficient.
It is calculated on the basis of the ranks of the individual items rather than their actual
values.
Its result equals the result of Karl Pearson’s coefficient of correlation unless there is
repetition of any rank. This is because Spearman’s correlation is nothing more than
Pearson’s coefficient of correlation between the ranks.
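These properties can be illustrated with the eight X, Y pairs from the assumed-mean example, which contain no repeated values. The sketch assumes the usual formula R = 1 - 6ƩD² / (N(N² - 1)) and ranks the highest value as 1; the helper names are illustrative.

```python
import math


def ranks_desc(values):
    """Rank the values with the highest as rank 1 (assumes no repeated values)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def pearson_r(xs, ys):
    """Pearson's r from deviations about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


x = [78, 89, 99, 60, 59, 79, 68, 61]
y = [125, 137, 156, 112, 107, 136, 123, 108]

rank_x = ranks_desc(x)
rank_y = ranks_desc(y)
d = [rx - ry for rx, ry in zip(rank_x, rank_y)]

n = len(x)
sum_d2 = sum(di ** 2 for di in d)
spearman = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum(d))                                 # rank differences sum to zero
print(round(spearman, 4))
print(round(pearson_r(rank_x, rank_y), 4))    # Pearson's r applied to the ranks
```

With no repeated ranks, the rank formula and Pearson's r computed on the ranks give the same value, which is the last property listed above.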