Covariance and correlation
To "measure" the tendency of two variables to vary proportionally, the concept of covariance is used; its name derives from its relationship with the formula for variance: instead of the square of the deviation from the mean, we take the product of the two deviations:
Var(X) = mean( (X–mean(X))^2 )
Var(Y) = mean( (Y–mean(Y))^2 )
covariance: Cov(X,Y) = mean( (X–mean(X))·(Y–mean(Y)) )
It can be interpreted as an indicator whose absolute value decreases as the points tend to be arranged with a vertical or horizontal symmetry, and grows as the points tend to be arranged along an oblique line. In fact the terms of the summation (inside the mean) represent "signed" areas of rectangles whose sides are the "signed" distances of the points' coordinates from the coordinates of the center of gravity. In the figure below on the left (horizontal symmetry) the terms of the sum cancel each other out two by two, so the covariance is zero. If the cloud of points is squashed obliquely, the compensation becomes only partial. In the case of the figure on the right (X and Y in a linear relation) there is no compensation (all terms are positive). The sign of the covariance equals the sign of the slope of the line along which the points tend to be arranged.
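The signed-area reading can be checked numerically. Here is a minimal sketch (in Python, for comparison with the R session later in this section; the two point clouds are made-up illustrations, not data from the text):

```python
def cov(xs, ys):
    # theoretical covariance: mean of the products of signed deviations
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# horizontally symmetric cloud: the signed areas cancel two by two
print(cov([-2, -1, 1, 2], [2, 1, 1, 2]))   # 0.0
# points along an oblique line y = 2x: all terms positive
print(cov([1, 2, 3], [2, 4, 6]))           # ≈ 1.3333
```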
Another possible interpretation is based on the observation that Cov(X,Y) = mean(X·Y) – mean(X)·mean(Y): the covariance measures how much the mean of the product deviates from the product of the means.
To disregard the units of measurement in which X and Y are expressed (and to pass from an "area" to a pure number), the covariance is normalized by dividing it by the standard deviations (i.e. the square roots of the variances) of X and Y, introducing the:
correlation coefficient: r_X,Y = Cov(X,Y) / ( Sd(X)·Sd(Y) ) = Cov(X,Y) / ( √Var(X)·√Var(Y) )
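As a quick check that this normalization really yields a pure number, here is a Python sketch of the theoretical (divide-by-N) formulas, applied to the same five observations used in the R session below:

```python
import math

def corr(xs, ys):
    # theoretical correlation: covariance over the product of standard deviations
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
    sdx = math.sqrt(sum((a - mx) ** 2 for a in xs) / n)
    sdy = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
    return cov / (sdx * sdy)

x = [220, 300, 210, 350, 270]
y = [32, 38, 27, 50, 25]
print(corr(x, y))                        # ≈ 0.8239751
# r is a pure number: rescaling X (a change of units) leaves it unchanged
print(corr([a / 1000 for a in x], y))
```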
These values refer to the situation in which X and Y range over "all possible values". In the experimental case (where I have N observations) the estimate of the variance (and of the standard deviation) is slightly higher: the mean of the squared deviations is multiplied by N/(N–1), i.e. the sum is divided by N–1 instead of N:
var(X) = mean( (X–mean(X))^2 )·N/(N–1)
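The two conventions can be compared with Python's statistics module (pvariance divides by N, variance by N–1), using the same data as the R session below:

```python
from statistics import pvariance, variance

x = [220, 300, 210, 350, 270]
print(pvariance(x))   # 2680 -- divide by N ("all possible values")
print(variance(x))    # 3350 -- divide by N-1 (experimental estimate)
```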
Here are the commands in R. The lowercase built-in commands (var, sd, cov) compute the experimental (N–1) values; the versions with a capitalized initial (Var, Sd), which are not built into R and are defined at the top of the session, compute the theoretical values (division by N).
# Var and Sd are not built into R: define the theoretical (divide-by-N) versions
Var = function(v) mean( (v-mean(v))^2 )
Sd = function(v) sqrt(Var(v))
x = c(220,300,210,350,270); y = c(32,38,27,50,25)
n = length(x); n                                               # 5
mean(x); mean(y)                                               # 270 34.4
mean( (x-mean(x))^2 ); mean( (y-mean(y))^2 )                   # 2680 81.04
Var(x); Var(y)                                                 # 2680 81.04
mean( (x-mean(x))^2 )*n/(n-1); mean( (y-mean(y))^2 )*n/(n-1)   # 3350 101.3
var(x); var(y)                                                 # 3350 101.3
mean( (x-mean(x))*(y-mean(y)) )                                # 384
mean( (x-mean(x))*(y-mean(y)) )*n/(n-1)                        # 480
cov(x,y)                                                       # 480
Cov = cov(x,y)*(length(x)-1)/length(x); Cov                    # 384
sd(x); sd(y)                                                   # 57.87918 10.06479
Sd(x); Sd(y)                                                   # 51.76872 9.002222
In both the theoretical and experimental cases, the same value is obtained for the correlation coefficient:
cov(x,y)/(sd(x)*sd(y))                           # 0.8239751
cov(x,y)*(length(x)-1)/length(x)/(Sd(x)*Sd(y))   # 0.8239751
cor(x,y)                                         # 0.8239751
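The reason is that the N (or N–1) factors cancel in the ratio. This can be made explicit (a Python sketch, same data as above):

```python
import math

x = [220, 300, 210, 350, 270]
y = [32, 38, 27, 50, 25]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
ss_x = sum((a - mx) ** 2 for a in x)                 # sum of squared deviations of X
ss_y = sum((b - my) ** 2 for b in y)                 # sum of squared deviations of Y
sp = sum((a - mx) * (b - my) for a, b in zip(x, y))  # sum of products of deviations

# divide everything by N (theoretical) or by N-1 (experimental):
# the common factor cancels in the ratio, so r comes out identical
r_theo = (sp / n) / (math.sqrt(ss_x / n) * math.sqrt(ss_y / n))
r_exp = (sp / (n - 1)) / (math.sqrt(ss_x / (n - 1)) * math.sqrt(ss_y / (n - 1)))
print(r_theo, r_exp)   # ≈ 0.8239751 twice
```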