5- Covariance 

In probability theory and statistics, covariance is a measure of the joint variability of two random variables. If the greater values of one variable mainly correspond with the greater values of the other variable, and the same holds for the lesser values (i.e., the variables tend to show similar behavior), the covariance is positive. In the opposite case, when the greater values of one variable mainly correspond to the lesser values of the other (i.e., the variables tend to show opposite behavior), the covariance is negative. The sign of the covariance therefore shows the tendency in the linear relationship between the variables.

For two jointly distributed real-valued random variables X and Y, the covariance is defined as the expected value (or mean) of the product of their deviations from their individual expected values:

cov(X, Y) = E[(X − E(X))(Y − E(Y))]

where E(X) is the expected value of X, also known as the mean of X.

The covariance is also sometimes denoted σ_XY or σ(X, Y), in analogy to variance. By using the linearity property of expectations, this can be simplified to the expected value of their product minus the product of their expected values:

cov(X, Y) = E[(X − E(X))(Y − E(Y))]

= E[XY − X E(Y) − E(X) Y + E(X) E(Y)]

= E[XY] − E(X)E(Y) − E(X)E(Y) + E(X)E(Y)

= E[XY] − E(X)E(Y)

Covariance of discrete random variables:

If the random variable pair (X, Y) can take on the values (x_i, y_i) for i = 1, ..., n, with equal probabilities p_i = 1/n, then the covariance can be equivalently written in terms of the means E(X) and E(Y) as

cov(X, Y) = (1/n) Σ_{i=1}^{n} (x_i − E(X))(y_i − E(Y))

More generally, if there are n possible realizations of (X, Y), namely (x_i, y_i) but with possibly unequal probabilities p_i for i = 1, ..., n, then the covariance is

cov(X, Y) = Σ_{i=1}^{n} p_i (x_i − E(X))(y_i − E(Y))

Suppose that X and Y have the following joint probability mass function:

(x, y) ∈ S = {(5,8), (6,8), (7,8), (5,9), (6,9), (7,9)} with probabilities respectively {0, 0.4, 0.1, 0.3, 0, 0.2}

Then we can deduce that X can take on three values (5, 6 and 7) with probabilities respectively (0.3, 0.4, 0.3), and Y can take on two (8 and 9) with probabilities respectively (0.5, 0.5).

So

E(X) = 5(0.3) + 6(0.4) + 7(0.1 + 0.2) = 6

and

E(Y) = 8(0.4 + 0.1) + 9(0.3 + 0.2) = 8.5

Then,

cov(X, Y) = (0)(5 − 6)(8 − 8.5) + (0.4)(6 − 6)(8 − 8.5)
+ (0.1)(7 − 6)(8 − 8.5) + (0.3)(5 − 6)(9 − 8.5)
+ (0)(6 − 6)(9 − 8.5) + (0.2)(7 − 6)(9 − 8.5) = −0.1
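The same arithmetic can be checked in a few lines of Python, using the joint pmf from this example:

```python
# The joint pmf from the worked example: P(X=x, Y=y) for each pair.
pmf = {(5, 8): 0.0, (6, 8): 0.4, (7, 8): 0.1,
       (5, 9): 0.3, (6, 9): 0.0, (7, 9): 0.2}

ex = sum(p * x for (x, _), p in pmf.items())   # E(X)
ey = sum(p * y for (_, y), p in pmf.items())   # E(Y)
cov = sum(p * (x - ex) * (y - ey) for (x, y), p in pmf.items())

assert abs(ex - 6.0) < 1e-12    # E(X) = 6
assert abs(ey - 8.5) < 1e-12    # E(Y) = 8.5
assert abs(cov + 0.1) < 1e-12   # cov(X, Y) = -0.1
```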

The variance is a special case of the covariance in which the two variables are identical (that is, in which one variable always takes the same value as the other):

cov(X, X) = Var(X) = E[(X − E(X))^2]

Covariance of linear combinations

If 𝑋,π‘Œ,𝑉,π‘Žπ‘›π‘‘ π‘Š are  real-valued  random  variables  and π‘Ž,𝑏,𝑐,𝑑  are real-valued constants,

then the following facts are a consequence of the definition of covariance:

cov(X, a) = 0

cov(X, X) = Var(X)

cov(X, Y) = cov(Y, X)

cov(aX, bY) = ab cov(X, Y)

cov(X + a, Y + b) = cov(X, Y)

π‘π‘œπ‘£(π‘Žπ‘‹+π‘π‘Œ,𝑐𝑋+𝑑𝑉)=π‘Žπ‘ π‘π‘œπ‘£(𝑋,π‘Š)+π‘Žπ‘‘ π‘π‘œπ‘£(𝑋,𝑉)+𝑏𝑐 π‘π‘œπ‘£(π‘Œ,π‘Š)+𝑏𝑑 π‘π‘œπ‘£(π‘Œ,𝑉)

For a sequence X_1, ..., X_n of real-valued random variables and constants a_1, ..., a_n, we have

Var(Σ_{i=1}^{n} a_i X_i) = Σ_{i=1}^{n} a_i^2 Var(X_i) + 2 Σ_{i<j} a_i a_j cov(X_i, X_j)
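The variance of a linear combination expands as Var(Σ a_i X_i) = Σ_{i,j} a_i a_j cov(X_i, X_j), since the variance of the sum is the covariance of the sum with itself. A numerical sketch of this identity, again on empirical covariances with arbitrary data:

```python
# Check Var(sum_i a_i X_i) = sum_{i,j} a_i a_j cov(X_i, X_j)
# on empirical (population-style) covariances; data and coefficients are arbitrary.
import random

def cov(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((s - mu) * (t - mv) for s, t in zip(u, v)) / n

random.seed(2)
m, n = 3, 500
xs = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
a = [1.0, -2.0, 0.5]

combo = [sum(a[i] * xs[i][k] for i in range(m)) for k in range(n)]
var_combo = cov(combo, combo)  # Var of the linear combination
expanded = sum(a[i] * a[j] * cov(xs[i], xs[j])
               for i in range(m) for j in range(m))
assert abs(var_combo - expanded) < 1e-9
```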