Research 6 - Derivation of Chebyshev's inequality and its application to prove the (weak) LLN

Chebyshev's Inequality

In probability theory, Chebyshev's inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. In particular, the inequality states that no more than 1/k² of a distribution's values can be more than k standard deviations away from the mean. In other words, this means that at least 1 − 1/k² of the distribution's values lie within k standard deviations of the mean.

Chebyshev's inequality can be easily derived from Markov's inequality, which gives an upper bound on the probability that a non-negative random variable is greater than or equal to some positive constant.

Recall Markov's inequality:

P(X ≥ a) ≤ E(X)/a

where a > 0 and X is a non-negative random variable.
Chebyshev's inequality follows by applying Markov's inequality to the non-negative random variable (X − E(X))² and the constant a²:

P(|X − E(X)| ≥ a) = P((X − E(X))² ≥ a²) ≤ E[(X − E(X))²]/a² = Var(X)/a²


Note that, substituting a = kσ (with σ the standard deviation and k > 0), we obtain the familiar form:

P(|X − E(X)| ≥ kσ) ≤ σ²/(kσ)² = 1/k²

For instance, let's suppose we have access to a journal whose articles contain 1000 words on average, with a standard deviation of 200 words. The probability that an article has between 600 and 1400 words (i.e. within k = 2 standard deviations of the mean) must be at least 75%, because by Chebyshev's inequality there is no more than a 1/k² = 1/4 chance of falling outside that range.
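As a quick sanity check, here is a minimal Python sketch of this bound; the choice of an Exponential(1) distribution (mean 1, variance 1) is an illustrative assumption:

```python
import random

# Minimal sketch: empirically check Chebyshev's bound
# P(|X - mu| >= k*sigma) <= 1/k^2 on simulated data.
# The Exponential(1) distribution is an illustrative assumption.

def chebyshev_check(samples, mu, sigma, k):
    """Fraction of samples at least k standard deviations from the mean."""
    tail = sum(1 for x in samples if abs(x - mu) >= k * sigma)
    return tail / len(samples)

random.seed(42)
mu, sigma = 1.0, 1.0  # mean and std dev of Exponential(1)
samples = [random.expovariate(1.0) for _ in range(100_000)]

for k in (1.5, 2, 3):
    empirical = chebyshev_check(samples, mu, sigma, k)
    print(f"k={k}: empirical tail {empirical:.4f} <= bound {1 / k**2:.4f}")
```

The empirical tail fractions come out well below 1/k², as expected: Chebyshev's bound holds for any distribution with finite variance, so for a specific distribution it is often loose.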

Weak Law of Large Numbers 

In probability theory, the law of large numbers is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and it tends to become closer as more trials are performed.

Chebyshev's inequality can be used to prove the weak law of large numbers (also known as Khinchin's law). The law states that the sample average converges in probability towards the expected value. More formally, given i.i.d. random variables X₁, X₂, … with E(Xᵢ) = μ and sample average X̄ₙ = (X₁ + … + Xₙ)/n, we have, for any 𝝐 > 0:

lim(n→∞) P(|X̄ₙ − μ| > 𝝐) = 0

that is, X̄ₙ → μ in probability as n → ∞.
In other words, it states that for any nonzero margin specified, no matter how small, with a sufficiently large sample there will be a very high probability that the average of the observations is close to the expected value. It is worth mentioning that this law does not apply to every distribution; for instance, it does not apply to the Cauchy distribution. Let the random numbers be the tangent of an angle uniformly distributed between −90° and +90°. The median is zero, but the expected value does not exist, and indeed the average of n such variables has the same distribution as a single such variable; it does not tend toward zero (or any other value) as n goes to infinity.
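A small simulation makes this visible (a sketch; the sample sizes are arbitrary illustration choices). Generating standard Cauchy values as the tangent of a uniform angle, the running average keeps fluctuating instead of settling:

```python
import math
import random

# Sketch: running averages of Cauchy samples do not settle,
# unlike averages of samples from a distribution with a finite mean.

random.seed(0)

def cauchy_sample():
    # Tangent of an angle uniform in (-90°, +90°) is standard Cauchy.
    return math.tan(random.uniform(-math.pi / 2, math.pi / 2))

total = 0.0
for n in range(1, 100_001):
    total += cauchy_sample()
    if n in (10, 100, 1_000, 10_000, 100_000):
        print(f"n={n:>6}: running average = {total / n:+.3f}")
```

Running this shows the average jumping around even at n = 100,000, because occasional huge samples dominate the sum no matter how many terms have already been averaged.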

The weak LLN states that, for a specific large n, the average X̄ₙ is likely to be close to μ. Thus, it leaves open the possibility that |X̄ₙ − μ| > 𝝐 happens an infinite number of times.
The strong LLN shows that this (almost surely) will not occur. In particular, it implies that, with probability 1, for any 𝝐 > 0 the inequality |X̄ₙ − μ| < 𝝐 holds for all large enough n.


In order to prove the WLLN with the use of Chebyshev's inequality, assume X₁, …, Xₙ are i.i.d. with finite mean μ and finite variance σ². By linearity of expectation, E(X̄ₙ) = μ, and by independence, Var(X̄ₙ) = σ²/n. Applying Chebyshev's inequality to X̄ₙ then gives, for any 𝝐 > 0:

P(|X̄ₙ − μ| ≥ 𝝐) ≤ Var(X̄ₙ)/𝝐² = σ²/(n𝝐²)

Since the right-hand side tends to 0 as n → ∞, so does P(|X̄ₙ − μ| ≥ 𝝐), which is exactly the convergence in probability of X̄ₙ to μ.
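As a closing sanity check, here is a minimal Python sketch (the Uniform(0, 1) distribution, the value of 𝝐, and the number of trials are illustrative assumptions) that compares the empirical probability P(|X̄ₙ − μ| ≥ 𝝐) with the Chebyshev bound σ²/(n𝝐²) as n grows:

```python
import random

# Sketch: empirically check that P(|sample mean - mu| >= eps)
# stays below the Chebyshev bound sigma^2 / (n * eps^2) and
# shrinks as n grows. Uniform(0, 1) is an illustrative choice:
# mu = 0.5, sigma^2 = 1/12.

random.seed(1)
mu, var, eps, trials = 0.5, 1 / 12, 0.05, 2_000

for n in (10, 100, 1_000):
    hits = 0
    for _ in range(trials):
        mean = sum(random.random() for _ in range(n)) / n
        if abs(mean - mu) >= eps:
            hits += 1
    print(f"n={n:>5}: empirical {hits / trials:.4f} <= bound {var / (n * eps**2):.4f}")
```

Both columns shrink toward zero as n grows, which is the convergence in probability that the WLLN asserts; the bound itself decays like 1/n, exactly as the proof above predicts.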