Research 4 - Boole's Inequality

Boole's inequality, also known as the union bound, states that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. More formally, given a set of events A₁, A₂, A₃, ..., Aₙ, the following inequality holds:

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) ≤ P(A₁) + P(A₂) + ... + P(Aₙ)
This inequality may be proved in several ways; here a proof by induction is provided.

Proof

For the base case n = 1, we trivially have P(A₁) ≤ P(A₁).

For the inductive step, assume the inequality holds for a collection of n events; we must show that it also holds for n + 1 events.
Since P(A ∪ B) = P(A) + P(B) - P(A ∩ B) and, by the associative property of the union operator, the first n events can be grouped into a single event, we have:

P(A₁ ∪ ... ∪ Aₙ₊₁) = P(A₁ ∪ ... ∪ Aₙ) + P(Aₙ₊₁) - P((A₁ ∪ ... ∪ Aₙ) ∩ Aₙ₊₁)

By the first axiom of probability, the last term of the equality is greater than (or equal to) 0, and then

P(A₁ ∪ ... ∪ Aₙ₊₁) ≤ P(A₁ ∪ ... ∪ Aₙ) + P(Aₙ₊₁) ≤ P(A₁) + ... + P(Aₙ) + P(Aₙ₊₁)

where the second inequality follows from the induction hypothesis. This is exactly the statement for n + 1 events, thus completing the proof.
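
As a quick numerical sanity check, the bound can also be verified by simulation. The following Python sketch is illustrative only: the sample space (a fair six-sided die) and the four events are made up for the example.

import random

# Monte Carlo check of the union bound on made-up events over a fair die.
random.seed(0)
events = [{1, 2}, {2, 3}, {5}, {1, 6}]  # each event is a set of outcomes
trials = 100_000

union_hits = 0
single_hits = [0] * len(events)
for _ in range(trials):
    roll = random.randint(1, 6)
    if any(roll in event for event in events):
        union_hits += 1
    for i, event in enumerate(events):
        if roll in event:
            single_hits[i] += 1

p_union = union_hits / trials
union_bound = sum(hits / trials for hits in single_hits)
# P(union) is about 5/6 ~ 0.833, while the bound is 7/6 ~ 1.167
print(p_union, "<=", union_bound)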
The discussed inequality may be generalised to find both upper and lower bounds on the probability of finite unions of events. These bounds are known as the Bonferroni inequalities.

Define

S₁ = P(A₁) + P(A₂) + ... + P(Aₙ)

and

S₂ = Σ P(Aᵢ ∩ Aⱼ), where the sum runs over all pairs of indices i < j,

as well as, in general,

Sₖ = Σ P(Aᵢ₁ ∩ Aᵢ₂ ∩ ... ∩ Aᵢₖ), where the sum runs over all k-tuples of indices i₁ < i₂ < ... < iₖ,

which is obvious for k = 1 and k = 2, and holds for any k ∈ {3, 4, ..., n}.
Then, for odd k in {1, 3, ..., n}:

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) ≤ S₁ - S₂ + S₃ - ... + Sₖ
and for even k ∈ {2, 4, ..., n}:

P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) ≥ S₁ - S₂ + S₃ - ... - Sₖ

Note that for k = 1 the first inequality reduces to Boole's inequality, while for k = n the bound holds with equality, recovering the inclusion-exclusion formula.

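These bounds are easy to check exactly on a small discrete example. The Python sketch below (the die and the three events are, again, arbitrary choices for illustration) computes each Sₖ by brute force and prints the resulting upper and lower bounds together with the exact probability of the union.

from itertools import combinations
from fractions import Fraction

# Arbitrary illustrative events over a fair six-sided die.
events = [{1, 2, 3}, {2, 4}, {3, 4, 5}]
n = len(events)

def prob(outcomes):
    return Fraction(len(outcomes), 6)

# S[k-1] holds S_k: the sum of P(A_i1 ∩ ... ∩ A_ik) over all k-subsets.
S = [sum(prob(set.intersection(*combo)) for combo in combinations(events, k))
     for k in range(1, n + 1)]

exact = prob(set.union(*events))
for k in range(1, n + 1):
    partial_sum = sum((-1) ** j * S[j] for j in range(k))  # S1 - S2 + ... ± Sk
    relation = "<=" if k % 2 == 1 else ">="
    print(f"k = {k}: P(union) = {exact} {relation} {partial_sum}")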

Sampling Distribution of the Mean

A sampling distribution is defined as the probability distribution of a statistic (e.g. the mean) over repeated random samples taken from the same population. Sampling distributions are at the very core of inferential statistics, and they are used in almost every field where statistics is applied.

For instance, consider a social scientist who would like to estimate a particular parameter for a given population. Since asking questions to all the individuals would be too time consuming, he will probably end up analysing just a random sample taken from the population in question.
However, different random samples frequently lead to slightly different results, and the solution to this issue is to figure out how much they vary. Let's suppose, therefore, that the social scientist won't pick just one random sample but will instead pick several samples of the same size and then compute the statistic of interest for each of the chosen samples.
The values of the statistic will be different for the majority of the samples. However, taking the mean as the statistic of interest, the expected value of the sampling distribution will be equal to the parameter to be estimated: the sample mean is an unbiased estimator of the population mean.
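To see why, note that each observation Xᵢ of a random sample of size n has expected value 𝝻, so by linearity of expectation (a standard derivation, not specific to this example):

E[X̄] = E[(X₁ + X₂ + ... + Xₙ)/n] = (E[X₁] + E[X₂] + ... + E[Xₙ])/n = n𝝻/n = 𝝻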


Consider as an example a population composed of six pumpkins, whose weights in pounds are known, and suppose you are asked to estimate the average weight by taking a random sample.


Pumpkin   A   B   C   D   E   F
Weight   19  14  15   9  10  17


The population arithmetic mean is 𝝻 = (19 + 14 + 15 + 9 + 10 + 17)/6 = 14.
The sampling distribution of the sample mean for samples of size 2 is obtained by enumerating all 15 possible samples of that size (e.g. {A,B} or {D,E}) and computing the statistic of interest for each of them; it is summarised in the next table.

Sample mean   9.5   11.5  12.0  12.5  13.0  13.5  14.0  14.5  15.5  16.0  16.5  17.0  18.0
Probability   1/15  1/15  2/15  1/15  1/15  1/15  1/15  2/15  1/15  1/15  1/15  1/15  1/15

As can be seen, most of the individual sample means are not equal to the expected value of 14. However, the distribution of the sample mean is centred on 14 and roughly symmetric around it even though the sample size is restricted to 2; by the central limit theorem, the shape of this distribution gets closer and closer to a normal curve as the sample size grows.
In fact, the mean of the sampling distribution is:

(9.5 + 11.5 + 12.0*2 + 12.5 + 13.0 + 13.5 + 14.0 + 14.5*2 + 15.5 + 16.0 + 16.5 + 17.0 + 18.0)/15 = 14

A similar outcome would have been obtained by taking samples of size 5, for instance. In this case, enumerating all 6 possible samples, the mean of the sampling distribution would have been:

(13.4 + 14.8 + 15.0 + 13.8 + 14.0 + 13.0)/6 = 14
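
Both enumerations can be reproduced in a few lines of Python; the sketch below uses the pumpkin weights from the table above and checks that the mean of the sampling distribution equals 14 for both sample sizes.

from itertools import combinations
from collections import Counter

weights = [19, 14, 15, 9, 10, 17]  # pumpkins A..F

for size in (2, 5):
    # Enumerate every possible sample of the given size (without replacement).
    sample_means = [sum(combo) / size for combo in combinations(weights, size)]
    distribution = Counter(sample_means)
    grand_mean = sum(sample_means) / len(sample_means)
    print(f"size {size}: {len(sample_means)} samples, mean of sample means = {grand_mean}")
    for value, count in sorted(distribution.items()):
        print(f"  {value}: {count}/{len(sample_means)}")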
