
Research 7 - Central Limit Theorem, LLN, and the most common probability distributions

Law of Large Numbers and CLT Intuitively, everyone can be convinced that the average of many measurements of the same unknown quantity tends to give a better estimate than a single measurement. The law of large numbers (LLN) and the central limit theorem (CLT) formalise this general idea through mathematics and random variables. Suppose X_1, X_2, ..., X_n are independent random variables with the same underlying distribution. In this case, we say that the X_i are independent and identically distributed (or, i.i.d.). In particular, the X_i all have the same mean μ and standard deviation σ. The average of the i.i.d. variables is defined as X̄_n = (X_1 + X_2 + ... + X_n) / n. The central limit theorem states that when many successive random samples are taken from a population, the sampling distribution of the means of those samples becomes approximately normal, with mean μ and standard deviation σ/√n, as the sample size becomes larger, irrespective of the sh...
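The statement above can be checked numerically. The following sketch (an illustrative setup of my own: Exponential(1) draws, where μ = σ = 1) draws many samples of size n and verifies that the sample means cluster around μ with spread close to σ/√n.

```python
# Simulation sketch of the CLT/LLN: repeated sample means of i.i.d. draws.
import random
import statistics

random.seed(42)

mu, sigma = 1.0, 1.0       # Exponential(1) has mean 1 and standard deviation 1
n = 100                    # size of each sample
num_samples = 2000         # number of repeated samples

# One sample mean per repetition: X_bar = (X_1 + ... + X_n) / n
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

print(statistics.fmean(sample_means))   # close to mu = 1
print(statistics.stdev(sample_means))   # close to sigma / sqrt(n) = 0.1
```

A histogram of `sample_means` would look approximately Gaussian even though the underlying exponential distribution is heavily skewed, which is exactly the CLT's point.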

Research 6 - Derivation of Chebyshev's inequality and its application to prove the (weak) LLN

Chebyshev's Inequality In probability theory, Chebyshev's inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more than a certain distance from the mean. In particular, the inequality states that no more than 1/k^2 of the distribution's values can be more than k standard deviations away from the mean. In other words, this means that at least (1 - 1/k^2) of the distribution's values are within k standard deviations of the mean. Chebyshev's inequality can be easily derived from Markov's inequality, which gives an upper bound for the probability that a non-negative random variable is greater than (or equal to) some positive constant. Recall Markov's inequality, P(X ≥ a) ≤ E(X)/a, where a > 0 and X is a non-negative random variable. The Chebyshev inequality follows by considering the random variable (X - E(X))^2 and the constant a^2...
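The 1/k^2 bound can be verified empirically. Below is a quick check on a toy distribution of my own choosing (Exponential(1), where mean and standard deviation are both 1): the observed fraction of draws farther than k standard deviations from the mean never exceeds 1/k^2.

```python
# Empirical check of Chebyshev's bound: P(|X - mu| >= k*sigma) <= 1/k^2.
import random

random.seed(0)
mu = sigma = 1.0  # Exponential(1): mean and std dev both equal 1
xs = [random.expovariate(1.0) for _ in range(100_000)]

for k in (1.5, 2, 3):
    # Fraction of draws at least k standard deviations from the mean
    frac = sum(abs(x - mu) >= k * sigma for x in xs) / len(xs)
    print(f"k={k}: observed {frac:.4f} <= bound {1 / k**2:.4f}")
```

Note how loose the bound is for this particular distribution; Chebyshev trades tightness for generality, since it holds for any distribution with finite variance.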

Insight 5 - Floating-Point Representation and Its Computational Issues

Floating-point representation In computing, floating-point representation is an approximation which allows modern computers to trade off range against precision when handling real numbers. Representing numbers as integers in a fixed number of bits has some notable limitations: it cannot handle numbers that have a fractional part, and it is not suitable for very large numbers that do not fit into (e.g.) 32 bits. In general, a number is represented by an approximation to a fixed number of significant digits (the significand) and scaled using an exponent for some fixed base; the base for the scaling is normally two, ten, or sixteen. A number that can be represented exactly has the form significand × base^exponent, where significand, base, and exponent are all integers and the base is greater than or equal to two. As an example, consider 12345 × 10^-4 = 1.2345, or 11111 × 10^-1 = 1111.1. The term floating point refers to the fact that a number's radix point (decimal point or...
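A minimal demonstration of the precision issue (plain Python, no assumptions beyond IEEE 754 doubles): 0.1 has no exact binary representation, so even a trivial sum picks up rounding error.

```python
# 0.1 and 0.2 are stored as the nearest binary fractions, so their sum
# is not exactly 0.3.
a = 0.1 + 0.2
print(a == 0.3)        # False: a is actually 0.30000000000000004
print(f"{a:.17f}")     # show the stored value to 17 decimal places

import math
# math.isclose is the idiomatic way to compare floats for "equality"
print(math.isclose(a, 0.3))   # True
```

This is why equality comparisons on floats are a classic source of bugs, and why tolerant comparisons (or decimal/rational types, when exactness matters) are preferred.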

Insight 4 - Kahan and Neumaier Summation

Algorithms for 'online data stream' arithmetic mean Data streams can be described as an ordered sequence of points that must be accessed sequentially and can be read only once or a small number of times. Efficient processing of massive data sets has taken on increased importance in the last few decades due to the growth of the data traffic and storage required by ever more applications. In particular, monitoring huge and rapidly changing streams of data has emerged as an important data-management problem; applications include the analysis of network traffic, transaction logs, telephone call records and many more. Since data streams have potentially unbounded size, when the amount of computation memory is limited it may be impossible to produce an exact result. Here the challenge is to produce high-quality approximate answers, that is, answers as accurate as possible. Let's take the arithmetic mean as an example. Considering a huge data stream and by using the...
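A sketch of the classic Kahan compensated-summation scheme applied to a running mean over a stream (the function name and test data are my own; the algorithm is the standard one): a second accumulator recovers the low-order bits lost when a small term is added to a large running total.

```python
# Kahan compensated summation for the arithmetic mean of a data stream.
def kahan_mean(stream):
    total = 0.0
    c = 0.0          # running compensation for lost low-order bits
    count = 0
    for x in stream:
        count += 1
        y = x - c            # apply the compensation to the incoming term
        t = total + y        # total is big, y is small: low bits of y are lost
        c = (t - total) - y  # (t - total) recovers what was actually added;
                             # subtracting y isolates the (negated) lost part
        total = t
    return total / count

# Adding a million tiny terms to one huge term is exactly the situation
# where naive summation drifts; Kahan summation stays accurate.
data = [1e8] + [0.1] * 1_000_000
print(kahan_mean(data))
```

Neumaier's variant (mentioned in the title) extends this to also stay accurate when an incoming term is larger in magnitude than the running total, by swapping the roles of `total` and `y` in the compensation step.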

Research 4 - Boole's Inequality

Boole's inequality, which is also known as the union bound, states that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. More formally, given a set of events A_1, A_2, A_3, ..., A_n, the inequality P(A_1 ∪ A_2 ∪ ... ∪ A_n) ≤ P(A_1) + P(A_2) + ... + P(A_n) holds. The inequality may be proved in several ways; here, a proof by induction is provided. Proof For the case n = 1, we trivially have P(A_1) ≤ P(A_1). For the inductive step, we assume the statement holds for n - 1 events and conclude that it holds for n. Since P(A ∪ B) = P(A) + P(B) - P(A ∩ B) and by the associativity of the union operator, we have P(A_1 ∪ ... ∪ A_n) = P(A_1 ∪ ... ∪ A_{n-1}) + P(A_n) - P((A_1 ∪ ... ∪ A_{n-1}) ∩ A_n). By the first axiom of probability, the last term of the equality is greater than (or equal to) 0, and then P(A_1 ∪ ... ∪ A_n) ≤ P(A_1 ∪ ... ∪ A_{n-1}) + P(A_n), which, combined with the inductive hypothesis, completes the proof. The discussed inequality may be generalised to find both the upper and the lo...
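The bound can be checked on a toy experiment (my own setup): with four independent dice, let A_i be the event "the i-th die shows a six". The probability that at least one A_i happens is at most the sum of the individual probabilities, 4 × (1/6).

```python
# Empirical check of the union bound on four independent dice rolls.
import random

random.seed(1)
trials = 100_000
union_count = 0

for _ in range(trials):
    dice = [random.randint(1, 6) for _ in range(4)]
    if any(d == 6 for d in dice):     # at least one event A_i happens
        union_count += 1

p_union = union_count / trials
bound = 4 * (1 / 6)                   # sum of the individual probabilities
print(p_union, "<=", bound)
```

Here the exact union probability is 1 - (5/6)^4 ≈ 0.518, comfortably below the bound of 2/3; the bound would be tight only if the events were disjoint.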

Research 3 - Means and Markov Inequality

There are several kinds of mean, each with its own applications and properties. In probability and statistics, the most used kind of mean is the arithmetic mean, which is defined as the central value of a discrete set of numbers and can be computed by adding all the values and then dividing the result by the total number of values. For instance, the average age is equal to the sum of the ages of every individual in a particular sample, divided by the total number of individuals. Naturally, the sample mean will differ from the population mean but, by the law of large numbers, with a larger sample size the sample mean will be closer to the population mean. The following list summarises the other kinds of means commonly used in statistics and mathematics. The weighted arithmetic mean is similar to an ordinary arithmetic mean except that, instead of each of the data points contributing equally to the final average, some data points contribute more than others. The geometric m...
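A small side-by-side of the means mentioned above, on illustrative data of my own (Python's `statistics` module provides the geometric and harmonic means directly; the weighted mean is computed by hand):

```python
# Comparing the common kinds of mean on a toy sample of ages.
import statistics

ages = [23, 29, 31, 35, 47]

arithmetic = statistics.fmean(ages)                 # sum / count

weights = [5, 1, 1, 1, 1]                           # first value counts 5x
weighted = sum(w * x for w, x in zip(weights, ages)) / sum(weights)

geometric = statistics.geometric_mean(ages)
harmonic = statistics.harmonic_mean(ages)

print(arithmetic, weighted, geometric, harmonic)
```

For any set of distinct positive values, the classic ordering harmonic < geometric < arithmetic holds, which the printed values confirm.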

Insight 3 - Bayes' Theorem

Before stating Bayes' theorem, it is important to define conditional probability; once that is done, Bayes' theorem is easy to derive. Conditional Probability The conditional probability measures the likelihood of an event given that another event has occurred. For instance, assume that A is the event of interest and that B is a different event which has already occurred. The probability of the event A occurring, given the occurrence of B, written P(A | B), can be computed as detailed by the following equality. P(A | B) = P(A ∩ B) / P(B),    P(B) > 0    (1) It is defined as the quotient of the probability of the joint events A and B and the probability of B. Now consider P(B | A), which equals P(B ∩ A) / P(A). Since P(B ∩ A) = P(A ∩ B), then the fo...
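Equality (1) can be worked through exactly on a small sample space (a toy two-dice example of my own): let A = "the sum is 8" and B = "the first die shows 6", and compute P(A | B) by counting outcomes.

```python
# Worked instance of P(A | B) = P(A ∩ B) / P(B) on two fair dice.
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))   # all 36 equally likely pairs

A = {o for o in outcomes if sum(o) == 8}          # sum of the dice is 8
B = {o for o in outcomes if o[0] == 6}            # first die shows 6

p_B = Fraction(len(B), len(outcomes))
p_A_and_B = Fraction(len(A & B), len(outcomes))

p_A_given_B = p_A_and_B / p_B                     # equality (1)
print(p_A_given_B)    # 1/6: only (6, 2) gives a sum of 8 when the first die is 6
```

Using `Fraction` keeps the arithmetic exact, so the result is the textbook value with no floating-point noise.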