
Research 6 - Derivation of Chebyshev's inequality and its application to prove the (weak) LLN

Chebyshev's Inequality In probability theory, Chebyshev's inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can be more distant than a certain value from the mean. In particular, the inequality states that no more than 1/k² of the distribution's values can be more than k standard deviations away from the mean. In other words, this means that at least 1 - 1/k² of the distribution's values lie within k standard deviations of the mean. Chebyshev's inequality can be easily derived from Markov's inequality, which gives an upper bound on the probability that a non-negative random variable is greater than or equal to some positive constant. Recall Markov's inequality, P(X ≥ a) ≤ E(X)/a, where a > 0 and X is a non-negative random variable. Chebyshev's inequality follows by considering the random variable (X - E(X))² and the constant a²...
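The bound can be checked empirically; here is a minimal Python sketch (the standard normal sample and the choice of k values are illustrative assumptions, not part of the original post):

```python
import random

random.seed(42)

# Draw a large sample from a standard normal distribution (mean 0, std 1).
sample = [random.gauss(0, 1) for _ in range(100_000)]
n = len(sample)
mean = sum(sample) / n
std = (sum((x - mean) ** 2 for x in sample) / n) ** 0.5

for k in (1.5, 2, 3):
    # Empirical fraction of values more than k standard deviations from the mean.
    far = sum(1 for x in sample if abs(x - mean) > k * std) / n
    bound = 1 / k**2  # Chebyshev's upper bound on that fraction
    assert far <= bound
    print(f"k={k}: observed {far:.4f} <= bound {bound:.4f}")
```

For a normal distribution the observed fractions are far below the bound, which is expected: Chebyshev's inequality is deliberately distribution-free, so it trades tightness for generality.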

Insight 5 - Floating-Point Representation and Its Computational Issues

Floating-point representation In computing, floating-point representation is an approximation that allows modern computers to trade off range against precision when handling real numbers. Representing numbers as integers in a fixed number of bits has some notable limitations: it cannot handle numbers that have a fractional part, and it is not suitable for very large numbers that do not fit into (e.g.) 32 bits. In general, a number is represented by an approximation to a fixed number of significant digits (the significand) and scaled using an exponent in some fixed base; the base for the scaling is normally two, ten, or sixteen. A number that can be represented exactly has the form significand × base^exponent, where significand, base, and exponent are all integers and the base is greater than or equal to two. As an example, consider 12345 × 10⁻⁴ = 1.2345, or 11111 × 10⁻¹ = 1111.1. The term floating point refers to the fact that a number's radix point (decimal point or...
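A short Python sketch of the issues described above, using only standard-library calls:

```python
import math

# Classic artifact of binary floating point: 0.1 has no finite base-2
# expansion, so decimal-looking arithmetic accumulates tiny rounding errors.
print(0.1 + 0.2)          # not exactly 0.3
print(0.1 + 0.2 == 0.3)   # False

# Every finite float is exactly significand * 2**exponent for some integers;
# math.frexp exposes that decomposition (significand scaled into [0.5, 1)).
frac, exp = math.frexp(0.75)   # 0.75 = 0.75 * 2**0
print(frac, exp)

# as_integer_ratio shows the exact rational value actually stored for 0.1,
# which is close to, but not equal to, one tenth.
print((0.1).as_integer_ratio())
```

The decomposition returned by `frexp` is the binary (base-2) instance of the significand-and-exponent form given above.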

Insight 4 - Kahan and Neumaier Summation

Algorithms for 'online data stream' arithmetic mean A data stream can be described as an ordered sequence of points that must be accessed sequentially and can be read only once or a small number of times. Efficient processing over massive data sets has taken on increased importance in the last few decades due to the growth of data traffic and of the storage required by ever more applications. In particular, monitoring huge and rapidly changing streams of data has emerged as an important data-management problem. Some applications are the analysis of network traffic, transaction logs, telephone call records, and many more. Since data streams have potentially unbounded size, when the amount of computation memory is limited it may be impossible to produce an exact result. The challenge here is to produce high-quality approximate answers, that is, answers as close to the correct ones as possible. Let's take the arithmetic mean as an example. Considering a huge data stream and by using the...
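As a sketch of the compensated summation the post's title refers to, here is Kahan's algorithm in Python; the example input (one large value followed by many tiny ones) is an illustrative assumption:

```python
def kahan_sum(stream):
    """Sum a stream of floats with Kahan compensated summation."""
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in stream:
        y = x - c            # subtract the compensation accumulated so far
        t = total + y        # low-order bits of y may be lost here...
        c = (t - total) - y  # ...and are recovered into c for the next step
        total = t
    return total

# One large value followed by many addends too small to survive naive addition.
values = [1.0] + [1e-16] * 1_000_000

naive = 0.0
for x in values:
    naive += x           # each 1e-16 rounds away against the running 1.0

print(kahan_sum(values))  # close to 1.0 + 1e-10: the tiny addends are kept
print(naive)              # 1.0: the tiny addends are all lost
```

Either total, divided by the number of elements seen so far, gives a single-pass streaming mean; the compensated version keeps the error bounded independently of the stream length.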

Research 4 - Boole's Inequality

Boole's inequality, which is also known as the union bound, states that for any finite or countable set of events, the probability that at least one of the events happens is no greater than the sum of the probabilities of the individual events. More formally, given a set of events A1, A2, ..., An, the following inequality holds: P(A1 ∪ A2 ∪ ... ∪ An) ≤ P(A1) + P(A2) + ... + P(An). The inequality may be proved in several ways; here a proof by induction is provided. Proof For the case n = 1, we trivially have P(A1) ≤ P(A1). Now assume the statement holds for n events; we have to conclude it for n + 1. Since P(A ∪ B) = P(A) + P(B) - P(A ∩ B) and by the associativity of the union operator, we have P(A1 ∪ ... ∪ An+1) = P(A1 ∪ ... ∪ An) + P(An+1) - P((A1 ∪ ... ∪ An) ∩ An+1). By the first axiom of probability, the last term of the equality is greater than or equal to 0, and then P(A1 ∪ ... ∪ An+1) ≤ P(A1 ∪ ... ∪ An) + P(An+1), which together with the induction hypothesis completes the proof. The discussed inequality may be generalised to find both the upper and the lo...
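The bound is easy to observe numerically; a minimal Monte Carlo sketch in Python (the three overlapping interval events are an invented example):

```python
import random

random.seed(0)
trials = 100_000

# Three overlapping events on a uniform draw from [0, 1):
events = [(0.0, 0.4), (0.3, 0.6), (0.5, 0.9)]

union_hits = 0
individual = [0, 0, 0]
for _ in range(trials):
    u = random.random()
    hit = False
    for i, (lo, hi) in enumerate(events):
        if lo <= u < hi:
            individual[i] += 1
            hit = True
    union_hits += hit

p_union = union_hits / trials
bound = sum(individual) / trials  # sum of the individual event frequencies
assert p_union <= bound           # the union bound
print(f"P(union) ~ {p_union:.3f} <= bound {bound:.3f}")
```

The assertion holds by construction: every draw that lands in the union is counted at least once among the individual events, so the sum of frequencies can never fall below the union's frequency. The gap between the two numbers is exactly the double counting of overlaps that the union bound ignores.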

Research 3 - Means and Markov Inequality

There are several kinds of mean, each with its own applications and properties. In probability and statistics, the most used kind of mean is the arithmetic mean, which is defined as the central value of a discrete set of numbers and can be computed by adding all the values and then dividing the result by the total number of values. For instance, the average age is equal to the sum of the ages of every individual in a particular sample, divided by the total number of individuals. Naturally, the sample mean will differ from the population mean; however, by the law of large numbers, with a larger sample size the sample mean will be closer to the population mean. The following list summarises the other kinds of mean commonly used in statistics and mathematics. Weighted arithmetic mean: similar to an ordinary arithmetic mean except that, instead of each of the data points contributing equally to the final average, some data points contribute more than others. Geometric m...
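The means mentioned so far can be sketched in a few lines of Python (the sample values and weights below are arbitrary illustrative data):

```python
import math

values = [2.0, 8.0, 4.0]

# Arithmetic mean: sum of the values divided by their count.
arithmetic = sum(values) / len(values)

# Weighted arithmetic mean: each value contributes in proportion to its weight.
weights = [1.0, 2.0, 1.0]
weighted = sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Geometric mean: n-th root of the product (defined for positive values).
geometric = math.prod(values) ** (1 / len(values))

print(arithmetic, weighted, geometric)
```

With equal weights the weighted mean reduces to the ordinary arithmetic mean; the geometric mean is always at most the arithmetic mean for positive data (AM-GM inequality).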

Insight 3 - Bayes' Theorem

Before giving the statement of Bayes' theorem it is important to define conditional probability; once that is done, Bayes' theorem is easy to derive. Conditional Probability Conditional probability measures the likelihood of an event given that another event has occurred. For instance, assume that A is the event of interest and that B is a different event which has already occurred. The probability of event A occurring, given the occurrence of B, written P(A | B), can be computed as detailed by the following equality:

P(A | B) = P(A ∩ B) / P(B),    P(B) > 0        (1)

It is defined as the quotient of the probability of the joint event of A and B and the probability of B. Now consider P(B | A), which equals P(B ∩ A) / P(A). Since P(B ∩ A) = P(A ∩ B), the fo...
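Equation (1) and the derivation it leads to can be checked on a small discrete example in Python (the fair-die events A and B are an invented illustration):

```python
# One fair six-sided die. A = "roll is even", B = "roll is greater than 3".
outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}

def p(event):
    """Classical probability: favourable outcomes over total outcomes."""
    return len(event) / len(outcomes)

# Equation (1): P(A | B) = P(A ∩ B) / P(B)
p_a_given_b = p(A & B) / p(B)

# Bayes' theorem recovers P(B | A) from P(A | B):
# P(B | A) = P(A | B) * P(B) / P(A)
p_b_given_a = p_a_given_b * p(B) / p(A)

print(p_a_given_b)  # 2/3: of {4, 5, 6}, the even outcomes are {4, 6}
print(p_b_given_a)  # 2/3: of {2, 4, 6}, those greater than 3 are {4, 6}
```

The Bayes line is exactly the identity the excerpt is heading towards: equating P(A ∩ B) from equation (1) with P(B ∩ A) from the symmetric definition and solving for P(B | A).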

Insight 2 - Kolmogorov Axioms and Relative Frequency

Probability theory is a branch of mathematics concerned with the analysis of phenomena characterised by randomness or uncertainty, in order to model and predict the behaviour of the systems under study. Although there are several different probability interpretations, such as the propensity interpretation or the subjective one, the most widely used and established formalisation of probability theory is due to Andrey N. Kolmogorov, a Russian mathematician who combined previous studies in the field and presented his axiom system for probability theory in 1933. This blog post is intended to introduce the reader to the main axioms and rules of Kolmogorov's theory. Let S denote a sample space with a probability measure P defined over it, such that the probability of any event E ⊂ S is given by P(E). Then the probability measure obeys the following axioms: AXIOM 1 - The probability of an event is a non-negative real number: P(E) ∈ ℝ, P(E) ≥ 0 ...
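The axioms, and the relative-frequency view mentioned in the title, can be illustrated with a fair die in Python (the die and the particular events are assumptions for the sketch, not taken from the post):

```python
import random

random.seed(1)

# Sample space of a fair die; events are subsets of S.
S = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Classical probability measure on equally likely outcomes: |E| / |S|."""
    return len(event) / len(S)

E = {1, 2}
F = {5, 6}

# Axiom 1: non-negativity.
assert prob(E) >= 0
# Axiom 2: normalisation - the certain event has probability 1.
assert prob(S) == 1
# Axiom 3 (additivity): for disjoint events, P(E ∪ F) = P(E) + P(F).
assert prob(E | F) == prob(E) + prob(F)

# Relative frequency of E over many rolls approaches P(E) = 1/3.
rolls = 100_000
freq = sum(1 for _ in range(rolls) if random.randint(1, 6) in E) / rolls
print(freq)
```

The final loop is the relative-frequency interpretation in miniature: the empirical frequency of an event stabilises around the probability the axioms assign to it as the number of trials grows.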