The variance is the expectation of the squared deviation of a random variable from its mean. Informally, it measures how far a set of (random) numbers is spread out from its average value. The variance is the square of the standard deviation, the second central moment of a distribution, and the covariance of the random variable with itself; it is often denoted by $\mathrm{Var}(X)$, $\sigma^2$, or $s^2$.

One of the most widely known formulas for computing the (sample) variance is:

$$ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 $$

where $\bar{x}$ is the mean of the sample.

The definition given above can be converted into an algorithm that computes the variance and the standard deviation in two passes:

1. Compute the mean (O(n))
2. Compute the squared differences from the mean (O(n))
3. Output the variance (and its square root, the standard deviation)

Even though this algorithm seems to work properly, it may become too expensive on some input instances. Just consider a sampling procedure where the data arrive one at a time, as a stream: a two-pass algorithm forces us to keep every sample in memory until the mean is known.
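To make this concrete, here is a minimal sketch in Python (the post does not name a language, so that choice is an assumption). It implements the two-pass algorithm above and, since the passage hints at streaming input, also Welford's one-pass update, a standard single-pass alternative offered here as an illustration rather than as the post's own continuation:

```python
import math

def variance_two_pass(xs):
    """Two-pass sample variance: one pass for the mean, one for the
    squared deviations. O(n) per pass, but needs all data in memory."""
    n = len(xs)
    mean = sum(xs) / n                        # pass 1: mean
    ss = sum((x - mean) ** 2 for x in xs)     # pass 2: squared deviations
    return ss / (n - 1)                       # sample variance (n - 1 denominator)

def variance_one_pass(stream):
    """Welford's online update: a single pass with O(1) memory,
    suitable when samples arrive as a stream and cannot be stored."""
    n, mean, m2 = 0, 0.0, 0.0
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)              # uses the updated mean
    return m2 / (n - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(variance_two_pass(data))                # 4.5714...
print(variance_one_pass(iter(data)))          # same value, single pass
print(math.sqrt(variance_two_pass(data)))     # standard deviation
```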
Chebyshev's Inequality

In probability theory, Chebyshev's inequality guarantees that, for a wide class of probability distributions, no more than a certain fraction of the values can lie more than a certain distance away from the mean. In particular, the inequality states that no more than $1/k^2$ of the distribution's values can be more than $k$ standard deviations away from the mean. In other words, at least $1 - 1/k^2$ of the distribution's values lie within $k$ standard deviations of the mean.

Chebyshev's inequality can be easily derived from Markov's inequality, which gives an upper bound for the probability that a non-negative random variable is greater than (or equal to) some positive constant. Recall Markov's inequality:

$$ P(X \ge a) \le \frac{E(X)}{a} $$

where $a > 0$ and $X$ is a non-negative random variable.

Chebyshev's inequality follows by applying Markov's inequality to the random variable $(X - E(X))^2$ and the constant $a^2$:

$$ P\big(|X - E(X)| \ge a\big) = P\big((X - E(X))^2 \ge a^2\big) \le \frac{E\big[(X - E(X))^2\big]}{a^2} = \frac{\mathrm{Var}(X)}{a^2}. $$

Setting $a = k\sigma$ yields the familiar form $P(|X - \mu| \ge k\sigma) \le 1/k^2$.
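The bound is easy to verify empirically. Here is a minimal sketch in Python, where the exponential distribution and the sample size are illustrative assumptions; the inequality itself holds for any distribution with finite variance:

```python
import random

random.seed(42)

# Draw from an exponential distribution (mean 1, std 1); an
# illustrative choice, since Chebyshev's bound is distribution-free.
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]

mean = sum(samples) / n
var = sum((x - mean) ** 2 for x in samples) / (n - 1)
std = var ** 0.5

for k in (1.5, 2, 3):
    # Observed fraction of samples more than k standard deviations from the mean
    tail = sum(1 for x in samples if abs(x - mean) >= k * std) / n
    print(f"k={k}: observed tail {tail:.4f} <= bound {1 / k**2:.4f}")
```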
Law of Large Numbers and CLT

Intuitively, everyone can be convinced that the average of many measurements of the same unknown quantity tends to give a better estimate than a single measurement. The law of large numbers (LLN) and the central limit theorem (CLT) formalise this general idea through mathematics and random variables.

Suppose $X_1, X_2, \ldots, X_n$ are independent random variables with the same underlying distribution. In this case, we say that the $X_i$ are independent and identically distributed (i.i.d.). In particular, the $X_i$ all have the same mean $\mu$ and standard deviation $\sigma$. The average of the i.i.d. variables is defined as:

$$ \bar{X}_n = \frac{X_1 + X_2 + \cdots + X_n}{n} $$

The law of large numbers states that, as $n$ grows, this average $\bar{X}_n$ converges to the true mean $\mu$, which is exactly the intuition above made precise. The central limit theorem states that, as the sample size $n$ becomes larger, the sampling distribution of the means of successive random samples taken from a population becomes approximately normal, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, irrespective of the shape of the population's distribution.
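A small simulation makes both statements tangible. Here is a minimal sketch in Python, where the uniform(0, 1) population and the sample sizes are illustrative assumptions; the standard deviation of the sample means should shrink like $\sigma/\sqrt{n}$:

```python
import random
import statistics

random.seed(0)

# Illustrative population: uniform(0, 1), whose mean is 0.5 and whose
# standard deviation is 1/sqrt(12), roughly 0.2887.
mu, sigma = 0.5, (1 / 12) ** 0.5

def sample_mean(n):
    """Mean of one random sample of size n from the population."""
    return sum(random.random() for _ in range(n)) / n

for n in (1, 10, 100):
    means = [sample_mean(n) for _ in range(20_000)]
    print(f"n={n:4d}: mean of sample means = {statistics.mean(means):.4f} "
          f"(mu = {mu}), std = {statistics.stdev(means):.4f} "
          f"(sigma/sqrt(n) = {sigma / n**0.5:.4f})")
```

As n grows, the average of the sample means stays near mu while their spread narrows, matching the LLN and the sigma/sqrt(n) scaling predicted by the CLT.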