About Variance

What is variance?

Variance is a concept used throughout probability and statistics, and a good understanding of it is essential for working in the field. Let’s start with an intuitive description. In simple terms, the variance describes how much something varies around its mean. The mean is the value that a process has on average, or is expected to have. Imagine that the process runs for an infinite amount of time: the mean would then be the average of all the values that the process has taken on. For a random variable, the mean is simply the expected value of the variable, which can be seen as the average value you would get if you drew the random variable an infinite number of times.
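The idea of "drawing an infinite number of times" can be illustrated with a small simulation. The following is only a sketch under assumptions that are not part of the text above: the random variable is a fair six-sided die (expected value 3.5), the helper name `mean_of` is made up for this example, and the seed is arbitrary.

```python
import random


def mean_of(draws):
    """Return the average of a list of drawn values."""
    return sum(draws) / len(draws)


# Fixed seed, chosen arbitrarily so that the run is reproducible.
rng = random.Random(0)

# The more draws we take, the closer the average gets to the
# expected value of a fair die, which is 3.5.
for n in (10, 1_000, 100_000):
    draws = [rng.randint(1, 6) for _ in range(n)]
    print(n, mean_of(draws))
```

With only ten draws the average can be far from 3.5. With a hundred thousand draws it is typically very close.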

Variance can be defined in more than one way. There is a mathematical definition, and there are formulas for calculating the variance from measurement data. The measurement-data calculation takes into account that we have a finite (limited) number of measurement points. The most common mathematical definition is

$latex Var(X) = E[(X-\mu)^2] $

Where X is a random variable, E[] indicates the so-called expected value of the expression inside the brackets, and $latex \mu$ is the mean of X. If $latex \mu$ equals zero, then the expression simplifies to

$latex Var(X) = E[X^2] $

Which is the expected value of the squared random variable. For a continuous random variable, the expected value is given by

$latex E[X] = \int_{-\infty}^{\infty} x P_x (x) dx $

Where $latex P_x (x) $ is the probability density function (pdf) of X. Both of these definitions might look a bit complicated at first, but their meaning is quite intuitive. The variance is the expected squared deviation of the variable from its mean, and the expected value is the value that the random variable takes on average.
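Both definitions can be checked numerically by evaluating the integrals for a known pdf. The sketch below assumes a Gaussian pdf with mean 2 and standard deviation 3 (illustrative values only), approximates the integrals with a simple midpoint rule, and truncates the infinite range to ten standard deviations on each side. The names `gaussian_pdf` and `integrate` are helpers invented for this example.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Probability density function of a Gaussian with mean mu and std sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width


# Illustrative parameters (assumptions, not taken from the text).
mu_true, sigma_true = 2.0, 3.0

# The integrals run from -infinity to infinity; the pdf is negligible
# beyond ten standard deviations, so we truncate the range there.
lo, hi = mu_true - 10 * sigma_true, mu_true + 10 * sigma_true

# E[X]: integral of x * pdf(x)
mean = integrate(lambda x: x * gaussian_pdf(x, mu_true, sigma_true), lo, hi)

# Var(X) = E[(X - mu)^2]: integral of (x - mean)^2 * pdf(x)
var = integrate(lambda x: (x - mean) ** 2 * gaussian_pdf(x, mu_true, sigma_true), lo, hi)

print(mean, var)  # close to 2.0 and 9.0 (sigma squared)
```

The variance comes out as the square of the standard deviation, as expected for a Gaussian.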

Variance for measured data

When we calculate the variance for measured data, we have to account for having only a limited number of measurements (samples). This is called the sample variance, since we are using samples to calculate it. The formula can be stated as

$latex \mathrm{variance} = \frac{ \sum_{n=1}^{N} (x_n-\mu)^2 }{N} $

Where $latex x_n $ is the n-th measured value (sample) of what we are measuring, and N is the number of measurements. $latex \mu $ is the mean of the measurements, which has to be calculated first by

$latex \mu = \frac{ \sum_{n=1}^{N} x_n }{N} $

By using these two formulas, we obtain the sample variance. If what you are measuring is approximately Gaussian (normally distributed), then you can characterize it with these two values: the mean and the variance.
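The two formulas above translate almost directly into code. The following is a minimal sketch: the measurement values are made up, and `sample_mean` and `sample_variance` are hypothetical helper names. Python's standard library function `statistics.pvariance` also divides by N, so it should agree with the formula as stated here, while `statistics.variance` divides by N-1 instead (Bessel's correction) and gives a slightly larger value.

```python
import statistics


def sample_mean(xs):
    """Mean of the measurements: the sum of all x_n divided by N."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Variance of the measurements: the mean squared deviation from the mean."""
    mu = sample_mean(xs)  # the mean has to be calculated first
    return sum((x - mu) ** 2 for x in xs) / len(xs)


# Made-up measurement data, for illustration only.
measurements = [2.1, 1.9, 2.4, 2.0, 1.6]

mu = sample_mean(measurements)
var = sample_variance(measurements)
print(mu, var)  # approximately 2.0 and 0.068

# Cross-check against the standard library (divides by N, like the formula above).
print(statistics.pvariance(measurements))

# For comparison: the N-1 version gives a slightly larger value.
print(statistics.variance(measurements))
```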