Introduction – What is Python?
Python is an interpreted, script-based programming language with a very clean syntax and a large number of packages that extend its functionality. It is widely used in industry and is one of the leading programming languages in fields such as data analysis and scientific computing.
Packages for Probability and Statistics
Two packages matter most here: NumPy and SciPy. NumPy is a package for scientific computing. It is not strictly for probability and statistics; rather, it provides a wide array of mathematical functionality, such as support for matrices, linear algebra, Fourier transforms, and random number generation. NumPy is part of the SciPy Stack, so it is better to download the whole stack with all included software, since you might need the other functionality as well. Many of the probability functions reside in the “scipy.stats” package. With it we can draw samples from a large number of different distributions, which makes it a very useful library.
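As a rough illustration of how the two packages relate, the sketch below draws standard normal samples once with NumPy's random generator and once through scipy.stats. The seed, the sample size, and the variable names are assumptions chosen for this example only.

```python
import numpy as np
from scipy import stats

# Seed picked arbitrarily so that repeated runs give the same numbers
# (an assumption for this sketch, not a requirement).
rng = np.random.default_rng(seed=0)

# NumPy: draw 5 samples from a standard normal distribution.
x_numpy = rng.standard_normal(5)

# SciPy: the same distribution through scipy.stats; the NumPy generator
# is passed in as the source of randomness.
x_scipy = stats.norm.rvs(size=5, random_state=rng)

print(x_numpy.shape, x_scipy.shape)
```

Both calls return NumPy arrays, so results from either package can be mixed freely in later calculations.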
Example – Generate Gaussian Random Variable
Let’s start simple and generate a Gaussian random vector first. This is a vector in which each element is a realization of a Gaussian random variable. To generate 100 values, we can write:
from scipy import stats
import matplotlib.pyplot as plt

# Draw 100 independent samples from a standard normal distribution
x = stats.norm.rvs(size=100)

# Plot each sample against its index in the vector
plt.plot(x)
plt.xlabel('Index')
plt.ylabel('Value')
plt.title('Gaussian random vector')
plt.show()
Let’s go through the code step by step. The variable x is filled with realizations of a Gaussian random variable by the norm.rvs command. The result is then plotted by the plt.plot command. This generates the following output
If we look in the scipy.stats documentation, we find a large number of random number generators. This is useful for development and makes it possible to write quite sophisticated programs. In addition, there are packages for advanced topics such as machine learning that can be used together with the SciPy package.
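To give a feel for this, here is a minimal sketch that draws samples from three other scipy.stats distributions and compares one sample mean with its theoretical value. The choice of distributions, the parameter values, the seed, and the sample size are all assumptions made for illustration.

```python
from scipy import stats

num_samples = 10_000  # arbitrary sample size for this sketch
seed = 1              # arbitrary seed so the run is repeatable

# Exponential distribution with scale 2.0 (theoretical mean 2.0)
exp_samples = stats.expon.rvs(scale=2.0, size=num_samples, random_state=seed)

# Uniform distribution on the interval [0, 1]
uni_samples = stats.uniform.rvs(loc=0.0, scale=1.0, size=num_samples, random_state=seed)

# Binomial distribution with 10 trials and success probability 0.3
bin_samples = stats.binom.rvs(n=10, p=0.3, size=num_samples, random_state=seed)

# The sample mean should land close to the theoretical mean
print(exp_samples.mean(), stats.expon.mean(scale=2.0))
```

Each distribution object exposes the same rvs method, so switching distributions only changes the name and its parameters.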
Summary
We learned what is required to generate random numbers in Python, and how to generate a Gaussian random vector, which is a list of independent realizations of a Gaussian random variable. We also learned how to plot using matplotlib. To plot other distributions, we simply need to change the function used to fill the vector x. Compared with Matlab, this example looks very similar. Hence, for simple applications it is probably better to use Python, since there is no licence fee for it.