Generating Random Variables

The first task in these applications is to generate random numbers. These random numbers could be uniformly distributed or have arbitrarily probability distributions. It is also of importance to understand how to test those random numbers.

It is not quite obvious how to generate random numbers on a deterministic computer. What is clear is that computers can repeat tasks and are consistent in doing so. Therefore the generation of uniformly distributed random numbers has to follow specific algorithms which can be repeated. Which means that, starting from a seed, a random number algorithm always generate the same random numbers sequence when run repetitively. As such random numbers generated on a deterministic computer are not really random, they are said to be pseudo-random.

A very common algorithm used to generate a sequence of pseudo-random numbers from a seed number is via a linear congruential method. The numbers are generated from the sequence described in the following equation


\begin{displaymath}
x_{i+1} = ( a x_i + c ) mod m
\end{displaymath}

where $x_{i+1}$ is generated from $x_i$ in a recursive manner. The numbers $a$,$c$ and $m$ are chosen carefully according to the computer architecture and often times are very large. The book Numerical Recipes devotes a chapter to this method of generating random numbers. It describes these methods as an art rather than a science.

This C functions ran2.c and ran3.c provided by Numerical Recipes follow such methods based on linear congruential method. In addition to following that the generation of a sequence through the linear congruential equation, the methods shuffle the orders of the numbers generated by using its own generated random numbers. An explanation is provided in Numerical Recipes.

The gcc compiler provides a function, rand() which generates integer pseudo-random numbers between $0$ and RAND_MAX. The man page ( man rand() ) says:

The versions of rand() and srand() in the Linux C Library use the same random number generator as random() and srandom(), so the lower-order bits should be as random as the higher-order bits. However, on older rand() implementations, and on current implementations on different systems, the lower-order bits are much less random than the higher-order bits. Do not use this function in applications intended to be portable when good randomness is needed .

For most applications the use of pseudo-random numbers is perfectly adequate. What is important is that these numbers be statistically random. These numbers should be uniformly distributed and show no correlations in any form whatsoever among themselves.

A very elementary test that those numbers must past to be acceptable is uniform distribution. This is best illustrated by using the random numbers generated to specify points in a plane. The following program random_points.c generates a sequence of x, y, values using the C language rand()function to generate random numbers scaled in the interval [0,1].

Exercise: Plot the sequence of points generated by the program by using GNUPLOT and watch for uniform coverage.

Exercise: Do the same for rand2 and rand3 from Numerical Recipes. Watch out for patterns in the coverage. There should be none of course.

Michel Vallieres 2014-04-01