1.7 Fundamentals of Simulation

The first task of simulation is to produce pseudo-random numbers from a specified distribution. We first investigate how to generate pseudo-random numbers from U[0,1], then discuss how to extend this.

The Linear Congruential Generator

This is a way of generating a sequence of "random" numbers between 0 and 1, but using an algorithmic procedure rather than trying to produce truly random numbers.

The basic equation requires parameters m (the base), a (the multiplier), c (the additive constant). The formula used is

xn º  axn-1 + c (mod m)
un  =  xn / m

Here {xi} is a sequence of integers in the range from 0 to m - 1, {ui} a sequence of real numbers between 0 and 1. The sequence repeats after a finite number of steps (the period cannot be greater than m and may be less). It is important to choose m to be large.

The value of a is also important: some values of a work well, others less well, and the choice depends on m.

The value of c is not important. It is normally acceptable to choose c = 0.

Exercise: Use c = 0, m = 100, a = 7 or 13 and starting value assigned in the lecture to generate 10 pseudo-random numbers.

Tests for randomness

Any generator needs to be tested. Things to test might include

The first two are dealt with by c2 tests, the third by methods from Time Series.

1.8 Simulating continuous random variables

A: Inverse distribution function

A good general method is the inverse distribution function method.

If we want to simulate from a distribution with cumulative distribution function F, we can simulate U from Uniform[0,1] and return X = F-1(U).

Check this works: P(X £ x) = P(F-1(U) £ x) = P(U £ F(x)) = F(x).

Example: exponential distribution.

B: Distribution-specific techniques

For some distributions there are specific transformations. For example, the polar method converts a pair of independent U[0,1] random variables into a pair of indpendent N(0,1) variables:

X = sin(2pU)Ö(-2 log V),   Y = cos(2pU)Ö(-2 log V).

C: Combinations of variables

When a r.v. can be represented as a transformation of another, or as the sum of number of copies of another, this can be used to aid its simulation.

Example 1: if Y has density f(y) = lyl-1 for 0 < x < 1, then X = - log Y is exponentially distributed.

Example 2: to simulate from chi-squared(n), simulate a set of n standard Normal variables and sum their squares.

D: Acceptance-Rejection Method

This is used for simulating from difficult densities when all else has failed.

Suppose f is the density to simulate from. First find another density h such that

  1. we already know how to simulate observations from h
  2. f(x) £ C h(x) for all x for some constant C < ¥.

Then the technique is:

  1. Simulate a value y from the density h;
  2. Generate a U[0,1] variable u;
  3. If u < f(y)/Ch(y) then accept y, otherwise reject it and go back to (a).

The value which is accepted has density f.

On average we need to simulate C pairs of values (y, u) in order to get one accepted, so it is best to make C as small as possible.

Example: to simulate from the density f(x) = cxe-x (0<x<1), where c-1 = 1-e-1. The distribution function F is not readily invertible, but f(x) £ ce-1 for all 0<x<1, so we can take h as the uniform density, with C = ce-1.

Interpretation

The inverse distribution function method can be seen as picking a point (x, y) at random from under the graph of y = f(x) and returning its x-coordinate.
The acceptance-rejection method picks a point (x, y) at random from under the graph of y = Ch(x) and then either returns its x-coordinate if it lies under the graph of y = f(x), or rejects it if not.

1.9 Simulating discrete random variables

A: Inverse distribution function

Although in general one has to be careful when inverting discontinuous functions such as the distribution function of a discrete random variable, this method works fine.

Example: simulate an observation from a distribution with probability function p(0) = 0.4, p(1) = 0.25, p(2) = 0.2, p(0) = 0.15.

We calculate the distribution function, simulate a single U[0,1] random variable U and use a simple rule to generate X.
xp(x)F(x)Rule
00.400.40IF U<0.40 THEN X=0
10.250.65ELSE IF U<0.65 THEN X=1
20.200.85ELSE IF U<0.85 THEN X=2
30.151.00ELSE X=3

B: Distribution-specific methods

For specific distributions there may be simpler methods of generating pseudo-random variables, but these depend on the properties of the distribution concerned.

Example 1: to simulate a Binomial(3, 0.6) r.v. we can simulate 3 U[0,1] r.v.s and count how many of them are less than 0.6. This has the advantage that it is simple to program.

Example 2: to simulate a Poisson(5) r.v. we can simulate a Poisson process and stop it at time 5. This involves simulating a sequence of expo(1) variables, stopping when the sum exceeds 5.

Ease of programming is the only advantage to using distribution-specific methods. They are generally much less efficient than the inverse distribution function method if many simulated values are required.