Simulation Programming Notes

Lin Jensen, Bishop's University
Computer Science 301a
Simulation Techniques
Fall 1996

References: (on reserve at Bishop's Library)

System simulation; programming styles and languages, Wolfgang Kreutzer, QA 76.9 .C65 K74 1986
Simulation Modeling & Analysis, A. M. Law & W. D. Kelton, QA 76.9 .C65 L38 1991

Summary: Simulation means studying a system of interest by numerically exercising a model of that system. We will study the requirements for valid simulation experiments, consider various programming styles, and program several different simulations, using Turbo Pascal and a toolbox developed from Kreutzer[1].

Go to Description of the Toolbox units and program examples

Course Outline - index to Topics

Why and when to simulate.
Use of statistical distributions.
Review of object-oriented programming and Turbo Pascal
Four paradigms of System simulation - Abstraction (simplification) in model building
1. Monte Carlo methods (leave out time). Statistical instrumentation (gather data from simulation experiments) Building object-oriented toolboxes
2. Continuous simulation
3. Discrete-event simulation - Queueing networks Entities, events, activities, processes - programming styles (more toolboxes)
4. Combined discrete/continuous simulation
Simulation using co-routines. Concurrency & Message passing. See also:
- Pascal toolboxes for simulation, and
- Synchronization of Processes
Survey of simulation programming languages.

Grading scheme (as adopted)

                        Assignments                     20%
                        Midterm exam                    20%     October 18
                        Simulation Project              30%
                        Final Exam                      30%

There will be no supplemental exam in this course.

Introduction

The history of mankind (sic) is a history of model building. Rivett(1972)

Building models is central to the way we understand the world around us. We give meaning to our perceptions of the world by fitting them into our mental model of the world. From time to time new experiences may force us to modify our models.

An interesting model of how living organisms respond to their environment is the theory of finite state machines, proposed by biologists in the 1950's. It models an organism as a set of states, stimuli from the environment cause the organism to change state according to a set of transition rules. These rules represent the organism's model of its world. Models upon models! In computer science, the theory of finite state machines is used to model digital electronic circuits and computer programs.

Four phases of modeling as used in simulation are:

System Identification. A system is defined as a collection of objects, their relationships and behaviour, relevant to a set of purposes. There is never a "best" model. A model's usefulness depends on objectives and context of its application. In this phase we identify the objects of interest, and the limits of the system. Usually we identify an observer, with some point of view.
If we now wish to experiment with a system, we have several choices:
- Experiment with the actual system. This could be expensive, too dangerous (nuclear power station) or take too long (evolution of a solar system).
- Build a physical model of the system.
- Build a symbolic (mathematical) model
  - Solve it analytically, if possible. There may be a solution without oversimplifying.
  - Simulate the model, and see what happens.
System representation Choose some language or means to represent the elements of the system.
Model design - validate the design Simplify, or abstract away unimportant details. As a general rule, keep the model as simple as possible for the purpose at hand. You must answer the question, "Is this design a valid represention of the system?"
Assuming we choose to use a computer to help, code the model. Verify that the code does indeed correctly implement the model design. It may be possible to further check on a model's validity by observing a correspondance between the some of the program's results and observations of the original system. For example, it would be reassuring if a model of solar system evolution sometimes resulted in a system like ours.

Why and when to simulate

If you want to perform experiments about a system, to answer some questions. Simulate when it is inconvienient, dangerous, expensive or impossible to experiment with the actual system, or even build a physical model of it.

Sampling from statistical distributions

We cannot model everything in complete detail. For example, the time it takes for a letter to be delivered could be determined by modeling the entire postal system, but if that is not of interest to us, the time for delivery can be modeled as a random process, and random simulated times drawn from a distribution believed to be representative of that process. A process or system with random elements is called stochastic.

Many distributions have been proposed for use in simulation, see Law [2]. The most frequently used are

The uniform distibution

The negative-exponential distribution - Poisson processes
The normal distribution
Draw - True or false - models coin flipping. (Percentage True can be specified, to model an "unfair" coin.)

Uniform distribution

Random numbers are uniformly distributed over some interval if every value in the interval is equally likely. A uniform random sample can be generated by flipping a coin a number of times, and interpreting the outcome as a binary number. [0,1) is a typical standard interval. We can get a "real" number in this interval by dividing our binary number by 2**n where n is the number of times we flipped the coin.

Unfortunately for us, digital circuits are specifically designed to eliminate any random variation, they are completely deterministic. So, a lots of work has been done to produce pseudo-random number generators, deterministic procedures that produce a stream of numbers that look so haphazard that we are unable to predict what the next one will be, and pass statistical tests for randomness (chi-squared, auto- correlation). One popular method is to take two prime constants, A and B, and a seed number, then calculate

seed := A * seed + B

truncating to the number of bits the computer uses for integers, to simulate the ideal coin-flipping. Care and trial and error will lead to a good choice of constants.

Two principles to bear in mind with pseudo-random number streams:

Each time you start from the same seed, you will get exactly the same sequence of "random" numbers. This is very convenient for simulation experiments, in which identical random choices can be generated for alternate solutions to some problem.
Separate streams of random numbers, each with its own seed, should be used for different processes, such as arrivals and service times, to preserve the identical values just mentioned.

Negative exponential distribution

If some events (for example, arrival of busses at a bus stop) occur at random with a uniform distribution and independent of one another, we can ask the question, what is the distribution of times between sucessive events? The answer is, negative exponential, for which the pdf (probability distribution function) is

f(t) = a exp(-at)

where a is the mean arrivals per unit time (arrival rate - usually written with the greek letter lambda). The average time between arrivals is 1/a. This is a consequence of the fact that the probability of an arrival in any time period of a given length is constant. Let g(t) be the probability that no arrival occurs in an interval of length t. then for an additional time-slice s:

g(t + s) = g(t) * g(s)

i.e. the probability that nothing arrived during interval t times the probability that no arrival occurs in the time-slice. This leads to a differential equation identical to that of radioactive decay or population growth, whose solution is

F(t) = 1 - g(t) = 1 - exp(-at)

the cdf (cumulative distribution function) which, when differentiated, gives the pdf above. It is the cdf which is used to get a neg-exponentially distributed random number from a uniform one.

An interesting property of the negative exponential distribution is that it is memoryless, that is, the pdf for the next arrival is the same after a period of waiting as it was right after the last arrival. For example, if down time for a computer network is neg-exp. distributed, the mean time until service is restored after a system crash is 1/a. If you walk in during the down time, the mean time of waiting is still 1/a. If after an hour the system is still down, the mean time to restoration is still 1/a! This explains why people find waiting for busses (and computers) so frustrating.

The Poisson distribution

How many arrivals are likely in a given time interval? On average, a*t, but we can expect different actual integral numbers. In particular, we have seen that the probability that no arrival occurs is exp(-at). In general, for n=0,1,2,...

probability of n arrivals P(n,t) = [(at)**n / n!] exp (-at)

This is the Poisson distribution. It also characterizes the likelihood of random scratches on pieces of furniture.

Normal Distribution

The normal distribution has the familiar "bell curve" shape, of the function:

exp(-x²)

With a few constants thrown in. It is characterized by 2 parameters, mean and standard deviation (sigma). It peaks at the mean, and is symmetrical, with width proportional to sigma. 67% of the samples lie within ± 1 sigma of the mean.

Review of Pascal and object-oriented programming

Pascal is the prinary programming language taught at Bishop's University, and thus a good choice for our purposes. Standard Pascal is a procedural language, lacking facilities for separate compilation. However, for the past several years, we have been using Borland's Turbo Pascal, which provides units (separate compilation and information hiding) and objects.

Object-Oriented programming

In this framework, operations are clustered around objects, which in simulations would be the entities of interest. Object declarations include both data and procedures (called methods in OOP terminology) that operate on the data. OOP features encapsulation and inheritance. In its pure form, the data of an object are private, and can only be manipulated by "sending a message to the object", that is, calling one of the objects procedures. Classes of objects of the same type share common procedures, but each individual object has its own data. In addition, new classes of objects can inherit the characteristics of another object. This results in extendable and reusable code.

Four paradigms of system simulation

T. Kuhn (The structure of scientific revolutions, 1962) used the term "paradigm" to mean, among other things, the whole coherent scientific tradition together with some accepted examples of scientific practice. He pointed out that a paradigm shift occurs when our model of "reality" no longer fits our observations and discoveries.

The term paradigm has come to be applied to programming styles (e.g. procedural, functional, declarative, and object oriented), and we use it to refer to modeling styles used for simulation, and the cultures and schools that have grown up around these.

Modeling styles

Monte Carlo methods (leave out time). Statistical instrumentation (gather data from simulation experiments) Building object-oriented toolboxes
Continuous simulation
Discrete-event simulation - Queueing networks Entities, events, activities, processes - programming styles (more toolboxes
Combined discrete/continuous simulation

Monte Carlo simulations

sampling in state spaces

System state

Systems generally evolve over time, changing "state" as they evolve. By the state of a system, we mean roughly an adaquate description of the system at some instance, for example, the amount of water in a tank, and its temperature and pressure, or the number of wolves and rabbits inhabiting an island.

Continuous simulation

techniques stem from a long tradition of building analogue devices for system modelling and simulation. In this style we ignore individual events and take the large picture. We model a system using a number of state variables, which change continuously over time.

Systems of differential equations

Such a system can be described by differential equations in the state variables, giving their change with respect to time and each other. Sometimes differential equations can be solved analytically, but frequently there is no known solution. In this case one simulates the system by starting at some inital state and proceding by small time increments (time slices), reevaluating all the state variables for eace time slice. Then one produces graphs of the variables.

In studies of population growth, obviously the population changes by discrete events. Animals are born, they die, or are eaten. However the large picture can be described in terms of birth and death rates, abstracting away from these individual events.

It is also very illustrative to provide an animation of the system with computer graphics (CSc 302!). Unfortunately, in one semester we cannot do everything, and programming animations can be very time consuming.

Examples:

A tank fills with toxic chemicals, overflows into a river, kills fish, and is gradually diluted.
Students, staff and professors arrive for registration at the Sports Complex. The staff and professors activate server locations, and students form queues to receive services, then go on to some other queue, and finally exit.

Discrete-event simulation

In this style, by contrast, we focus on important events that occur at instants of time (an event has no mesurable duration of time) and change the state of a system. Examples of events (different systems):

A customer arrives at a shop
A ship sets sail (leaves a harbour)
A computer is turned off
A doctor starts examining her patient

In between events, nothing happens, the system's state remains the same. From the point of view of patients in the doctor's waiting room, nothing happens until the event "doctor finishes examining patient" occurs.

A wide variety of systems can be studied using discrete methods. Most popular are

Queuing network scenarios

which come from the (engineering) field of Operations Research. Much of the vocabulary comes from "job shop" studies, where material arrives and flows from one machine to another, until finished products depart. Material forms queues "in front of" machines while waiting for service. Other equivalent terminologies are also used, such as: Transactions arrive and queue for servers, or compete for resourses. Both "material" or "machines" could be people, depending on the scenario.

5 components of any discrete-event simulation are:

Facilities for model structuring and execution. A collection of interacting objects moves through sequences of actions.
A clock for time management. It is typically asynchronous, jumping to the time of the next scheduled event, once no further actions are possible at the current time.
Random processes to represent less essential aspects (e.g. when visitors arrive, how long it takes to milk a goat).
Facilities for statistical instrumentation and reporting, to gather and analyze data, such as means, maxima, and histograms.
An execution monitor to schedule and keep track of pending events. Arrays can be used, but linked lists are more flexible.

Combined Discrete/continuous simulation

It is probably apparent that the abstractions of continuous simulation (disregard events) and discrete simulation (disregard gradual changes) are somewhat extreme. Unfortunatly they are practiced by different groups using different programming languages and systems. It is possible and often desireable to combine the two approaches.

Using co-routines

Some computer languages provide coroutines, that conceputally execute concurrently. These include C (fork and join) and Modula-2. I have implemented co-routines in Turbo Pascal (some 8086 assembler code involved), so that the "life cycles" of simulation entities can be described in terms of procedures. This is in many ways more natural than focusing on events, each of which may involve several entities.

For more information, see:

Pascal toolboxes for simulation, and
Synchronization of Processes

Last updated 14 November 1996

Up to top of this document
Back to Lin Jensen's home page. Comments, etc, send to

ljensenn at ubishops.ca