An Introduction to
Statistical Signal Processing
Pr(f ∈ F )=P ({ω : ω ∈ F})=P(f
−1
(F ))
f
−1
(F )
f
F
✲
May 5, 2000
ii
An Introduction to
Statistical Signal Processing
Robert M. Gray
and
Lee D. Davisson
Information Systems Laboratory
Department of Electrical Engineering
Stanford University
and
Department of Electrical Engineering and Computer Science
University of Maryland
iv
c
1999 by the authors.
v
to our Families
vi
Contents
3.5.2 Multidimensional Probability Functions . . . . . . . 119
3.5.3 Consistency of Joint and Marginal Distributions . . 120
3.6 Independent Random Variables . . . . . . . . . . . . . . . . 127
3.6.1 IID Random Vectors . . . . . . . . . . . . . . . . . . 128
3.7 Conditional Distributions . . . . . . . . . . . . . . . . . . . 129
3.7.1 Discrete Conditional Distributions . . . . . . . . . . 130
3.7.2 Continuous Conditional Distributions . . . . . . . . 131
3.8 Statistical Detection and Classification . . . . . . . . . . . . 134
3.9 Additive Noise . . . . . . . . . . . . . . . . . . . . . . . . . 137
3.10 Binary Detection in Gaussian Noise . . . . . . . . . . . . . 144
3.11 Statistical Estimation . . . . . . . . . . . . . . . . . . . . . 146
3.12 Characteristic Functions . . . . . . . . . . . . . . . . . . . . 147
3.13 Gaussian Random Vectors . . . . . . . . . . . . . . . . . . . 152
3.14 Examples: Simple Random Processes . . . . . . . . . . . . . 154
3.15 Directly Given Random Processes . . . . . . . . . . . . . . 157
3.15.1 The Kolmogorov Extension Theorem . . . . . . . . . 157
3.15.2 IID Random Processes . . . . . . . . . . . . . . . . . 158
3.15.3 Gaussian Random Processes . . . . . . . . . . . . . . 158
3.16 Discrete Time Markov Processes . . . . . . . . . . . . . . . 159
3.16.1 A Binary Markov Process . . . . . . . . . . . . . . . 159
3.16.2 The Binomial Counting Process . . . . . . . . . . . . 162
3.16.3 Discrete Random Walk . . . . . . . . . . . . . . . . 165
3.16.4 The Discrete Time Wiener Process . . . . . . . . . . 166
3.16.5 Hidden Markov Models . . . . . . . . . . . . . . . . 167
3.17 Nonelementary Conditional Probability . . . . . . . . . . . 168
3.18 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
4 Expectation and Averages 187
4.1 Averages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
4.2 Expectation . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
4.2.1 Examples: Expectation . . . . . . . . . . . . . . . . 192
5.7 Time-Averages . . . . . . . . . . . . . . . . . . . . . . . . . 305
5.8 Differentiating Random Processes . . . . . . . . . . . . . . 309
5.9 Linear Estimation and Filtering . . . . . . . . . . . . . . . 312
5.10 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
6 A Menagerie of Processes 343
6.1 Discrete Time Linear Models . . . . . . . . . . . . . . . . . 344
6.2 Sums of IID Random Variables . . . . . . . . . . . . . . . . 348
6.3 Independent Stationary Increments . . . . . . . . . . . . . . 350
6.4 Second-Order Moments of ISI Processes . . . . . . . . . . 353
6.5 Specification of Continuous Time ISI Processes . . . . . . . 355
6.6 Moving-Average and Autoregressive Processes . . . . . . . . 358
6.7 The Discrete Time Gauss-Markov Process . . . . . . . . . . 360
6.8 Gaussian Random Processes . . . . . . . . . . . . . . . . . . 361
6.9 The Poisson Counting Process . . . . . . . . . . . . . . . . 361
6.10 Compound Processes . . . . . . . . . . . . . . . . . . . . . . 364
6.11 Exponential Modulation . . . . . . . . . . . . . . . . . . . 366
6.12 Thermal Noise . . . . . . . . . . . . . . . . . . . . . . . . . 371
6.13 Ergodicity and Strong Laws of Large Numbers . . . . . . . 373
6.14 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
x CONTENTS
A Preliminaries 389
A.1 Set Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
A.2 Examples of Proofs . . . . . . . . . . . . . . . . . . . . . . . 397
A.3 Mappings and Functions . . . . . . . . . . . . . . . . . . . . 401
A.4 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . 402
A.5 Linear System Fundamentals . . . . . . . . . . . . . . . . . 405
A.6 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410
B Sums and Integrals 417
B.1 Summation . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
B.2 Double Sums . . . . . . . . . . . . . . . . . . . . . . . . . . 420
processing, medical signal processing, geophysical signal processing, and
classical statistical areas of time series analysis, classification and regres-
sion, and pattern recognition show a wide variety of probabilistic models for
input processes and for operations on those processes, where the operations
might be deterministic or random, natural or artificial, linear or nonlinear,
digital or analog, or beneficial or harmful. An introductory course focuses
on the fundamentals underlying the analysis of such systems: the theories
of probability, random processes, systems, and signal processing.
xi
xii PREFACE
When the original book went out of print, the time seemed ripe to
convert the manuscript from the prehistoric troff to L
A
T
E
X and to undertake
a serious revision of the book in the process. As the revision became more
extensive, the title changed to match the course name and content. We
reprint the original preface to provide some of the original motivation for
the book, and then close this preface with a description of the goals sought
during the revisions.
Preface to Random Processes: An Introduction for
Engineers
Nothing in nature is random A thing appears random
only through the incompleteness of our knowledge. — Spinoza,
Ethics I
I do not believe that God rolls dice. — attributed to Einstein
Laplace argued to the effect that given complete knowledge of the physics
of an experiment, the outcome must always be predictable. This metaphys-
ical argument must be tempered with several facts. The relevant param-
which will hold up under scrutiny if the student continues to more advanced
courses. Another goal is to enable the student who might not continue to
more advanced courses to be able to read and generally follow the modern
literature on applications of random processes to information and commu-
nication theory, estimation and detection, control, signal processing, and
stochastic systems theory.
Revision
The most recent (summer 1999) revision fixed numerous typos reported
during the previous year and added quite a bit of material on jointly Gaus-
sian vectors in Chapters 3 and 4 and on minimum mean squared error
estimation of vectors in Chapter 4.
This revision is a work in progress. Revised versions will be made avail-
able through the World Wide Web page
.
The material is copyrighted by the authors, but is freely available to any
who wish to use it provided only that the contents of the entire text remain
intact and together. A copyright release form is available for printing the
book at the Web page. Comments, corrections, and suggestions should be
sent to Every effort will be made to fix typos and
take suggestions into an account on at least an annual basis.
I hope to put together a revised solutions manual when time permits,
but time has not permitted during the past year.
xiv PREFACE
Acknowledgements
We repeat our acknowledgements of the original book: to Stanford Univer-
sity and the University of Maryland for the environments in which the book
was written, to the John Simon Guggenheim Memorial Foundation for its
support of the first author, to the Stanford University Information Systems
Laboratory Industrial Affiliates Program which supported the computer
facilities used to compose this book, and to the generations of students
X
probability mass function (pmf) of a random variable X
f
X
probability density function (pdf) of a random variable X
F
X
cumulative distribution function (cdf) of a random variable X
xv
xvi Glossary
E(X) expectation of a random variable X
M
X
(ju) characteristic function of a random variable X
1
F
(x) indicator function of a set F
Φ Phi function (Eq. (2.78))
Q Complementary Phi function (Eq. (2.79))
Chapter 1
Introduction
A random or stochastic process is a mathematical model for a phenomenon
that evolves in time in an unpredictable manner from the viewpoint of the
observer. The phenomenon may be a sequence of real-valued measurements
of voltage or temperature, a binary data stream from a computer, a mod-
ulated binary data stream from a modem, a sequence of coin tosses, the
daily Dow-Jones average, radiometer data or photographs from deep space
probes, a sequence of images from a cable television, or any of an infinite
number of possible sequences, waveforms, or signals of any imaginable type.
It may be unpredictable due to such effects as interference or noise in a com-
fidelity. All of these operations on signals can be considered as signal
processing, although the name is most commonly used for the man-
made operations such as modulation, digitization, and coding, rather
than the natural possibly unavoidable changes such as the addition
of thermal noise or other changes out of our control.
• For very low bit rate digital speech communication applications, the
speech is sometimes converted into a model consisting of a simple
linear filter (called an autoregressive filter) and an input process. The
idea is that the parameters describing the model can be communicated
with fewer bits than can the original signal, but the receiver can
synthesize the human voice at the other end using the model so that
it sounds very much like the original signal.
• Signals including image data transmitted from remote spacecraft are
virtually buried in noise added to them on route and in the front
end amplifiers of the powerful receivers used to retrieve the signals.
By suitably preparing the signals prior to transmission, by suitable
filtering of the received signal plus noise, and by suitable decision or
estimation rules, high quality images have been transmitted through
this very poor channel.
• Signals produced by biomedical measuring devices can display spe-
cific behavior when a patient suddenly changes for the worse. Signal
processing systems can look for these changes and warn medical per-
sonnel when suspicious behavior occurs.
How are these signals characterized? If the signals are random, how
does one find stable behavior or structure to describe the processes? How
do operations on these signals change them? How can one use observations
based on random signals to make intelligent decisions regarding future be-
havior? All of these questions lead to aspects of the theory and application
of random processes.
3
The other category of courses and texts on random processes is the
typical mathematical approach, which requires an advanced mathemati-
cal background of real analysis, measure theory, and integration theory;
it involves precise and careful theorem statements and proofs, and it is
far more careful to specify precisely the conditions required for a result
to hold. Most engineers do not, however, have the required mathematical
background, and the extra care required in a completely rigorous develop-
ment severely limits the number of topics that can be covered in a typical
course — in particular, the applications that are so important to engineers
tend to be neglected. In addition, too much time can be spent with the
formal details, obscuring the often simple and elegant ideas behind a proof.
Often little, if any, physical motivation for the topics is given.
4 CHAPTER 1. INTRODUCTION
This book attempts a compromise between the two approaches by giving
the basic, elementary theory and a profusion of examples in the language
and notation of the more advanced mathematical approaches. The intent
is to make the crucial concepts clear in the traditional elementary cases,
such as coin flipping, and thereby to emphasize the mathematical structure
of all random processes in the simplest possible context. The structure is
then further developed by numerous increasingly complex examples of ran-
dom processes that have proved useful in stochastic systems analysis. The
complicated examples are constructed from the simple examples by signal
processing, that is, by using a simple process as an input to a system whose
output is the more complicated process. This has the double advantage
of describing the action of the system, the actual signal processing, and
the interesting random process which is thereby produced. As one might
suspect, signal processing can be used to produce simple processes from
complicated ones.
Careful proofs are constructed only in elementary cases. For example,
the fundamental theorem of expectation is proved only for discrete random
low the details in the literature of many proofs of results involving random
processes, the basic results and their development and implications should
be accessible, and the most common examples of random processes and
classes of random processes should be familiar. In particular, the student
should be well equipped to follow the gist of most arguments in the vari-
ous Transactions of the IEEE dealing with random processes, including the
IEEE Transactions on Signal Processing, IEEE Transactions on Image Pro-
cessing, IEEE Transactions on Speech and Audio Processing, IEEE Trans-
actions on Communications, IEEE Transactions on Control, and IEEE
Transactions on Information Theory.
It also should be mentioned that the authors are electrical engineers
and, as such, have written this text with an electrical engineering flavor.
However, the required knowledge of classical electrical engineering is slight,
and engineers in other fields should be able to follow the material presented.
This book is intended to provide a one-quarter or one-semester course
that develops the basic ideas and language of the theory of random pro-
cesses and provides a rich collection of examples of commonly encountered
processes, properties, and calculations. Although in some cases these ex-
amples may seem somewhat artificial, they are chosen to illustrate the way
engineers should think about random processes and for simplicity and con-
ceptual content rather than to present the method of solution to some
particular application. Sections that can be skimmed or omitted for the
shorter one-quarter curriculum are marked with a star (). Discrete time
processes are given more emphasis than in many texts because they are
simpler to handle and because they are of increasing practical importance
in and digital systems. For example, linear filter input/output relations are
carefully developed for discrete time and then the continuous time analogs
are obtained by replacing sums with integrals.
Most examples are developed by beginning with simple processes and
then filtering or modulating them to obtain more complicated processes.
which are a vector or finite collection of measurements; and random pro-
cesses, which can be viewed as sequences or waveforms of measurements.
Random variables, vectors, and processes can all be viewed as forms of sig-
nal processing: each operates on “inputs,” which are the sample points of
a probability space, and produces an “output,” which is the resulting sam-
ple value of the random variable, vector, or process. These output points
together constitute an output sample space, which inherits its own proba-
bility measure from the structure of the measurement and the underlying
experiment. As a result, many of the basic properties of random variables,
vectors, and processes follow from those of probability spaces. Probability
distributions are introduced along with probability mass functions, proba-
bility density functions, and cumulative distribution functions. The basic
derived distribution method is described and demonstrated by example. A
wide variety of examples of random variables, vectors, and processes are
treated.
Chapter 4 develops in depth the ideas of expectation, averages of ran-
dom objects with respect to probability distributions. Also called proba-
bilistic averages, statistical averages, and ensemble averages, expectations
7
can be thought of as providing simple but important parameters describ-
ing probability distributions. A variety of specific averages are considered,
including mean, variance, characteristic functions, correlation, and covari-
ance. Several examples of unconditional and conditional expectations and
their properties and applications are provided. Perhaps the most impor-
tant application is to the statement and proof of laws of large numbers or
ergodic theorems, which relate long term sample average behavior of ran-
dom processes to expectations. In this chapter laws of large numbers are
proved for simple, but important, classes of random processes. Other im-
portant applications of expectation arise in performing and analyzing signal
processing applications such as detecting, classifying, and estimating data.
cesses, ARMA (autoregressive-moving average) processes, random walks,
8 CHAPTER 1. INTRODUCTION
independent increment processes, Markov processes, Poisson and Gaussian
processes, and the random telegraph wave. We also briefly consider an ex-
ample of a nonlinear system where the output random processes can at least
be partially described — the exponential function of a Gaussian or Poisson
process which models phase or frequency modulation. We close with ex-
amples of a type of “doubly stochastic” process, compound processes made
up by adding a random number of other random effects.
Appendix A sketches several prerequisite definitions and concepts from
elementary set theory and linear systems theory using examples to be en-
countered later in the book. The first subject is crucial at an early stage
and should be reviewed before proceeding to chapter 2. The second subject
is not required until chapter 5, but it serves as a reminder of material with
which the student should already be familiar. Elementary probability is not
reviewed, as our basic development includes elementary probability. The
review of prerequisite material in the appendix serves to collect together
some notation and many definitions that will be used throughout the book.
It is, however, only a brief review and cannot serve as a substitute for
a complete course on the material. This chapter can be given as a first
reading assignment and either skipped or skimmed briefly in class; lectures
can proceed from an introduction, perhaps incorporating some preliminary
material, directly to chapter 2.
Appendix B provides some scattered definitions and results needed in
the book that detract from the main development, but may be of interest
for background or detail. These fall primarily in the realm of calculus and
range from the evaluation of common sums and integrals to a consideration
of different definitions of integration. Many of the sums and integrals should
be prerequisite material, but it has been the authors’ experience that many
students have either forgotten or not seen many of the standard tricks