5 SOFTWARE RELIABILITY AND RECOVERY TECHNIQUES
Reliability of Computer Systems and Networks: Fault Tolerance, Analysis, and Design
Martin L. Shooman
Copyright 2002 John Wiley & Sons, Inc.
ISBNs: 0-471-29342-3 (Hardback); 0-471-22460-X (Electronic)
5.1 INTRODUCTION
The general approach in this book is to treat reliability as a system problem
and to decompose the system into a hierarchy of related subsystems or com-
ponents. The reliability of the entire system is related to the reliability of the
$R_{SY} = R_H \times R_S \times R_O$ [Shooman, 1983, pp. 351–353].
This chapter will develop models that can be used for the software reliability. These models are built upon the principles of continuous random variables developed in Appendix A, Sections A6 and A7, and Appendix B, Section B4.
1 Another important “component” of system reliability is human reliability if an operator is involved in any control, monitoring, input, or similar task. A discussion of human reliability models is beyond the scope of this book; the reader is referred to Dougherty and Fragola [1988].
6 billion embedded microcomponents. Associated with each microprocessor or microcomponent is memory, a set of instructions, and a set of programs [Pollack, 1999].
5.1.1 Definition of Software Reliability
One can define software engineering as the body of engineering and manage-
ment technologies used to develop quality, cost-effective, schedule-meeting soft-
ware. Software reliability measurement and estimation is one such technology
that can be defined as the measurement and prediction of the probability that the
software will perform its intended function (according to specifications) without
error for a given period of time. Oftentimes, the design, programming, and test-
ing techniques that contribute to high software reliability are included; however,
we consider these techniques as part of the design process for the development
of reliable software. Software reliability complements reliable software; both, in
fact, are important topics within the discipline of software engineering. Software
recovery is a set of fail-safe design techniques for ensuring that if some serious error should crash the program, the computer will automatically recover to reinitialize and restart its program. Software recovery succeeds if no crucial data is lost or if, when an operational calamity does occur, the recovery transforms a total failure into a benign or, at most, a troubling but nonfatal “hiccup.”
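The recovery idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the checkpoint file, its JSON format, the simulated crash, and all function names (`run_with_recovery`, `load_checkpoint`, `save_checkpoint`, `make_flaky_sum`, `demo`) are hypothetical and are not taken from this chapter.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the last saved state, or a fresh initial state if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next": 0, "total": 0}


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it over the old one."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run_with_recovery(task, path, max_restarts=3):
    """Run task; after a crash, reinitialize from the checkpoint and restart."""
    restarts = 0
    while True:
        state = load_checkpoint(path)
        try:
            result = task(state, lambda s: save_checkpoint(path, s))
            return result, restarts
        except RuntimeError:
            restarts += 1
            if restarts > max_restarts:
                raise


def make_flaky_sum(n, crash_at):
    """Build a task that sums 0..n-1 and crashes once when it reaches crash_at."""
    crashed = {"done": False}

    def task(state, save):
        for i in range(state["next"], n):
            if i == crash_at and not crashed["done"]:
                crashed["done"] = True
                raise RuntimeError("simulated crash")
            state = {"next": i + 1, "total": state["total"] + i}
            save(state)
        return state["total"]

    return task


def demo():
    """Sum 0..9 with one simulated crash; return (result, number of restarts)."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "checkpoint.json")
        return run_with_recovery(make_flaky_sum(10, crash_at=5), path)
```

In this sketch the crash costs one restart but no data: the work completed before the crash is read back from the checkpoint, so the failure is reduced to a nonfatal interruption.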
5
0. If we enter the values 1, 5, and 6 for A, B, and C, respectively, the roots will be $r_1 = -2$ and $r_2 = -3$. A single test of the software with these inputs confirms the expected results. Exact repetition of this experiment with the same values of A, B, and C will always yield the same results, $r_1 = -2$ and $r_2 = -3$
ple, was the developer wise enough to treat the problem of imaginary roots?
Did the developer use the quadratic formula to solve for the roots? How, then,
was the case of A = 0 treated where there is only one root and the quadratic
formula “blows up” (i.e., leads to an exponential overflow error)? Clearly, we
should test for all these values during development to ensure that there are no
residual errors in the program, regardless of the input value. This leads to the
concept of exhaustive testing, which is always infeasible in a practical problem.
Suppose in the quadratic equation example that the values of A, B, and C were restricted to integers between +1,000 and −1,000. Thus there would be 2,000 values of A and a like number of values of B and C. The possible input space for A, B, and C would therefore be $(2{,}000)^3$.
2 years. This
is hardly a feasible procedure: any such computation for a practical problem
involves a much larger test space and a more difficult checking procedure that
is impossible in any practical sense. In the quadratic equation example, there
was a ready means of checking the answers by substitution into the equation;
however, if the purpose of the program is to calculate satellite orbits, and if 1 million combinations of input parameters are possible, then a person(s) or computer must independently obtain the 1 million right answers and check
them all! Thus the probabilistic nature of software reliability is based on the
varying values of the input, the huge number of input cases, the initial system
states, and the impossibility of exhaustive testing.
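The questions raised about the quadratic example (imaginary roots, the case A = 0) and the size of the input space can be made concrete with a short sketch. The function names and the testing rate of 1,000 cases per second are assumptions made here for illustration; the chapter does not specify an implementation or a rate.

```python
import cmath
import math


def solve_quadratic(a, b, c):
    """Return the roots of a*x**2 + b*x + c = 0 as a tuple."""
    if a == 0:
        # The quadratic formula would divide by zero; the equation is linear.
        if b == 0:
            raise ValueError("no unknown left to solve for")
        return (-c / b,)
    disc = b * b - 4 * a * c
    # Use the complex square root when the discriminant is negative.
    root = math.sqrt(disc) if disc >= 0 else cmath.sqrt(disc)
    return ((-b + root) / (2 * a), (-b - root) / (2 * a))


def input_space_size(values_per_coefficient, coefficients=3):
    """Number of distinct input combinations for an exhaustive test."""
    return values_per_coefficient ** coefficients


def years_to_test(cases, cases_per_second):
    """Time to run every case once, at an assumed constant testing rate."""
    seconds_per_year = 365 * 24 * 3600
    return cases / cases_per_second / seconds_per_year


roots = solve_quadratic(1, 5, 6)          # (-2.0, -3.0), as in the text
cases = input_space_size(2000)            # (2,000)^3 combinations
years = years_to_test(cases, 1000)        # assumed rate: 1,000 cases per second
```

Even at the assumed rate, the 8 billion cases of this small example take months to run, and the example says nothing about how the right answers would be checked.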
The basis for software reliability is quite different from the most common causes of hardware unreliability. Software development is quite different from
hardware development, and the source of software errors (random discovery
of latent design and coding defects) differs from the source of most hard-
ware errors (equipment failures). Of course, some complex hardware does have
latent design and assembly defects, but the dominant mode of hardware fail-
ures is equipment failures. Mechanical hardware can jam, break, and become
worn-out, and electrical hardware can burn out, leaving a short or open circuit
or some other mode of failure. Many who criticize probabilistic modeling of
software complain that instructions do not wear out. Although this is a true
statement, the random discovery of latent software defects is indeed just as
damaging as equipment failures, even though it constitutes a different mode
of failure.
The development of models for software reliability in this chapter begins
1970s, computer memory was so expensive that programmers used many tricks and shortcuts to save a little here and there to make their programs operate with smaller memory sizes. In 1965, magnetic-core computer memory was expensive, at about $1 per word, and drew a significant operating current. (Presently, microelectronic memory sells for perhaps $1 per megabyte and draws only a small amount of current; assuming a 16-bit word, this cost has therefore been reduced by a factor of about 500,000!) To save memory, programmers reserved only 2 digits to represent the last 2 digits of the year. They did not anticipate that any of their programs would survive for more than 5
Y2K software problems created problems themselves. One such case was that of the 7-Eleven convenience store chain. On January 1, 2001, the point-of-sale system used in the 7-Eleven stores read the year “2001” as “1901,” which caused it to reject credit cards if they were used for automatic purchases (manual credit card purchases, in addition to cash and check purchases, were not affected). The problem was attributed to the system’s software, even though it had been designed for the 5,200-store chain to be Y2K-compliant, had been subjected to 10,
000, none of the new 16 airport-express trains and 13 high-speed signature trains would start. Although the computer software had been checked thoroughly before the start of 2000, it still failed to recognize the correct date. The software was reset to read December 1, 2000, to give the German maker of the new trains 30 days to correct the problem. None of the older trains were affected by the problem [New York Times, January 3, 2001].
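The two-digit year representation behind these failures can be illustrated with a small sketch. The source does not describe the actual code in either system, so both functions below, their names, and the pivot value of 50 are assumptions for illustration only; date windowing is shown as one common style of repair, not as the fix that was actually applied.

```python
def expand_year_fixed_century(two_digit_year, century=1900):
    """Legacy behavior: every stored year is assumed to lie in one century."""
    return century + two_digit_year


def expand_year_windowed(two_digit_year, pivot=50):
    """Windowing: values below the pivot are read as 20xx, the rest as 19xx."""
    base = 2000 if two_digit_year < pivot else 1900
    return base + two_digit_year


stored = 1                                   # the year 2001 stored as "01"
legacy = expand_year_fixed_century(stored)   # 1901, the symptom described above
windowed = expand_year_windowed(stored)      # 2001
```

Windowing only moves the ambiguity: with a pivot of 50, a stored value of 50 or more can never mean a year in the 2000s, so the repair has its own limited lifetime.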
Before we leave the obvious aspects of the Y2K problem, we should consider how deeply entrenched some of these problems were in legacy software: old programs that are used in their original form or rejuvenated for extended use. Analysts have found that some of the old IBM 9020 computers used
5.3 SOFTWARE DEVELOPMENT LIFE CYCLE
Our goal is to make a probabilistic model for software, and the first step in any modeling is to understand the process [Boehm, 2000; Brooks, 1995; Pfleeger, 1998; Schach, 1999; and Shooman, 1983]. A good approach to the study of the software development process is to define and discuss the various phases of the software development life cycle. A common partitioning of these phases is shown in Table 5.1. The life cycle phases given in this table apply directly to the technique of program design known as structured procedural programming (SPP). In general, they also apply with some modification to the newer approach known as object-oriented programming (OOP). The details of OOP, including the popular design diagrams used for OOP that are called the unified modeling language (UML), are beyond the scope of this chapter; the reader is referred to the following references for more information: [Booch, 1999; Fowler, 1999
TABLE 5.1 Project Phases for the Software Development Life Cycle

Phase                       Description
Start of project            Initial decision or motivation for the project, including overall system parameters.
Needs                       A study and statement of the need for the software and what it should accomplish.
Requirements                Algorithms or functions that must be performed, including functional parameters.
Specifications              Details of how the tasks and functions are to be performed.
Design of prototype         Construction of a prototype, including coding and testing.
Prototype: System test      Evaluation by both the developer and the customer of how well the prototype design meets the requirements.
Revision of specifications  Prototype system tests and other information may reveal needed changes.
Final design                Design changes in the prototype software in response to discovered deviations from the original specifications or the revised specifications, and changes to improve performance and reliability.
Code final design           The final implementation of the design.
Unit test                   Each major unit (module) of the code is individually tested.
Integration test            Each module is successively inserted into the pretested control structure, and the composite is tested.
System test                 Once all (or most) of the units have been integrated,
The project formally begins with the drafting of a requirements document for
the system in response to the needs document or equivalent document. Initially,
the requirements constitute high-level system requirements encompassing both
the hardware and software. In a large project, as the requirements document
“matures,” it is expanded into separate hardware and software requirements;
the requirements will specify what needs to be done. For an air traffic control
system (ATC), the requirements would deal with the ATC centers that they
must serve, the present and expected future volume of traffic, the mix of air-
craft, the types of radar and displays used, and the interfaces to other ATC
centers and the aircraft. Present travel patterns, expected growth, and expected
changes in aircraft, airport, and airline operational characteristics would also
be reflected in the requirements.
5.3.3 Specifications
The project specifications start with the requirements and the details of how
the software is to be designed to satisfy these requirements. Continuing with
our air traffic control system example, there would be a hardware specifica-
tions document dealing with (a) what type of radar is used; (b) the kinds of
displays and display computers that are used; (c) the distributed computers or
microprocessors and memory systems; (d) the communications equipment; (e)
the power supplies; and (f) any networks that are needed for the project. The
software specifications document will delineate (a) what tracking algorithm to
use; (b) how the display information for the aircraft will be handled; (c) how
the system will calculate any potential collisions; (d) how the information will
be displayed; and (e) how the air traffic controller will interact with both the
system and the pilots. Also, the exact nature of any required records of a tech-
5.3.4 Prototypes
Most innovative projects now begin with a prototype or rapid prototype phase.
The purpose of the prototype is multifaceted: developers have an opportunity to
try out their design ideas, the difficult parts of the project become rapidly appar-
ent, and there is an early (imperfect) working model that can be shown to the cus-
tomer to help identify errors of omission and commission in the requirements and
specification documents. In constructing the prototype, an initial control struc-
ture (the main program coordinating all the parts) is written and tested along with
the interfaces to the various components (subroutines and modules). The various
components are further decomposed into smaller subcomponents until the mod-
ule level is reached, at which time programming or coding at the module level
begins. The nature of a module is described in the paragraphs that follow.
A module is a block of code that performs a well-described function or
procedure. The length of a module is a frequently debated issue. Initially, its
length was defined as perhaps 50–200 source lines of code (SLOC). The SLOC length of a module is not absolute; it is based on the coder’s “intellectual span of control.” Since a program listing contains about 50 lines, this means that a module would be 1–4
“Plan to Throw One Away,” of Brooks [1995].) The cost of the prototype is
not so large if one considers that much of the prototype code (especially the
control structure) can be modified and reused for the final design and that the
prototype test cases can be reused in testing the final design. It is likely that
the same manager who objects to the use of prototype software would heartily
endorse the use of a prototype board (breadboard), a mechanical model, or
a computer simulation to “work out the bugs” of a hardware design without
realizing that the software prototype is the software analog of these well-tried
hardware development techniques.
Finally, we should remark that not all projects need a prototype phase. Con-
sider the design of a fourth payroll system for a customer. Assume that the
development organization specializes in payroll software and had developed
the last three payroll systems for the customer. It is unlikely that a prototype
would be required by either the customer or the developer. More likely, the
developer would have some experts with considerable experience study the
present system, study the new requirements, and ask many probing questions
of the knowledgeable personnel at the customer’s site, after which they could
write the specifications for the final software. However, this payroll example
is not the usual case; in most cases, prototype software is generally valuable
and should be considered.
5.3.5 Design
Design really begins with the needs, requirements, and specifications docu-
ments. Also, the design of a prototype system is a very important part of
position
[Figure 5.1: H-diagram for the Suspension Design Program. Block labels recovered from the figure: Suspension Design Program; Input (A, B, C, D); Root Solution; Classify Roots; Plot Roots; Query input file; Check for validity and requery if incorrect; Find one root through trial and error; Solve function’s quadratic equation; Use results to solve for other roots; Interpret and print results; Send data to firm’s plotting system. Block numbers shown: 0.0, 1.0, 2.0, 3.0, 4.0, 1.1, 1.2.]
1(b). For more details on trees, see Cormen [p. 91ff.].
The example of the H-diagram given in Fig. 5.1 is for the top-level architecture of a program to be used in the hypothetical design of the suspension system for a high-speed train. It is assumed that the dynamics of the suspension system can be approximated by a third-order differential equation and that the stability of the suspension can be studied by plotting the variation in the roots of the associated third-order characteristic polynomial ($Ax^3 + Bx^2 + Cx + D = 0$), which is a function of the various coefficients A, B, C, and D. It is also assumed that the company already has a plotting program (4.1) that is to be reused. The block (4
decomposition only involves two or three subproblems (degree 2 or 3), the tree becomes very deep before all the modules are reached, which is again cumbersome. A suitable value to pick for each decomposition is 5–9 subprograms (each node should have degree 5–9). This is based on the work of the experimental psychologist Miller [1956], who found that the classic human senses (sight, smell, hearing, taste, and touch) could discriminate 5–9 logarithmic levels. (See also Shooman [1983, pp. 194, 195
1 interfaces. At level 1, the top level node has between $5^1 = 5$ and $9^1 = 9$ interfaces. Also at level 2 are between $5^2 = 25$ and $9^2 = 81$ interfaces. Thus, for k levels starting
1a)
and for $r = 5$ to 9, we have
$(5^k - 1)/4 \le \text{number of interfaces} \le (9^k - 1)/8$ (5.1b)
3
We can better appreciate the use of Eqs. (5.1a–d) if we explore the following example. Suppose that a module consists of 100 lines of code, in which case $M = 100$, and it is estimated that a program design will take about 10,000 SLOC. Using Eq. (5.1c, d), we know that the number of modules must be about 100 and that the number of levels is bounded by $5^k = 100$ and $9^k$
course, these computations are for a symmetric graph; however, they give us
a rough idea of the size of the H-diagram design and the number of modules
and interfaces that must be designed and tested.
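The interface bounds of Eq. (5.1b) and the example above can be checked with a short sketch. Equations (5.1c, d) are not reproduced in the surviving text, so the sketch assumes what the example itself implies: the number of modules is the program size divided by the module size M, and the number of levels k satisfies $r^k$ = number of modules. The function names are hypothetical.

```python
import math


def interface_count(r, k):
    """Interfaces in a symmetric tree of degree r with k levels.

    This is the geometric sum r**0 + r**1 + ... + r**(k-1),
    which equals (r**k - 1) / (r - 1).
    """
    return (r ** k - 1) // (r - 1)


def levels_for_modules(modules, r):
    """Solve r**k = modules for k (generally not a whole number)."""
    return math.log(modules) / math.log(r)


program_sloc = 10_000
module_sloc = 100                               # M = 100 in the example
modules = program_sloc // module_sloc           # about 100 modules

k_low_degree = levels_for_modules(modules, 5)   # from 5**k = 100
k_high_degree = levels_for_modules(modules, 9)  # from 9**k = 100
```

Under these assumptions the design needs between about 2.1 and 2.9 levels, so a three-level H-diagram would hold it; with three levels, `interface_count` gives between 31 and 91 interfaces.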
5.3.6 Coding
Sometimes, a beginning undergraduate student feels that coding is the most
important part of developing software. Actually, it is only one of the sixteen phases given in Table 5.1. Previous studies [Shooman, 1983, Table 5.1] have shown that coding constitutes perhaps 20% of the total development effort. The preceding phases of design (“start of project” through the “final design”) entail about 40% of the development effort; the remaining phases, starting with the unit (module) test, are another
common languages at the present are C++ and Ada.
5.3.7 Testing
Testing is a complex process, and the exact nature of it depends on the design
philosophy and the phase of the project. If the design has progressed under a
top–down structured approach, it will be much like that outlined in Table 5.1.
If the modern OOP techniques are employed, there may be more testing of
interfaces, objects, and other structures within the OOP philosophy. If proof of
program correctness is employed, there will be many additional layers added to
the design process involving the writing of proofs to ensure that the design will
satisfy a mathematical representation of the program logic. These additional
phases of design may replace some of the testing phases.
Assuming the top–down structured approach, the first step in testing the
code is to perform unit (module) testing. In general, the first module to be
written should be the main control structure of the program that contains the
highest interface levels. This main program structure is coded and tested first.
Since no additional code is generally present, sometimes “dummy” modules,
called test stubs, are used to test the interfaces. If legacy code modules are
available for use, clearly they can serve to test the interfaces. If a prototype
is to be constructed first, it is possible that the main control structure will be
designed well enough to be reused largely intact in the final version.
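The use of test stubs to exercise a main control structure before the real modules exist can be sketched as follows. The module breakdown loosely follows the block names of Fig. 5.1, but the function names, the interfaces, and the canned values are all assumptions made for this illustration.

```python
def run_suspension_design(read_input, solve_roots, classify_roots, plot_roots):
    """Main control structure: calls each module only through its interface."""
    coefficients = read_input()
    roots = solve_roots(coefficients)
    labels = classify_roots(roots)
    plot_roots(roots, labels)
    return labels


# Test stubs: dummy modules that return canned values so the control
# structure and its interfaces can be tested before the real code is written.
def stub_read_input():
    return (1.0, 6.0, 11.0, 6.0)          # canned A, B, C, D


def stub_solve_roots(coefficients):
    assert len(coefficients) == 4          # checks the interface, not the math
    return [-1.0, -2.0, -3.0]              # canned roots for the canned input


def stub_classify_roots(roots):
    return ["real"] * len(roots)


plot_calls = []


def stub_plot_roots(roots, labels):
    plot_calls.append((tuple(roots), tuple(labels)))   # record the call only
```

Calling `run_suspension_design` with the four stubs exercises the sequence of calls and the data passed across each interface. As each real module is completed, it replaces its stub and the same test is repeated.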
Each functional unit of code is subjected to a test, called unit or module
are constructed by examining the code. Such tests are often called white box
or clear box tests (the reason for these names will soon be explained).
The system test follows the integration test. During the system test, a sce-
nario is written encompassing an entire operational task that the software must
perform. For example, in the case of air traffic control software, one might
write a scenario that replicates aircraft arrivals and departures at Chicago’s
O’Hare Airport during a slow period, say, between 11 and 12 P.M. This would
involve radar signals as inputs, the main computer and software for the sys-
tem, and one or more display processors. In some cases, the radar would not
be present, but simulated signals would be fed to the computer. (Anyone who
has seen the physical size of a large, modern radar can well appreciate why
the radar is not physically present, unless the system test is performed at an
air traffic control center, which is unlikely.) The display system is a “desk-
size” console likely to be present during the system test. As the system test
progresses, the software gradually approaches the time of release when it can
be placed into operation. Because most system tests are written based on the
requirements and specifications, they do not depend on the nature of the code;
they are as if the code were hidden from view in an opaque or black box.
Hence such functional tests are often called black box tests.
On large projects (and sometimes on smaller ones), the last phase of testing
is acceptance testing. This is generally written into the contract by the cus-
course, one wonders how independent such an in-house group can be if it and
the developers both work for the same boss.
The term regression testing is often used, describing the need to retest the
software with the previous test cases after each new error is corrected. In the-
ory, one must repeat all the tests; however, a selected subset is generally used
in the retest. Each project requires a test plan to be written early in the develop-
ment cycle in parallel with or immediately following the completion of speci-
fications. The test plan documents the tests to be performed, organizes the test
cases by phase, and contains the expected outputs for the test cases. Generally,
testing costs and schedules are also included.
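A test plan that stores expected outputs makes regression testing mechanical, as the following sketch suggests. The function under test, the test cases, and the deliberately faulty "corrected" version are all hypothetical and serve only to show the retest step.

```python
def classify_roots(a, b, c):
    """Classify the roots of a*x**2 + b*x + c = 0 (a is assumed nonzero)."""
    disc = b * b - 4 * a * c
    if disc > 0:
        return "two real"
    if disc == 0:
        return "repeated real"
    return "complex pair"


def classify_roots_after_fix(a, b, c):
    """A later revision in which a correction introduced a new error."""
    disc = b * b - 4 * a * c
    if disc >= 0:                  # new error: the repeated-root case is lost
        return "two real"
    return "complex pair"


# Each entry of the test plan: (case name, inputs, expected output).
TEST_PLAN = [
    ("distinct roots", (1, 5, 6), "two real"),
    ("repeated root", (1, 2, 1), "repeated real"),
    ("imaginary roots", (1, 0, 1), "complex pair"),
]


def run_regression(function, plan):
    """Rerun every stored case and return the names of those that fail."""
    return [name for name, args, expected in plan if function(*args) != expected]
```

Rerunning the stored cases after each correction shows immediately that the revised function no longer passes a case that the earlier version handled.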
When a commercial software company is developing a product for sale to
the general business and home community, the later phases of testing are often
somewhat different, for which the terms alpha testing and beta testing are often
used. Alpha testing means that a test group within the company evaluates the
software before release, whereas beta testing means that a number of “selected
customers” with whom the developer works are given early releases of the
software to help test and debug it. Some people feel that beta testing is just a
way of reducing the cost of software development and that it is not a thorough
way of testing the software, whereas others feel that the company still does
adequate testing and that this is just a way of getting a lot of extra field testing
done in a short period of time at little additional cost.
During early field deployment, additional errors are found, since the actual
operating environment has features or inputs that cannot be simulated. Gener-
ally, the developer is responsible for fixing the errors during early field deploy-
ment. This responsibility is an incentive for the developer to do a thorough
job of testing before the software is released because fixing errors after it is
released could cost
25
.2. This figure does not include a prototype phase; if this is added to the development cycle, the diagram shown in Fig. 5.3 ensues. In actual practice,
portions of the system are sometimes developed and tested before the remain-
ing portions. The term software build is used to describe this process; thus
one speaks of build 4 being completed and integrated into the existing system composed of builds 1–3. A diagram describing this build process, called the incremental model of software development, is given in Fig. 5.4. Other related models of software development are given in Schach [1999].
Now that the general features of the development process have been
described, we are ready to introduce software reliability models related to the
software development process.
5
[Figure 5.2 block labels recovered from the diagram: Implementation Phase; Integration Phase; Operations Mode; Retirement; Verify; Test; Development; Maintenance. Diagram title: SOFTWARE LIFE-CYCLE DEVELOPMENT MODELS (WATERFALL MODEL).]
Figure 5.2 Diagram of the waterfall model of software development.
5.4.2 Reliability as a Probability of Success
The reliability of a system (hardware, software, human, or a combination
f ≤ t) (5.2)
[Figure: software life-cycle development model with a rapid prototype. Block labels recovered from the diagram: Rapid Prototype; Changed Requirements; Specification Phase; Design Phase; Implementation Phase; Integration Phase; Operations Mode; Retirement; Verify; Test.]
dt, and the distribution function is the integral of the density function, F(t)
[Figure 5.4 block labels recovered from the diagram: Requirements Phase; Specification Phase; Architectural Design; Verify; Operations Mode; Retirement; Development; Maintenance. Diagram title: SOFTWARE LIFE-CYCLE DEVELOPMENT MODELS (INCREMENTAL MODEL WITH BUILDS). Diagram note: For each build, perform a detailed design, implementation, and integration. Test; then deliver to client.]
Figure 5.4 Diagram of the incremental model of software development.
5.3) states the simple relationships among R(t), F(t), and f(t); given any one of the functions, the other two are easy to calculate.
5.4.3 Failure-Rate (Hazard) Function
Equation (5.3) expresses reliability in terms of the traditional mathematical
probability functions, F(t), and f (t); however, reliability engineers have found
these functions to be generally ill-suited for study if we want intuition, fail-
ure data interpretation, and mathematics to agree. Intuition suggests that we
study another function—a conditional probability function called the failure
rate (hazard), z(t). The following analysis develops an expression for the reli-
ability in terms of z(t) and relates z(t) to f (t) and F(t).
The probability density function can be interpreted from the following rela-
tionship:
$P(t < t_f < t + dt) = f(t)\,dt$ (5.5)
However, we can also write Eq. (5.4) as
$f(t)\,dt = P(\text{no failure in interval } 0 \text{ to } t) \times P(\text{failure in interval } dt \mid \text{no failure in interval } 0 \text{ to } t)$ (5.6a)
The last expression in Eq. (5.6a) is a conditional failure probability, and the
symbol
Substitution of Eq. (5.6b) into Eq. (5.5) along with the relationship $R(t) = n(t)/N$ yields
$\frac{n(t) - n(t + dt)}{N} = R(t)\,z(t)\,dt = \frac{n(t)}{N}\,z(t)\,dt$ (5.7)
Solving Eqs. (5.
.9), we see that f(t) reflects the rate of failure
based on the original number N placed on test, whereas z(t) gives the instan-
taneous rate of failure based on the number of survivors at the beginning of
the interval.
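The distinction between f(t) and z(t) can be seen numerically. The sketch below assumes the estimates implied by Eq. (5.7) and the sentence above: the failures in an interval are divided by N dt for f(t) and by n(t) dt for z(t). The survivor counts are hypothetical data chosen for illustration, and the function name is not from the source.

```python
def empirical_rates(survivors, n_total, dt):
    """Estimate f(t) and z(t) from survivor counts at equally spaced times.

    survivors[i] is n(t) at the start of interval i; each interval has length dt.
    Every interval is assumed to start with at least one survivor.
    """
    density = []
    hazard = []
    for n_now, n_next in zip(survivors, survivors[1:]):
        failures = n_now - n_next
        density.append(failures / (n_total * dt))   # based on the original N
        hazard.append(failures / (n_now * dt))      # based on current survivors
    return density, hazard


# Hypothetical life test: N = 1,000 items, survivors counted every 100 hours.
survivor_counts = [1000, 900, 810, 729]
f_values, z_values = empirical_rates(survivor_counts, n_total=1000, dt=100)
```

For these assumed data the estimated density falls from interval to interval (0.001, 0.0009, 0.00081 per hour) while the estimated hazard stays at 0.001 per hour, because the same fraction of the remaining survivors fails in every interval.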
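We can develop an equation for R(t) in terms of z(t) from Eq. (5.6b):
$z(t) = \frac{f(t)}{R(t)}$ (5.10)
and from Eq. (5.3), differentiation of both sides yields
$\frac{dR(t)}{dt} = -f(t)$
Substituting this result into Eq. (5.10) gives $z(t) = -\frac{1}{R(t)}\frac{dR(t)}{dt}$, and integrating both sides with respect to t yields
$\ln R(t) = -\int z(t)\,dt$ (5.13a)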
Eliminating the natural logarithmic function in this equation by exponentiating
both sides yields
$R(t) = e^{-\int z(t)\,dt}$ (5.13b)
which is the form of the reliability function that is used in the following model
development.
If one substitutes limits for the integral, a dummy variable, x, is required
inside the integral, and a constant of integration must be added, yielding
$R(t) = e^{-\int_0^t z(x)\,dx + A} = B\,e^{-\int_0^t z(x)\,dx}$ (5.13c)
where $B = e^A$. At $t = 0$ the reliability is $R(0) = 1$, and the integral from 0 to 0 is 0; thus $B = 1$ and Eq. (5.13c) becomes
$R(t) = e^{-\int_0^t z(x)\,dx}$ (5.13d)
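Equation (5.13d) can be evaluated numerically for any hazard function. The following sketch uses the trapezoidal rule for the integral; the function name, the step count, and the hazard used in the example are assumptions made for illustration.

```python
import math


def reliability_from_hazard(z, t, steps=10_000):
    """Evaluate R(t) = exp(-integral from 0 to t of z(x) dx), trapezoidal rule."""
    if t == 0:
        return 1.0                     # R(0) = 1, since the integral is zero
    h = t / steps
    area = 0.5 * (z(0.0) + z(t))
    for i in range(1, steps):
        area += z(i * h)
    return math.exp(-area * h)


# Assumed example: a constant hazard of 0.001 failures per hour.
r_1000 = reliability_from_hazard(lambda x: 0.001, 1000.0)
```

For the assumed constant hazard the integral is 0.001 × 1,000 = 1, so the result should agree with $e^{-1}$, which gives a simple check on the numerical routine.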
5.4.4 Mean Time To Failure
] yields a simpler expression:
$\text{MTTF} = E(t) = \int_0^\infty R(t)\,dt$ (5.15)
Sometimes, the mean time to failure is called mean time between failures (MTBF), and although there is a minor difference in their definitions, we will use the terms interchangeably.
5.4.5 Constant-Failure Rate
In general, a choice of the failure-rate function defines the reliability model.
Such a choice should be made based on past studies that include failure-rate
data or reasonable engineering assumptions. In several practical cases, the failure rate is constant in time, $z(t) = \lambda$, and the mathematics becomes quite simple. Substitution into Eqs. (5
$R(t) = e^{-\lambda t}$ (5.16)
$\text{MTTF} = \int_0^\infty e^{-\lambda t}\,dt = \frac{1}{\lambda}$ (5.17)
The result is particularly simple: the reliability function is a decreasing exponential function where the exponent is the negative of the failure rate λ. A
smaller failure rate means a slower exponential decay. Similarly, the MTTF is
just the reciprocal of the failure rate, and a small failure rate means a large
MTTF.
As an example, suppose that past life tests have shown that an item fails at a constant-failure rate. If 100 items are tested for 1,000 hours and 4 of these fail, then $\lambda = 4/(100 \times 1{,}000) = 4 \times 10^{-5}$ failures per hour.
case, substitution into Eq. (5.16) yields $R(5{,}000) = e^{-(4/100{,}000) \times 5{,}000} = e^{-0.2} = 0.8187$
$R(t) = e^{-\int_0^t kx\,dx} = e^{-kt^2/2}$ (5.18)
$\text{MTTF} = E(t) = \int_0^\infty e^{-kt^2/2}\,dt$
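The constant and linearly increasing failure-rate models can be compared numerically. The sketch below reuses the life-test numbers from the example (4 failures among 100 items in 1,000 hours). The closed form used for the linear-hazard MTTF, $\sqrt{\pi/(2k)}$, is the standard Gaussian integral and is supplied here because the right-hand side of that equation is not visible in the surviving text; the value of k and all function names are assumptions.

```python
import math


def reliability_constant(lam, t):
    """R(t) for a constant failure rate lam, Eq. (5.16)."""
    return math.exp(-lam * t)


def mttf_constant(lam):
    """MTTF for a constant failure rate, Eq. (5.17)."""
    return 1.0 / lam


def reliability_linear(k, t):
    """R(t) for a linearly increasing failure rate z(t) = k*t, Eq. (5.18)."""
    return math.exp(-k * t * t / 2.0)


def mttf_linear(k):
    """Closed form of the integral of exp(-k*t**2/2) from 0 to infinity."""
    return math.sqrt(math.pi / (2.0 * k))


def mttf_numeric(reliability, upper, steps=100_000):
    """Trapezoidal estimate of the integral of R(t) from 0 to upper."""
    h = upper / steps
    area = 0.5 * (reliability(0.0) + reliability(upper))
    for i in range(1, steps):
        area += reliability(i * h)
    return area * h


lam = 4 / (100 * 1000)                       # failures per hour, from the example
r_5000 = reliability_constant(lam, 5000)     # about 0.8187
mttf_hours = mttf_constant(lam)              # about 25,000 hours

k = 2e-8                                     # assumed value, for illustration only
mttf_check = mttf_numeric(lambda t: reliability_linear(k, t), upper=100_000.0)
```

With the assumed k, the numerical integral and the closed form agree closely, which supports the closed form as a reading of the truncated MTTF expression.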
a software-caused system failure only when necessary, and the slang expression “software bug” is commonly used in normal conversation to describe a software problem.
3
Software errors occur at many stages in the software life cycle. Errors may
occur in the requirements-and-specifications phase. For example, the specifi-
cations might state that the time inputs to the system from a precise cesium
atomic clock are in hours, minutes, and seconds when actually the clock out-
put is in hours and decimal fractions of an hour. Such an erroneous specifica-
tion might be found early in the development cycle, especially if a hardware
designer familiar with the cesium clock is included in the specification review.
It is also possible that such an error will not be found until a system test, when
the clock output is connected to the system. Errors in requirements and speci-
fications are identified as separate entities; however, they will be added to the
code faults in this chapter. If the range safety officer has to destroy a satellite
booster because it is veering off course, it matters little to him or her whether
the problem lies in the specifications or whether it is a coding error.
Errors occur in the program logic. For example, the THEN and ELSE
clauses in an IF THEN ELSE statement may be interchanged, creating an error,
or a loop is erroneously executed n − 1 times rather than the correct value, which is n times. When a program is coded, syntax errors are always present and are
caught by the compiler. Such syntax errors are too frequent, embarrassing, and
universal to be considered errors.
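The two logic errors mentioned above compile and run without complaint, which is why they survive into testing. The functions below are hypothetical examples written for this illustration; each faulty version is paired with its corrected form.

```python
def sum_first_n_faulty(values, n):
    """Intended to add the first n values, but the loop runs only n - 1 times."""
    total = 0
    for i in range(n - 1):          # logic error: should be range(n)
        total += values[i]
    return total


def sum_first_n(values, n):
    """Corrected version: the loop body is executed n times."""
    total = 0
    for i in range(n):
        total += values[i]
    return total


def clamp_negative_faulty(x):
    """Intended to replace negative inputs by 0; the two branches are swapped."""
    if x < 0:
        return x                    # logic error: THEN and ELSE interchanged
    else:
        return 0


def clamp_negative(x):
    """Corrected version: negative inputs become 0, others pass through."""
    if x < 0:
        return 0
    else:
        return x
```

Neither fault is a syntax error, so only a test case with a known expected output (for example, summing the first four values of 1, 2, 3, 4 and expecting 10) reveals it.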
Actually, design errors should be recorded once the program management
reviews and endorses a preliminary design expressed by a set of design repre-
sentations (H-diagrams, control graphs, and maybe other graphical or abbrevi-