www.it-ebooks.info
Praise for
A Developer’s Guide to Data Modeling
for SQL Server
“Eric and Joshua do an excellent job explaining the importance of data modeling and how
to do it correctly. Rather than relying only on academic concepts, they use real-world ex-
amples to illustrate the important concepts that many database and application develop-
ers tend to ignore. The writing style is conversational and accessible to both database
design novices and seasoned pros alike. Readers who are responsible for designing, imple-
menting, and managing databases will benefit greatly from Joshua’s and Eric’s expertise.”
—Anil Desai, Consultant, Anil Desai, Inc.
“Almost every IT project involves data storage of some kind, and for most that means a
relational database management system (RDBMS). This book is written for a database-
centric audience (database modelers, architects, designers, developers, etc.). The authors
do a great job of showing us how to take a project from its initial stages of requirements
gathering all the way through to implementation. Along the way we learn how to handle
some of the real-world design issues that typically surface as we go through the process.
“The bottom line here is simple. This is the book you want to have just finished read-
ing when your boss says ‘We have a new project I would like your help with.’”
—Ronald Landers, Technical Consultant, IT Professionals, Inc.
“The Data Model is the foundation of the application. I’m pleased to see additional books
being written to address this critical phase. This book presents a balanced and pragmatic
view with the right priorities to get your SQL server project off to a great start and a long
life.”
—Paul Nielsen, SQL Server MVP, SQLServerBible.com
“This is a truly excellent introduction to the database design methodology that will work
for both novices and advanced designers. The authors do a good job at explaining the ba-
sics of relational database modeling and how they fit into modern business architecture.
This book teaches us how to identify the business problems that have to be satisfied by a
database and then proceeds to explain how to build a solid solution from scratch.”
—Alexzander N. Nepomnjashiy, Microsoft SQL Server DBA,
ERVER
2005
AND
2008
Eric Johnson
Joshua Jones
Upper Saddle River, NJ • Boston • Indianapolis • San Francisco
New York • Toronto • Montreal • London • Munich • Paris • Madrid
Capetown • Sydney • Tokyo • Singapore • Mexico City
www.it-ebooks.info
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks.
Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations
have been printed with initial capital letters or in all capitals.
The authors and publisher have taken care in the preparation of this book, but make no expressed or implied war-
ranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or
consequential damages in connection with or arising out of the use of the information or programs contained herein.
The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales,
which may include electronic versions and/or custom covers and content particular to your business, training goals,
marketing focus, and branding interests. For more information, please contact:
U.S. Corporate and Government Sales
(800)382-3419
For sales outside the United States please contact:
International Sales
Visit us on the Web: informit.com/aw
Library of Congress Cataloging-in-Publication Data
Johnson, Eric, 1978–
A developer’s guide to data modeling for SQL server : covering SQL server
2005 and 2008 / Eric Johnson and Joshua Jones. — 1st ed.
Preface xv
Acknowledgments xvii
About the Authors xix
PART I Data Modeling Theory . . . . . . . . . . . . . . . . . . . . . . . 1
Chapter 1 Data Modeling Overview . . . . . . . . . . . . . . . . . . . . . . . . . 3
Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
Relational Database Management Systems. . . . . . . . . . . . . . . . . . . . . . . 5
Why a Sound Data Model Is Important . . . . . . . . . . . . . . . . . . . . . . . . 6
Data Consistency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
Scalability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
Meeting Business Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Easy Data Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
Performance Tuning. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
The Process of Data Modeling. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Modeling Theory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Business Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Building the Logical Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Building the Physical Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Chapter 2 Elements Used in Logical Data Models . . . . . . . . . . . . . . 23
Entities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Primary and Foreign Keys . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Domains. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Single-Valued and Multivalued Attributes . . . . . . . . . . . . . . . . . . . . . . . 32
Referential Integrity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
www.it-ebooks.info
Relationships. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Relationship Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Determining Normal Forms. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Denormalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
x Contents
www.it-ebooks.info
Contents xi
PART II Business Requirements . . . . . . . . . . . . . . . . . . . . . 95
Chapter 5 Requirements Gathering . . . . . . . . . . . . . . . . . . . . . . . . 97
Requirements Gathering Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Gathering Requirements Step by Step . . . . . . . . . . . . . . . . . . . . . . . . 98
Conducting Interviews . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
Observation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Previous Processes and Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
Business Needs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Balancing Technical Limitations with Business Needs . . . . . . . . . . . . . 112
Gathering Usage Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
Reads versus Writes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
Data Storage Requirements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
Transaction Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
Chapter 6 Interpreting Requirements . . . . . . . . . . . . . . . . . . . . . . 117
Mountain View Music . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
Compiling Requirements Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Identifying Useful Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
Identifying Superfluous Information . . . . . . . . . . . . . . . . . . . . . . . . . . 120
Determining Model Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 121
Interpreting User Interviews and Statements . . . . . . . . . . . . . . . . . . . . 121
Interpreting Flowcharts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
Interpreting Legacy Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
Domains. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
Attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
Chapter 8 Common Data Modeling Problems . . . . . . . . . . . . . . . . 171
Entity Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Too Few Entities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
Too Many Entities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Attribute Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
Single Attributes Contain Different Data . . . . . . . . . . . . . . . . . . . . . . . 176
Incorrect Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
Relationship Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
One-to-One Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Many-to-Many Relationships . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
xii Contents
www.it-ebooks.info
Contents xiii
PART IV Creating the Physical Model . . . . . . . . . . . . . . . . 187
Chapter 9 Creating the Physical Model with SQL Server . . . . . . . . 189
Naming Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
General Naming Guidelines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Naming Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
Naming Columns . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Naming Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Naming Stored Procedures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Naming User-Defined Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Naming Triggers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Naming Indexes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Naming User-Defined Data Types . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Naming Primary Keys and Foreign Keys . . . . . . . . . . . . . . . . . . . . . . 197
Chapter 11 Creating an Abstraction Layer in SQL Server . . . . . . . . 241
What Is an Abstraction Layer? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
Why Use an Abstraction Layer? . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
Extensibility and Flexibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
An Abstraction Layer’s Relationship to the Logical Model . . . . . . . . . . 245
An Abstraction Layer’s Relationship to Object-Oriented
Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
Implementing an Abstraction Layer . . . . . . . . . . . . . . . . . . . . . . . . . 247
Views. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248
Stored Procedures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Other Components of an Abstraction Layer . . . . . . . . . . . . . . . . . . . . 254
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
Appendix A Sample Logical Model . . . . . . . . . . . . . . . . . . . . . . . . . 255
Appendix B Sample Physical Model . . . . . . . . . . . . . . . . . . . . . . . . 261
Appendix C SQL Server 2008 Reserved Words . . . . . . . . . . . . . . . . 267
Appendix D Recommended Naming Standards . . . . . . . . . . . . . . . . 269
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
xiv Contents
www.it-ebooks.info
P
REFACE
As database professionals, we are frequently asked to come into existing
environments and “fix” existing databases. This is usually because of per-
formance problems that application developers and users have uncovered
over the lifetime of a given application. Inevitably, the expectation is that
we can work some magic database voodoo and the performance problems
will go away. Unfortunately, as most of you already know, the problem
often lies within the design of the database. We often spend hours in meet-
ings trying to justify the cost of redesigning an entire database in order to
Server would be helpful for those people redesigning an existing database
to be migrated from another platform to SQL Server.
We’ve all read that software design, and relational database design in
particular, should be platform agnostic. We do not necessarily disagree
with that outlook. However, it is important to understand which RDBMS
will be hosting your design, because that can affect the capabilities you can
plan for and the weaknesses you may need to account for in your design.
Additionally, with the introduction of SQL Server 2005, Microsoft has im-
plemented quite a bit of technology that extends the capabilities of SQL
Server beyond simple database hosting. Although we don’t cover every
piece of extended functionality (otherwise, you would need a crane to carry
this book), we reference it where appropriate to give you the opportunity
to learn how this functionality can help you.
Within the pages of this book, we hope you’ll find everything you need
to help you through the entire design and development process—every-
thing from talking to users, designing use cases, and developing your data
model to implementing that model and ensuring it has solid performance
characteristics. When possible, we’ve provided examples that we hope will
be useful and applicable to you in one way or another. After spending
hours developing the background and requirements for our fictitious com-
pany, we have been thinking about starting our own music business. And
let’s face it—reading line after line of text about the various uses for a var-
char data type can’t always be thrilling, so we’ve tried to add some anec-
dotes, a few jokes, and even a paraphrased movie quote or two to keep it
lively.
Writing this book has also been an adventure for both of us, in learn-
ing how the publishing process works, learning the finer details of writing
for a mass audience, and learning that even though we are our own worst
critics, it’s hard to hear criticism from your friends, even if they’re right;
but you’re always glad that they are.
BOUT THE
A
UTHORS
Eric Johnson (Microsoft SQL MVP) is the co-founder of Consortio
Services and the primary database technologies consultant. His back-
ground in information technology is diverse, ranging from operating sys-
tems and hardware to specialized applications and development. He has
even done his fair share of work on networks. Because IT is a way to sup-
port business processes, Eric has also acquired an MBA. All in all, he has
ten years of experience with IT, much of it working with Microsoft SQL
Server. Eric has managed and designed databases of all shapes and sizes.
He has delivered numerous SQL Server training classes and Webcasts as
well as presentations at national technology conferences. Most recently, he
presented at TechMentor on SQL Server 2005 replication, reporting ser-
vices, and integration services. In addition, he is active in the local SQL
Server community, serving as the president of the Colorado Springs SQL
Server Users Group. He is also the co-host of CS Techcast, a weekly pod-
cast for IT professionals at www.cstechcast.com. You can find Eric’s blog at
www.consortioservices.com/blog.
Joshua Jones (MCTS, SQL Server 2005; MCITP, Database Adminis-
trator) is operating systems and database systems consultant with Consortio
Services in Colorado Springs. There he provides training, administration,
analysis, and design support for customers using SQL Server 2000 and
2005. In his seven years as an IT professional, he has worked in many areas
of information technology, including Windows desktop support, Windows
2000 and 2003 server infrastructure design and support (AD, DNS, MS
Exchange), telephony switch support, and network support. Josh has spoken
at various PASS sponsored events about SQL Server topics such as 64-bit
SQL Server implementation, reporting services administration, and per-
formance tuning. He is also a co-host of CS Techcast, a weekly podcast for
mation to logical objects that can eventually be stored in a database. This
means that a data modeler must wear many hats to do the job effectively.
You not only must understand the process by which the model is built, but
you also must be a data detective. You must be good at asking questions
and finding out what is really important to your customer.
In data modeling, as in many areas of information technology, cus-
tomers know what they want, but they don’t always know what they need.
It’s your job to figure out what they need. Suppose you’re dealing with
Tom, a project manager for an appliance distribution company. Tom un-
derstands that his company orders refrigerators, dishwashers, and the like
from the manufacturers and then takes orders and sells those appliances to
its customers (retail stores). What Tom doesn’t know is how to take that in-
formation, model it, and ultimately store it in a database so that it can be
leveraged to help the company make decisions or control a process.
In addition to finding out what information your customer cares about
and getting it into a database, you must find out how the customer intends
to use the information. Is it for historical purposes, or will the company use
the data in its daily operations? Will it be used only to produce reports, or
will an application need to manipulate the data regularly? As if that weren’t
enough, you eventually have to think about turning your data model into a
physical database.
There are many choices on the market when it comes to database man-
agement products. These products are similar in that they allow you to
store, secure, and use information in databases; however, each product im-
plements features in its own way, so you must also make the best use of
3
www.it-ebooks.info
these features to provide a solution that best meets the needs of your
customer.
Our goal in this book is to give you the know-how and skills you need