Tài liệu Microsoft Office Access 2007 Data Analysis - Wiley 2007 - Pdf 95

Michael Alexander
Microsoft
®
Access™ 2007
Data Analysis
01_104859 ffirs.qxp 2/20/07 3:02 PM Page iii
Microsoft
®
Access™ 2007 Data Analysis
Published by
Wiley Publishing, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com
Copyright © 2007 by Wiley Publishing, Inc., Indianapolis, Indiana
Published simultaneously in Canada
ISBN: 978-0-470-10485-9
Manufactured in the United States of America
10 9 8 7 6 5 4 3 2 1
No part of this publication may be reproduced, stored in a retrieval system or transmitted
in any form or by any means, electronic, mechanical, photocopying, recording, scanning or
otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copy-
right Act, without either the prior written permission of the Publisher, or authorization
through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222
Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the
Publisher for permission should be addressed to the Legal Department, Wiley Publishing,
Inc., 10475 Crosspoint Blvd., Indianapolis, IN 46256, (317) 572-3447, fax (317) 572-4355, or
online at
/>Limit of Liability/Disclaimer of Warranty: The publisher and the author make no repre-
sentations or warranties with respect to the accuracy or completeness of the contents of this
work and specifically disclaim all warranties, including without limitation warranties of fit-

www.datapigtechnologies.com, where he shares basic
Access and Excel tips to the Office community.
About the Author
vii
01_104859 ffirs.qxp 2/20/07 3:02 PM Page vii
ix
Acquisitions Editor
Katie Mohr
Development Editor
Kelly Talbot
Technical Editor
Todd Meister
Production Editor
Angela Smith
Copy Editor
Travis Henderson
Editorial Manager
Mary Beth Wakefield
Production Manager
Tim Tate
Vice President and Executive
Group Publisher
Richard Swadley
Vice President and Executive
Publisher
Joseph B. Wikert
Project Coordinator
Patrick Redmond
Graphics and Production
Specialists

Reports 12
Macros and VBA 12
Summary 13
Chapter 2 Access Basics 15
Access Tables 15
Table Basics 16
Opening a Table in the Datasheet View 16
Identifying Important Table Elements 17
Contents
xi
02_104859 ftoc.qxp 2/17/07 12:49 AM Page xi
Opening a Table in the Design View 17
Exploring Data Types 19
Before Creating a Table 20
Creating a Table with Design View 21
Field Properties 24
Primary Key 26
Getting Data into Access 28
Importing 28
Linking 28
Things to Remember About Importing Data 28
Importing Data from an Excel Spreadsheet 29
Importing Data from a Text File 30
Understanding the Relational Database Concept 30
Why Is This Concept Important? 30
Excel and the Flat-File Format 31
Splitting Data into Separate Tables 33
Foreign Keys 34
Relationship Types 34
Referential Integrity 36

02_104859 ftoc.qxp 2/17/07 12:49 AM Page xii
Append Queries 68
Why Use an Append Query? 68
What Are the Hazards of Append Queries? 69
Creating an Append Query 71
Update Queries 74
Why Use an Update Query? 75
What Are the Hazards of Update Queries? 75
Creating an Update Query 75
A Word on Updatable Datasets 78
Crosstab Queries 78
Using the Crosstab Query Wizard 79
Creating a Crosstab Query Manually 84
Using the Query Design Grid to Create Your Crosstab
Query 85
Customizing Your Crosstab Queries 88
Summary 90
Part II Basic Analysis Techniques 93
Chapter 4 Transforming Your Data with Access 95
Finding and Removing Duplicate Records 96
Defining Duplicate Records 96
Finding Duplicate Records 97
Removing Duplicate Records 100
Common Transformation Tasks 102
Filling in Blank Fields 102
Concatenating 104
Concatenating Fields 104
Augmenting Field Values with Your Own Text 105
Changing Case 107
Removing Leading and Trailing Spaces from a String 109

Summary 146
Chapter 6 Performing Conditional Analysis 149
Using Parameter Queries 149
How Parameter Queries Work 151
Ground Rules of Parameter Queries 151
Working with Parameter Queries 152
Working with Multiple Parameter Conditions 152
Combining Parameters with Operators 153
Combining Parameters with Wildcards 154
Using Parameters as Calculation Variables 155
Using Parameters as Function Arguments 156
Using Conditional Functions 159
The IIf Function 159
Using IIf to Avoid Mathematical Errors 159
Using IIf to Save Time 161
Nesting IIf Functions for Multiple Conditions 163
Using IIf Functions to Create Crosstab Analyses 164
The Switch Function 166
Comparing the IIf and Switch Functions 167
Summary 169
Part III Advanced Analysis Techniques 171
Chapter 7 Understanding and Using SQL 173
Understanding Basic SQL 173
The SELECT Statement 175
Selecting Specific Columns 175
Selecting All Columns 176
The WHERE Clause 176
Making Sense of Joins 177
Inner Joins 177
Outer Joins 178

Enhancing Your Analysis with Subqueries 196
Why Use Subqueries? 197
Subquery Ground Rules 197
Creating Subqueries without Typing SQL Statements 198
Using IN and NOT IN with Subqueries 201
Using Subqueries with Comparison Operators 201
Using Subqueries as Expressions 202
Using Correlated Subqueries 203
Uncorrelated Subqueries 203
Correlated Subqueries 203
Using a Correlated Subquery as an Expression 205
Using Subqueries within Action Queries 205
A Subquery in a Make-Table Query 205
A Subquery in an Append Query 205
A Subquery in an Update Query 206
A Subquery in a Delete Query 206
Domain Aggregate Functions 208
Understanding the Different Domain Aggregate Functions 210
DSum 210
DAvg 210
DCount 211
DLookup 211
Contents xv
02_104859 ftoc.qxp 2/17/07 12:49 AM Page xv
DMin and DMax 211
DFirst and DLast 211
DStDev, DStDevP, DVar, and DvarP 211
Examining the Syntax of Domain Aggregate Functions 212
Using No Criteria 212
Using Text Criteria 212

Sending Your Access Pivot Table to Excel 253
Pivot Table Options 254
Expanding and Collapsing Fields 255
Changing Field Captions 255
Sorting Data 256
Grouping Data 256
Using Date Groupings 259
Filtering for Top and Bottom Records 260
Adding a Calculated Total 261
xvi Contents
02_104859 ftoc.qxp 2/17/07 12:49 AM Page xvi
Working with Pivot Charts in Access 265
The Data Area 265
The Series Area 265
The Category Area 266
The Filter Area 267
Creating a Basic Pivot Chart 268
Formatting Your Pivot Chart 269
Summary 272
Part IV Automating Data Analysis 353
Chapter 11 Scheduling and Running Batch Analysis 275
Introduction to Access Macros 276
Dealing with Access 2007 Security Features 277
The Quick Fix 278
The Long-Term Fix 279
Creating Your First Macro 280
Essential Macro Actions 282
Manipulating Forms, Queries, Reports, and Tables 283
The Access Environment 283
Executing Processes 284

The Basics of the RunSQL Method 330
Using RunSQL Statements 331
Advanced Techniques Using RunSQL Statements 332
Suppressing Warning Messages 332
Passing a SQL Statement as a Variable 332
Passing User-Defined Parameters from a Form to Your
SQL Statement 333
Summary 337
Chapter 13 Query Performance, Database Corruption, and Other
Final Thoughts 339
Optimizing Query Performance 339
Understanding the Access Query Optimizer 339
Steps You Can Take to Optimize Query Performance 340
Normalizing Your Database Design 340
Using Indexes on Appropriate Fields 341
Optimizing by Improving Query Design 342
Compacting and Repairing Your Database Regularly 343
Handling Database Corruption 344
Signs and Symptoms of a Corrupted Database 344
Watching for Corruption in Seemingly Normal Databases 344
Common Errors Associated with Database Corruption 345
Recovering a Corrupted Database 347
Steps You Can Take to Prevent Database Corruption 348
Backing Up Your Database on a Regular Basis 348
Compacting and Repairing Your Database on a Regular
Basis 348
Avoiding Interruption of Service while Writing to Your
Database 349
Never Working with a Database from Removable Media 350
Getting Help in Access 350

up of more than just calculating and summarizing data.
A more representative definition of data analysis is the process of sys-
tematically collecting, transforming, and analyzing data in order to present
meaningful conclusions. To better understand this concept, think of data
analysis as a process that encapsulates four fundamental actions: collec-
tion, transformation, analysis, and presentation.
■■
Collection: Collection encompasses the gathering and storing of
data—that is, where you obtain your data, how you will receive
your data, how you will store your data, and how you will access
your data when it comes time to perform some analysis.
■■
Transformation: Transformation is the process of ensuring your
data is uniform in structure, free from redundancy, and stable. This
Introduction
xxi
03_104859 flast.qxp 2/17/07 12:49 AM Page xxi
generally entails things like establishing a table structure, cleaning
text, removing blanks, and standardizing data fields.
■■
Analysis: Analysis is the investigation of the component parts of
your data and their relationships to your data source as a whole. You
are analyzing your data when you are calculating, summarizing, cat-
egorizing, comparing, contrasting, examining, or testing your data.
■■
Presentation: In the context of data analysis, presentation deals with
how you make the content of your analysis available to a certain
audience. That is, how you choose to display your results. Some
considerations that go along with presentation of your analysis
include the platform you will use, the levels of visibility you will

the book, you will be able to create your own custom functions, perform
batch analysis, and develop automated procedures that essentially run on
their own.
After completing this book, you will be able to analyze large amounts of
data in a meaningful way, quickly slice data into various views on the fly,
automate redundant analysis, save time, and increase productivity.
What to Not Expect from This Book
It’s important to note that there are aspects of Access and data analysis that
are outside the scope of this book. While this book does cover the funda-
mentals of Access, it is always in the light of data analysis and it is written
from a data analyst’s point of view. This is not meant to be an all-encom-
passing book on Access. That being said, if you are a first-time user of
Access, you can feel confident that this book will provide you with a solid
introduction to Access that will leave you with valuable skills you can use
in your daily operations.
This book is not meant to be a book on data management theory and best
practices. Nor is it meant to expound on high-level business intelligence
concepts. This is more of a “technician’s” book, providing hands-on
instruction that introduces Access as an analytical tool that can provide
powerful solutions to common analytical scenarios and issues.
Finally, while this book does contain a chapter that demonstrates various
techniques to perform a whole range of statistical analysis, it is important
to note that this book does not cover statistics theory, methodology, or best
practices.
Skills Required for This Book
In order to get the most out of this book, it’s best that you have certain skills
before diving into the topics highlighted in this book. The ideal candidate
for this book will have:
■■
Some experience working with data and familiarity with the basic

SQL statements. Chapter 8 picks up from there and introduces you to sub-
queries and domain aggregate functions. Chapter 9 demonstrates many of
the advanced statistical analysis you can perform using subqueries and
domain aggregate functions. Chapter 10 provides you with an in-depth
look at using PivotTables and PivotCharts in Access.
Part IV: Automating Data Analysis
Part IV takes you beyond manual analysis with queries and introduces you
to the world of automation. Chapter 11 gives you an in-depth view of how
macros can help increase you productivity by running batch analysis.
Chapter 12 demonstrates how a little coding with Visual Basic for Applica-
tions (VBA) can help enhance data analysis. Chapter 13 offers some final
thoughts and tips on query performance, database corruption, and how to
get help in Access.
xxiv Introduction
03_104859 flast.qxp 2/17/07 12:49 AM Page xxiv
Part V: Appendixes
Part V includes useful reference materials that will assist you in your
everyday dealings with Access. Appendix A details many of the built-in
Access functions that are available to data analysts. Appendix B provides a
high-level overview of VBA for those users who are new to the world of
Access programming. Appendix C highlights and explains many of the
Access error codes you may encounter while analyzing your data.
Companion Database
The examples demonstrated throughout this book can be found in the
companion database. This sample database is located at www.wiley.com/
go/access2007dataanalysis.
Introduction xxv
03_104859 flast.qxp 2/17/07 12:49 AM Page xxv
Microsoft
®

in Access
CHAPTER
1
05_104859 ch01.qxp 2/17/07 12:50 AM Page 3
fact: their analytical needs had outgrown Excel. They all met with funda-
mental issues that stemmed from one or more of Excel’s three problem
areas: scalability, transparency of analytical processes, and separation of
data and presentation.
Scalability
Scalability is the ability for an application to develop flexibly to meet
growth and complexity requirements. In the context of this chapter, scala-
bility refers to the ability of Excel to handle ever-increasing volumes of
data. Most Excel aficionados will be quick to point out that as of Excel 2007,
you can place 1,048,576 rows of data into a single Excel worksheet. This is
an overwhelming increase from the limitation of 65,536 rows imposed by
previous versions of Excel. However, this increase in capacity does not
solve all of the scalability issues that inundate Excel.
Imagine that you are working in a small company and you are using
Excel to analyze your daily transactions. As time goes on, you build a
robust process complete with all the formulas, pivot tables, and macros
you need to analyze the data that is stored in your neatly maintained work-
sheet.
As your data grows, you will first notice performance issues. Your
spreadsheet will become slow to load and then slow to calculate. Why will
this happen? It has to do with the way Excel handles memory. When an
Excel file is loaded, the entire file is loaded into RAM. Excel does this to
allow for quick data processing and access. The drawback to this behavior
is that each time something changes in your spreadsheet, Excel has to
reload the entire spreadsheet into RAM. The net result in a large spread-
sheet is that it takes a great deal of RAM to process even the smallest

vast quantities of data without being reliant on others to provide data?
Could that analyst be more valuable to the organization? Could that ana-
lyst focus on the accuracy of the analysis and the quality of the presenta-
tion instead of routing Excel data maintenance?
Access is an excellent, many would say logical, next step for the analyst
who faces an ever-increasing data pool. Since an Access table takes very
few performance hits with larger datasets and has no predetermined row
limitations, an analyst will be able to handle larger datasets without requir-
ing the data to be summarized or prepared to fit into Excel. Since many
tasks can be duplicated in both Excel and Access, an analyst who is profi-
cient at both will be prepared for any situation. The alternative is telling
everyone, “Sorry, it is not in Excel.”
Another important advantage of using Access is that if ever a process
that is currently being tracked in Excel becomes more crucial to the organi-
zation and needs to be tracked in a more enterprise-acceptable environ-
ment, it will be easier to upgrade and scale up if it is already in Access.
NOTE An Access table is limited to 256 columns but has no row limitation.
This is not to say that Access has unlimited data storage capabilities. Every bit
of data causes the Access database to grow in file size. An Access database has
a file size limitation of 2 gigabytes. In comparison, Excel 2007 has a limit of
1,048,576 rows and 16,384 columns regardless of file size.
Chapter 1

The Case for Data Analysis in Access 5
05_104859 ch01.qxp 2/17/07 12:50 AM Page 5
Transparency of Analytical Processes
One of Excel’s most attractive features is its flexibility. Each individual cell
can contain text, a number, a formula, or practically anything else the user
defines. Indeed, this is one of the fundamental reasons Excel is such an
effective tool for data analysis. Users can use named ranges, formulas, and

being edited, a number is being calculated, or any portion of the dataset is
being affected as a part of an analytical process, you will readily see that
action. This is not to say that users can’t do foolish and confusing things in
Access. However, you definitely will not encounter hidden steps in an ana-
lytical process such as hidden formulas, hidden cells, or named ranges in
dead worksheets.
Separation of Data and Presentation
Data should be separate from presentation; you do not want the data to
become too tied into any one particular way of presenting it. For example,
when you receive an invoice from a company, you don’t assume that the
financial data on that invoice is the true source of your data. It is a presen-
tation of your data. It can be presented to you in other manners and styles
on charts or on web sites, but such representations are never the actual
source of the data. This sounds obvious, but it becomes an important dis-
tinction when you study an approach of using Access and Excel together
for data analysis.
What exactly does this concept have to do with Excel? People who per-
form data analysis with Excel, more often than not, tend to fuse the data,
the analysis, and the presentation together. For example, you will often see
an Excel Workbook that has 12 worksheets, each representing a month. On
each worksheet, data for that month is listed along with formulas, pivot
tables, and summaries. What happens when you are asked to provide a
summary by quarter? Do you add more formulas and worksheets to con-
solidate the data on each of the month worksheets? The fundamental prob-
lem in this scenario is that the worksheets actually represent data values
that are fused into the presentation of your analysis. The point being made
here is that data should not be tied to a particular presentation, no matter
how apparently logical or useful it may be. However, in Excel, it happens
all the time.
Chapter 1


Nhờ tải bản gốc
Music ♫

Copyright: Tài liệu đại học © DMCA.com Protection Status