Tài liệu Data Warehousing Guide - Pdf 91

Oracle9
i
Data Warehousing Guide
Release 1 (9.0.1)
June 2001
Part No. A90237-01
Oracle9i Data Warehousing Guide, Release 1 (9.0.1)
Part No. A90237-01
Copyright © 2001 Oracle Corporation. All rights reserved.
Primary Author: Paul Lane
Contributing Author: Viv Schupmann (Change Data Capture)
Contributors: Patrick Amor, Hermann Baer, Srikanth Bellamkonda, Randy Bello, Tolga Bozkaya, Benoit
Dageville, John Haydu, Lilian Hobbs, Hakan Jakobsson, George Lumpkin, Jack Raitto, Ray Roccaforte,
Gregory Smith, Ashish Thusoo, Jean-Francois Verrier, Gary Vincent, Andy Witkowski, Zia Ziauddin
Graphic Designer: Valarie Moore
The Programs (which include both the software and documentation) contain proprietary information of
Oracle Corporation; they are provided under a license agreement containing restrictions on use and
disclosure and are also protected by copyright, patent, and other intellectual and industrial property
laws. Reverse engineering, disassembly, or decompilation of the Programs is prohibited.
The information contained in this document is subject to change without notice. If you find any problems
in the documentation, please report them to us in writing. Oracle Corporation does not warrant that this
document is error free. Except as may be expressly permitted in your license agreement for these
Programs, no part of these Programs may be reproduced or transmitted in any form or by any means,
electronic or mechanical, for any purpose, without the express written permission of Oracle Corporation.
If the Programs are delivered to the U.S. Government or anyone licensing or using the programs on
behalf of the U.S. Government, the following notice is applicable:
Restricted Rights Notice Programs delivered subject to the DOD FAR Supplement are "commercial
computer software" and use, duplication, and disclosure of the Programs, including documentation,
shall be subject to the licensing restrictions set forth in the applicable Oracle license agreement.
Otherwise, Programs delivered subject to the Federal Acquisition Regulations are "restricted computer
software" and use, duplication, and disclosure of the Programs shall be subject to the restrictions in FAR

Data Warehouse Architecture (with a Staging Area).............................................................. 1-6
Data Warehouse Architecture (with a Staging Area and Data Marts) ................................. 1-7
Part II Logical Design
2 Logical Design in Data Warehouses
Logical versus Physical Design in Data Warehouses .................................................................. 2-2
Creating a Logical Design ................................................................................................................. 2-2
Data Warehousing Schemas ............................................................................................................. 2-3
iv
Star Schemas.................................................................................................................................. 2-4
Other Schemas............................................................................................................................... 2-4
Data Warehousing Objects................................................................................................................ 2-5
Fact Tables...................................................................................................................................... 2-5
Dimension Tables ......................................................................................................................... 2-6
Unique Identifiers......................................................................................................................... 2-7
Relationships ................................................................................................................................. 2-8
Typical Example of Data Warehousing Objects and Their Relationships............................ 2-8
Part III Physical Design
3 Physical Design in Data Warehouses
Moving from Logical to Physical Design....................................................................................... 3-2
Physical Design................................................................................................................................... 3-2
Physical Design Structures.......................................................................................................... 3-4
Tablespaces.................................................................................................................................... 3-4
Tables and Partitioned Tables..................................................................................................... 3-5
Views .............................................................................................................................................. 3-5
Integrity Constraints .................................................................................................................... 3-5
Indexes and Partitioned Indexes ................................................................................................ 3-6
Materialized Views....................................................................................................................... 3-6
Dimensions .................................................................................................................................... 3-6
4 Hardware and I/O Considerations in Data Warehouses
Overview of Hardware and I/O Considerations in Data Warehouses ..................................... 4-2

Overview of Constraint States ......................................................................................................... 7-3
Typical Data Warehouse Integrity Constraints............................................................................. 7-4
UNIQUE Constraints in a Data Warehouse ............................................................................. 7-4
FOREIGN KEY Constraints in a Data Warehouse................................................................... 7-5
RELY Constraints ......................................................................................................................... 7-6
Integrity Constraints and Parallelism........................................................................................ 7-7
Integrity Constraints and Partitioning ...................................................................................... 7-7
View Constraints .......................................................................................................................... 7-7
8 Materialized Views
Overview of Data Warehousing with Materialized Views......................................................... 8-2
vi
Materialized Views for Data Warehouses................................................................................. 8-2
Materialized Views for Distributed Computing ...................................................................... 8-3
Materialized Views for Mobile Computing.............................................................................. 8-3
The Need for Materialized Views .............................................................................................. 8-3
Components of Summary Management ................................................................................... 8-5
Terminology .................................................................................................................................. 8-7
Schema Design Guidelines for Materialized Views ................................................................ 8-8
Types of Materialized Views .......................................................................................................... 8-10
Materialized Views with Aggregates....................................................................................... 8-10
Materialized Views Containing Only Joins ............................................................................ 8-16
Nested Materialized Views ....................................................................................................... 8-18
Creating Materialized Views .......................................................................................................... 8-22
Naming......................................................................................................................................... 8-23
Storage Characteristics............................................................................................................... 8-23
Build Methods............................................................................................................................. 8-24
Enabling Query Rewrite ............................................................................................................ 8-24
Query Rewrite Restrictions ....................................................................................................... 8-25
Refresh Options........................................................................................................................... 8-26
ORDER BY Clause ...................................................................................................................... 8-30

Validating Dimensions.................................................................................................................... 9-12
Altering Dimensions........................................................................................................................ 9-13
Deleting Dimensions ....................................................................................................................... 9-14
Part IV Managing the Warehouse Environment
10 Overview of Extraction, Transformation, and Loading
Overview of ETL............................................................................................................................... 10-2
ETL Tools............................................................................................................................................ 10-3
Daily Operations......................................................................................................................... 10-4
Evolution of the Data Warehouse ............................................................................................ 10-4
11 Extraction in Data Warehouses
Overview of Extraction in Data Warehouses............................................................................... 11-2
Understanding Extraction Methods in Data Warehouses......................................................... 11-2
Logical Extraction Methods ...................................................................................................... 11-3
Physical Extraction Methods..................................................................................................... 11-4
Change Data Capture................................................................................................................. 11-5
Data Warehousing Extraction Examples ...................................................................................... 11-8
Extraction Using Data Files....................................................................................................... 11-8
viii
Extraction Via Distributed Operations .................................................................................. 11-11
12 Transportation in Data Warehouses
Overview of Transportation in Data Warehouses ...................................................................... 12-2
Understanding Transportation Mechanisms in Data Warehouses.......................................... 12-2
Transportation Using Flat Files ................................................................................................ 12-2
Transportation Through Distributed Operations .................................................................. 12-2
Transportation Using Transportable Tablespaces ................................................................. 12-3
13 Loading and Transformation
Overview of Loading and Transformation in Data Warehouses ............................................. 13-2
Transformation Flow.................................................................................................................. 13-2
Loading Mechanisms ....................................................................................................................... 13-4
SQL*Loader ................................................................................................................................. 13-5

Monitoring a Refresh ............................................................................................................... 14-15
Checking the Status of a Materialized View......................................................................... 14-16
Tips for Refreshing Materialized Views with Aggregates ................................................. 14-16
Tips for Refreshing Materialized Views Without Aggregates........................................... 14-19
Tips for Refreshing Nested Materialized Views .................................................................. 14-20
Tips After Refreshing Materialized Views............................................................................ 14-21
Using Materialized Views With Partitioned Tables................................................................. 14-22
Fast Refresh with Partition Change Tracking ...................................................................... 14-22
Fast Refresh with CONSIDER FRESH................................................................................... 14-26
15 Change Data Capture
About Oracle Change Data Capture ............................................................................................. 15-2
Publish and Subscribe Model ................................................................................................... 15-3
Example of a Change Data Capture System........................................................................... 15-4
Components and Terminology for Synchronous Change Data Capture........................... 15-5
Installation and Implementation................................................................................................... 15-8
Security ............................................................................................................................................... 15-8
Columns in a Change Table............................................................................................................ 15-8
Views................................................................................................................................................. 15-10
Synchronous Mode of Data Capture........................................................................................... 15-11
Publishing Change Data ............................................................................................................... 15-11
Subscribing to Change Data......................................................................................................... 15-13
Steps Required to Subscribe to Change Data ....................................................................... 15-13
What Happens to Subscriptions When the Publisher Makes Changes............................ 15-16
Export and Import Considerations.............................................................................................. 15-17
x
16 Summary Advisor
Overview of the Summary Advisor in the DBMS_OLAP Package ........................................ 16-2
Summary Advisor Wizard ........................................................................................................ 16-6
Using the Summary Advisor .......................................................................................................... 16-6
Identifier Numbers ..................................................................................................................... 16-7

An Aggregate Scenario .............................................................................................................. 18-5
Interpreting NULLs in Examples............................................................................................. 18-6
ROLLUP Extension to GROUP BY................................................................................................ 18-7
When to Use ROLLUP ............................................................................................................... 18-7
ROLLUP Syntax.......................................................................................................................... 18-7
Partial Rollup .............................................................................................................................. 18-8
CUBE Extension to GROUP BY ................................................................................................... 18-10
When to Use CUBE .................................................................................................................. 18-10
CUBE Syntax ............................................................................................................................. 18-10
Partial CUBE.............................................................................................................................. 18-12
Calculating Subtotals without CUBE .................................................................................... 18-13
GROUPING Functions.................................................................................................................. 18-13
GROUPING Function .............................................................................................................. 18-13
When to Use GROUPING ....................................................................................................... 18-16
GROUPING_ID Function........................................................................................................ 18-17
GROUP_ID Function................................................................................................................ 18-18
GROUPING SETS Expression..................................................................................................... 18-19
Composite Columns....................................................................................................................... 18-21
Concatenated Groupings............................................................................................................... 18-24
Concatenated Groupings and Hierarchical Data Cubes..................................................... 18-26
Considerations when Using Aggregation.................................................................................. 18-28
Hierarchy Handling in ROLLUP and CUBE........................................................................ 18-28
Column Capacity in ROLLUP and CUBE............................................................................. 18-29
HAVING Clause Used with GROUP BY Extensions .......................................................... 18-29
ORDER BY Clause Used with GROUP BY Extensions....................................................... 18-30
Using Other Aggregate Functions with ROLLUP and CUBE ........................................... 18-30
Computation Using the WITH Clause ....................................................................................... 18-30
19 SQL for Analysis in Data Warehouses
Overview of SQL for Analysis in Data Warehouses.................................................................. 19-2
Ranking Functions............................................................................................................................ 19-5

REGR_SLOPE and REGR_INTERCEPT................................................................................ 19-33
REGR_R2.................................................................................................................................... 19-33
REGR_SXX, REGR_SYY, and REGR_SXY............................................................................. 19-33
Linear Regression Statistics Examples................................................................................... 19-33
Sample Linear Regression Calculation.................................................................................. 19-34
Inverse Percentile Functions......................................................................................................... 19-35
xiii
Normal Aggregate Syntax....................................................................................................... 19-35
Inverse Percentile Restrictions................................................................................................ 19-38
Hypothetical Rank and Distribution Functions....................................................................... 19-39
Hypothetical Rank and Distribution Syntax ........................................................................ 19-39
WIDTH_BUCKET Function ......................................................................................................... 19-40
WIDTH_BUCKET Syntax........................................................................................................ 19-41
User-Defined Aggregate Functions ............................................................................................. 19-43
CASE Expressions........................................................................................................................... 19-44
Creating Histograms with User-defined Buckets................................................................ 19-45
20 Advanced Analytic Services
OLAP................................................................................................................................................... 20-2
Benefits of OLAP and RDBMS Integration............................................................................. 20-2
Data Mining....................................................................................................................................... 20-4
Enabling Data Mining Applications ........................................................................................ 20-5
Predictions and Insights ............................................................................................................ 20-5
Mining Within the Database Architecture.............................................................................. 20-5
Java API........................................................................................................................................ 20-7
21 Using Parallel Execution
Introduction to Parallel Execution Tuning................................................................................... 21-2
When to Implement Parallel Execution................................................................................... 21-2
Operations That Can Be Parallelized....................................................................................... 21-3
The Parallel Execution Server Pool .......................................................................................... 21-3
How Parallel Execution Servers Communicate ..................................................................... 21-5

Monitoring Parallel Execution Performance with Dynamic Performance Views........... 21-71
Monitoring Session Statistics .................................................................................................. 21-74
Monitoring System Statistics................................................................................................... 21-76
Monitoring Operating System Statistics................................................................................ 21-77
Affinity and Parallel Operations.................................................................................................. 21-77
Affinity and Parallel Queries .................................................................................................. 21-78
Affinity and Parallel DML....................................................................................................... 21-78
Miscellaneous Parallel Execution Tuning Tips ......................................................................... 21-79
Formula for Memory, Users, and Parallel Execution Server Processes............................ 21-80
Setting Buffer Pool Size for Parallel Operations................................................................... 21-82
Balancing the Formula ............................................................................................................. 21-82
Parallel Execution Space Management Issues...................................................................... 21-83
Overriding the Default Degree of Parallelism...................................................................... 21-84
Rewriting SQL Statements....................................................................................................... 21-85
Creating and Populating Tables in Parallel .......................................................................... 21-86
Creating Temporary Tablespaces for Parallel Sort and Hash Join.................................... 21-87
xv
Executing Parallel SQL Statements........................................................................................ 21-88
Using EXPLAIN PLAN to Show Parallel Operations Plans .............................................. 21-89
Additional Considerations for Parallel DML ....................................................................... 21-89
Creating Indexes in Parallel .................................................................................................... 21-93
Parallel DML Tips..................................................................................................................... 21-94
Incremental Data Loading in Parallel.................................................................................... 21-97
Using Hints with Cost-Based Optimization ....................................................................... 21-100
22 Query Rewrite
Overview of Query Rewrite............................................................................................................ 22-2
Cost-Based Rewrite .................................................................................................................... 22-3
When Does Oracle Rewrite a Query? ...................................................................................... 22-4
Enabling Query Rewrite.................................................................................................................. 22-7
Initialization Parameters for Query Rewrite .......................................................................... 22-8

Send Us Your Comments
Oracle9
i
Data Warehousing Guide, Release 9.0.1
Part No. A90237-01
Oracle Corporation welcomes your comments and suggestions on the quality and usefulness of this
document. Your input is an important part of the information used for revision.

Did you find any errors?

Is the information clearly presented?

Do you need more information? If so, where?

Are the examples correct? Do you need more examples?

What features did you like most?
If you find any errors or have any other suggestions for improvement, please indicate the document
title and part number, and the chapter, section, and page number (if available). You can send com-
ments to us in the following ways:

Electronic mail: [email protected]

FAX: (650) 506-7227 Attn: Server Technologies Documentation Manager

Postal service:
Oracle Corporation
Server Technologies Documentation
500 Oracle Parkway, Mailstop 4op11
Redwood Shores, CA 94065

This document contains:
Chapter 1, Data Warehousing Concepts
This chapter contains an overview of data warehousing concepts.
Chapter 2, Logical Design in Data Warehouses
This chapter discusses the logical design of a data warehouse.
Chapter 3, Physical Design in Data Warehouses
This chapter discusses the physical design of a data warehouse.
Chapter 4, Hardware and I/O Considerations in Data Warehouses
This chapter describes some hardware and input-output issues.
Chapter 5, Parallelism and Partitioning in Data Warehouses
This chapter describes the basics of parallelism and partitioning in data
warehouses.
Chapter 6, Indexes
This chapter describes how to use indexes in data warehouses.
Chapter 7, Integrity Constraints
This chapter describes some issues involving constraints.
xxi
Chapter 8, Materialized Views
This chapter describes how to use materialized views in data warehouses.
Chapter 9, Dimensions
This chapter describes how to use dimensions in data warehouses.
Chapter 10, Overview of Extraction, Transformation, and Loading
This chapter is an overview of the ETL process.
Chapter 11, Extraction in Data Warehouses
This chapter describes extraction issues.
Chapter 12, Transportation in Data Warehouses
This chapter describes transporting data in data warehouses.
Chapter 13, Loading and Transformation
This chapter describes transforming data in data warehouses.
Chapter 14, Maintaining the Data Warehouse

Customers in Europe, the Middle East, and Africa (EMEA) can purchase
documentation from
http://www.oraclebookshop.com/
Other customers can contact their Oracle representative to purchase printed
documentation.
To download free release notes, installation documentation, white papers, or other
collateral, please visit the Oracle Technology Network (OTN). You must register
online before using OTN; registration is free and can be done at
http://technet.oracle.com/membership/index.htm
xxiii
If you already have a username and password for OTN, then you can go directly to
the documentation section of the OTN Web site at
http://technet.oracle.com/docs/index.htm
For additional information, see:

The Data Warehouse Toolkit by Ralph Kimball (John Wiley and Sons, 1996)

Building the Data Warehouse by William Inmon (John Wiley and Sons, 1996)
Conventions
This section describes the conventions used in the text and code examples of the
this documentation set. It describes:

Conventions in Text

Conventions in Code Examples
Conventions in Text
We use various conventions in text to help you more quickly identify special terms.
The following table describes those conventions and provides examples of their use.
Convention Meaning Example
Bold Bold typeface indicates terms that are

xxiv
Conventions in Code Examples
Code examples illustrate SQL, PL/SQL, SQL*Plus, or other command-line
statements. They are displayed in a monospace (fixed-width) font and separated
from normal text as shown in this example:
SELECT username FROM dba_users WHERE username = ’MIGRATE’;
The following table describes typographic conventions used in code examples and
provides examples of their use.
lowercase
monospace
(fixed-width
font)
Lowercase monospace typeface indicates
executables, filenames, directory names,
and sample user-supplied elements. Such
elements include computer and database
names, net service names, and connect
identifiers, as well as user-supplied
database objects and structures, column
names, packages and classes, usernames
and roles, program units, and parameter
values.
Note: Some programmatic elements use a
mixture of UPPERCASE and lowercase.
Enter these elements as shown.
Enter sqlplus to open SQL*Plus.
The password is specified in the orapwd file.
Back up the datafiles and control files in the
/disk1/oracle/dbs directory.
The department_id, department_name,

])
{ }
Braces enclose two or more items, one of
which is required. Do not enter the braces.
{ENABLE | DISABLE}
|
A vertical bar represents a choice of two
or more options within brackets or braces.
Enter one of the options. Do not enter the
vertical bar.
{ENABLE | DISABLE}
[COMPRESS | NOCOMPRESS]
Convention Meaning Example
xxv
... Horizontal ellipsis points indicate either:

That we have omitted parts of the
code that are not directly related to
the example

That you can repeat a portion of the
code
CREATE TABLE ... AS
subquery
;
SELECT
col1
,
col2
, ... ,

lowercase.
SELECT last_name, employee_id FROM
employees;
SELECT * FROM USER_TABLES;
DROP TABLE hr.employees;
lowercase Lowercase typeface indicates
programmatic elements that you supply.
For example, lowercase indicates names
of tables, columns, or files.
Note: Some programmatic elements use a
mixture of UPPERCASE and lowercase.
Enter these elements as shown.
SELECT last_name, employee_id FROM
employees;
sqlplus hr/hr
CREATE USER mjones IDENTIFIED BY ty3MU9;
Convention Meaning Example


Nhờ tải bản gốc

Tài liệu, ebook tham khảo khác

Music ♫

Copyright: Tài liệu đại học © DMCA.com Protection Status