Microsoft SQL Server 2000 Data Transformation Services- P13 - Pdf 76

F
IGURE
28.5
Column names versus column ordinal numbers in scripts in Transform Data tasks.
There are two problems with switching from column names to column ordinal numbers:
• The code is harder to read and write.
• The ordinal numbers do not consistently identify the columns. All the ordinal numbers of
the columns are changed whenever you view the Source Columns tab or the Destination
Columns tab of the Transformation Options dialog.
Listing 28.1 shows VBScript code for an ActiveX Script task that will dynamically modify all
the ActiveX Script transformations in all the Transform Data tasks in the package, replacing
the column names with the column ordinal numbers. This code is included on the CD in a
package stored in the ReplaceNamesWithOrdinals.dts file. This package also has a task that
switches all the ordinal numbers back to names, which can be run at the end of the package
execution.
L
ISTING
28.1
VBScript Code That Switches Column Names to Column Ordinal Numbers
for All Transform Data Tasks in a DTS Package
Option Explicit
Function Main()
DTS Packages and Steps
P
ART
V
576
34 0672320118 CH28 11/13/00 5:01 PM Page 576
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Dim pkg, tsk, cus, trn, col
Dim sScript, sFind, sReplace

28
28
H
IGH
-
P
ERFORMANCE
DTS P
ACKAGES
577
L
ISTING
28.1
Continued
34 0672320118 CH28 11/13/00 5:01 PM Page 577
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Fetch Buffer Size, Table Lock, and Insert Batch Size
When you use Fast Load, you can choose from several specific loading options. We found that
two of these options, Table Lock and Insert Batch Size, had a definite effect on performance.
We expect that some of the other Fast Load options also could have an effect on performance
in specific situations.
Table 28.6 and Figure 28.6 show the effect on performance of Table Lock, Insert Batch Size,
and Fetch Buffer Size, a setting that can be used whether or not Fast Load is selected. Our tests
were conducted without other users in the database. The default choice, with Table Lock off,
an Insert Batch Size of 0 (load all records in a single batch), and a Fetch Buffer Size of 1, is
shown first. We were unable to increase the Fetch Buffer Size beyond 5000.
T
ABLE
28.6
The Effect of Table Lock, Insert Batch Size, and Fetch Buffer Size in the

au_lname + ‘, ‘ + au_fname
end as au_fullname
from AuthorName
• Looking up an unknown value in another table. For our test, 17% of the values were
found in the lookup table. The lookup table contained a total of 23 records, so all of
them could be stored in the lookup’s cache at the same time:
select
case
when a.au_lname is null or a.au_lname = ‘’
or a.au_fname is null or a.au_fname = ‘’
then lkp.FullName
else
au_lname + ‘, ‘ + au_fname
end as au_fullname
from AuthorName a
inner join tblAuthorNameList lkp
on a.au_id = lkp.au_id
High-Performance DTS Packages
C
HAPTER
28
28
H
IGH
-
P
ERFORMANCE
DTS P
ACKAGES
579

source query. That’s especially true when a lookup is involved. In the last example in
this test, the source query delivered a performance improvement by almost a factor
of seven.
The more complex the data transformation, the more beneficial it is to put the logic
into the source query. Unfortunately, those more complex situations are where the
option of writing a transformation script is the most convenient.
N
OTE
34 0672320118 CH28 11/13/00 5:02 PM Page 580
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Moving Logic into a Custom Transformation
Chapter 32, “Creating a Custom Transformation with VC++,” shows how to create a custom
transformation that finds the average value for a set of integer source fields. Table 28.8 and
Figure 28.8 show the performance of this custom transformation compared with the perfor-
mance of a transformation script and a source query with the same logic. This test was run
with 10 fields being averaged together.
T
ABLE
28.8
Transformation Script Versus Custom Transformation Versus Source Query
Test Description Records Per Second % Diff from Script
Transformation Script 3333 0%
Custom Transformation 15150 354% faster
Source Query 15625 369% faster
High-Performance DTS Packages
C
HAPTER
28
28
H

. When any of the other queries should be
executed, return a value of
DTSTransformStat_SkipInsert
and include code that exe-
cutes the appropriate data modification lookup.
DTS Packages and Steps
P
ART
V
582
One additional advantage of using the data modification lookups is that you’re not
limited to four possible queries, as with the Data Driven Query task. Of course, you
can also use the data modification lookups in the Data Driven Query task to give
yourself the extra possibilities.
N
OTE
Our testing indicates that a Data Driven Query task update query is approximately 30% faster
than an update performed by a lookup in a Transform Data task.
If you have a data transformation that performs an insert 10% of the time and an update 90%
of the time, the data transformation should be faster with the Data Driven Query task than with
the Transform Data task using Fast Load. The performance advantage of the update queries in
the Data Driven Query task is greater than the performance disadvantage for the insert queries.
If you have a data transformation that performs an insert 50% of the time and update 50% of
the time, the data transformation should be faster in a Transform Data task using Fast Load.
The performance advantage of the Fast Load inserts in the Transform Data task should far out-
weigh the performance disadvantage on the updates.
34 0672320118 CH28 11/13/00 5:02 PM Page 582
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Choosing a Scripting Language
You can gain a performance improvement by using a different language in your transformation

object’s
ExecuteInMainThread
property to
FALSE
for all the steps. If
two steps are both set to execute on the main thread, they can’t be executed in parallel.
•Increasing the
Package
object’s
MaxConcurrentSteps
property. By default, this property
is set to 4. This is too low in situations where you have many processors available.
There are some factors that limit the use of these strategies:
• The logic of your transformation might require that some tasks be completed before oth-
ers are started. If so, you can force serial execution with precedence constraints.
• If you are using transactions, you have to prevent access to a database from two different
connections at the same time or the package will fail. You can avoid this problem by set-
ting the precedence constraints so that the tasks execute serially or by only having a sin-
gle connection to each database. Either way, you lose the performance benefit of parallel
processing.
• There are some tasks that must be executed on the main thread or they will generate
errors. This is true for any custom task that is not free-threaded (including all custom
tasks built with Visual Basic), tasks that modify properties in custom tasks that are not
free-threaded, and any task with a script that calls a COM object written in Visual Basic.
34 0672320118 CH28 11/13/00 5:02 PM Page 583
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Effect of Indexing on Performance
The indexes used on tables involved with data transformations can have a very significant
impact on the performance of those transformations:
•Proper indexing on source tables can improve the speed of the transformation tasks and

N
OTE
Of course, you may need to leave indexes in place because of other users who need
to access the data.
N
OTE
34 0672320118 CH28 11/13/00 5:02 PM Page 584
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
Considering Tools Other Than DTS Because of
Performance
Sometimes you may want to consider using non-DTS tools to achieve better performance.
Using bcp for Exporting from SQL Server to Text Files
It is faster to use bcp to bulk copy data from SQL Server to a text file than to use the
Transform Data task. The high-performance Bulk Insert task cannot be used for moving data
from SQL Server to a text file.
Using Replication
If you want to keep two databases synchronized with each other, you should use replication
instead of DTS. Replication is often easier to set up, it often has better performance, and you
have more synchronization options. DTS is needed when you’re changing (transforming) data.
If you’re just copying data between two databases on a periodic basis, consider replication.
Conclusion
DTS gives you many tools for creating high-performance transformations, but there’s still a lot
of work to do if you want to achieve the highest possible performance.
High-Performance DTS Packages
C
HAPTER
28
28
H
IGH

• Accessing Lineage Information
Programmatically 605
35 0672320118 CH29 11/13/00 4:58 PM Page 587
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
DTS Packages and Steps
P
ART
V
588
Meta Data Services provides a repository for storing information. You can use it in SQL Server
2000 to store information about database structures and data transformations, and you can use
it with Visual Studio to share information about programming objects.
This chapter provides a brief introduction to Meta Data Services and examines how it is used
with DTS to store column-level and row-level data lineage. Chapter 23, “The DTS Package
and Its Properties,” has information about saving and retrieving DTS packages in Meta Data
Services. That chapter also discusses the use of the
PackageRepository
object to retrieve gen-
eral information about the packages stored in Meta Data Services.
Microsoft SQL Server 2000 Meta Data Services was called the Microsoft Repository in
SQL Server 7.0. There have been many enhancements in the newer version, especially
with the new Meta Data Browser and the ability to export meta data to XML.
You have two primary tools for viewing the information in Meta Data Services. You
have the new Meta Data Browser. You also have the DTS Browser, which provides the
same capabilities as the SQL Server 7.0 Repository Browser.
The term “repository” is still used to describe the database that physically stores the
meta data (which was called “metadata” in SQL Server 7.0). SQL Server Books Online
has made these changes throughout their text. You can find many instances where
the older terminology is still being used in the SQL Server 2000 user interface and the
DTS object model.

• Data types, field lengths, and field nullability.
• Default values.
•Indexes, primary keys, and foreign keys.
The process of creating and running data transformations also generates technical meta data:
• Map source columns to destination columns.
•Cross-reference between database structures and their use in DTS packages.
• ActiveX Scripts used to cleanse the data.
•History of the creation and versioning of DTS packages.
•History of the execution of DTS packages.
• Data lineage of individual records.
You need this kind of information to ensure that your data is being transformed accurately.
Meta Data Services gives you a place where you can store and retrieve business and technical
meta data, including the full information about the transformations that have been used with
the data.
Integrating DTS with Meta Data Services
C
HAPTER
29
29
I
NTEGRATING
DTS
WITH
M
ETA
D
ATA
S
ERVICES
589

it would require a good deal of specialized programming.
You should save packages to Meta Data Services because you want to handle the
meta data in an organized and consistent way.
N
OTE
The DTS Browser
The SQL Server 2000 Enterprise Manager provides two ways to view the meta data of data-
bases and DTS packages that have been stored in Meta Data Services:
35 0672320118 CH29 11/13/00 4:58 PM Page 590
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
• The DTS Browser is the successor to the Repository Browser in SQL Server 7.0. You
can access it by highlighting the Meta Data node under the Data Transformation Services
node for a particular SQL Server in the Enterprise Manager.
• The Meta Data Browser is a new tool in SQL Server 2000. You can find it in the
Enterprise Manager in the Meta Data Services node for a particular SQL Server.
You can use the DTS Browser to update some of the business meta data contained in Meta
Data Services, but none of the technical meta data.
The DTS Browser has three tabs, each of which has a separate tool:
•Browse—View database meta data.
•Lineage—Use the value of a lineage variable to look up package, version, and execution
information.
•Package—View package meta data.
Integrating DTS with Meta Data Services
C
HAPTER
29
29
I
NTEGRATING
DTS

29.1
You can view both business and technical meta data on the Browse tab of the DTS Browser.
DTS Packages and Steps
P
ART
V
592
F
IGURE
29.2
You can scan a database into the repository if you have an OLE DB provider for that particular data source.
The Browse tab provides a view of the following technical meta data:
• Name and version of the database system containing the database.
• Names of fields, tables, and databases.
35 0672320118 CH29 11/13/00 4:58 PM Page 592
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
• Data types.
•Field lengths, precision, and scale.
•Field nullability.
• Object owners.
• Column-level lineage.
When you click on the table or field hyperlink in the browser, a separate page opens that dis-
plays business meta data, as shown in Figure 29.3. The browser provides read/write access to
the following business meta data:
• Descriptions of fields, tables, and databases
• Comments about fields, tables, and databases
Integrating DTS with Meta Data Services
C
HAPTER
29

ART
V
594
F
IGURE
29.5
This is the information you receive when you look up the lineage.
35 0672320118 CH29 11/13/00 4:58 PM Page 594
Please purchase PDF Split-Merge on www.verypdf.com to remove this watermark.
The Package Tab
Information about DTS packages, package versions, and package executions is available in a
hierarchical format on the Package tab. (See Figure 29.6.) For each package, version, and exe-
cution, you have the option of viewing the package design. When you pick that option, the
DTS Package Designer opens with the selected version of the package displayed.
Integrating DTS with Meta Data Services
C
HAPTER
29
29
I
NTEGRATING
DTS
WITH
M
ETA
D
ATA
S
ERVICES
595


Nhờ tải bản gốc

Tài liệu, ebook tham khảo khác

Music ♫

Copyright: Tài liệu đại học © DMCA.com Protection Status