In Chapter 9, “Building a Package Using Visual Basic,” you saw that in some situations the DTS Designer might not be the best tool to implement your solution. You then created a VB program that used the DTS object model to transform and load data into a SQL database. In this chapter, you will build on this knowledge and investigate some best practices. You will learn about the following:[1]
Executing DTS packages from Visual Basic
Making best-practice modifications to DTS VB .bas files
Loading and saving packages
Event Handling
Error Handling
Armed with this knowledge, we will investigate DTS solutions to various real-world problems. Although DTS provides tasks to deal with most common data transformation problems, in some situations you will want to complement DTS with a programming language such as Visual Basic. Usually, these situations are not due to technical limitations of DTS, but due to business or user requirements.
You can use Visual Basic to do the following:
Create a simplified DTS package scheduler for users to manage when and how often packages are being run.
Process the data before doing DTS work, such as converting the data to a .csv file.
On a predetermined schedule, download files from the Web; log successes and failures; and notify the administrator of failures, completions, and their associated processing time.
Perform file-management tasks prior to DTS importing the data. Such tasks can include copying; deleting; renaming; comparing two files; and checking file properties, such as date and size. Then you can have DTS process only those files that are valid.
Create a custom FTP task that extracts data from a remote system and then kick off DTS to populate a data warehouse.
In this chapter, we will build Visual Basic/DTS packages that will:
Process all files in a directory
Process spreadsheets in an Excel workbook
Process XML input
Process Outlook messages
Although all examples were developed using SQL 2000, the basic principles will work with little or no modifications against a SQL 7.0 Server.
To execute your DTS package in Visual Basic, perform the following steps:
Create a new standard .exe project.
Depending on your requirements, add references to the following:
Microsoft DTSDataPump Scripting Object Library
Microsoft DTSPackage Object Library
Microsoft DTS Custom Tasks Object Library
Dimension a package object.
Load the package.
Execute the package.
Uninitialize the package.
VB is not limited to defining and running packages. It is also possible to load and execute a previously defined package.
When you finish with the Package
or Package2
object, you should release references to all DTS objects except the Package
and call the UnInitialize
method. When you release package references, include object variables that you declared WithEvents
to handle package events.
DTS client applications can respond to the events raised by DTS. DTS raises events to communicate the status of work being done, thereby allowing you to control the execution of the package, task, or step. The execution status can show successful completion or an error; both are a type of execution status. Chapters 4 and 5 provide the full list of execution-status values for a task C. Chapter 6, “DTS Workflows,” provides the full list of execution-status values for packages.
By default, when you run a project, you get no indication of completion or errors. Adding completion notification and error handling is recommended.
In most cases it is easier to use VB to manipulate an existing DTS package. When you want to script and edit a package, consider modifying the VB code to create and return a Package
object as follows:
Name the file PackageName.bas.
Change Sub Main to Function Package.
Do not execute the Uninitialize method.
Return the package created.
This process will allow you to manipulate the existing and scripted packages separately from the rest of your code.
The changes can be seen in Listing 10.1.
Example 10.1. Recommended Changes to the Package .bas File
Option Explicit Public goPackageOld As New DTS.Package Public goPackage As DTS.Package2 Public Function Package() As DTS.Package2 '∼∼Private Sub Main() Set goPackage = goPackageOld ... ' Save or execute package 'goPackage.SaveToSQLServer "(local)", "sa", "" '∼∼goPackage.Execute '∼∼goPackage.Uninitialize Set Package = goPackage Set goPackage = Nothing Set goPackageOld = Nothing End Function
To include PackageName.bas in your project, call the package as shown in Listing 10.2.
VB is not limited to defining and running packages. Both the Package
and Package2
objects provide methods to load, save, or execute a previously defined package.
To load a package, create a Package
or Package2
object and then invoke one of the following object methods:
LoadFromRepository
LoadFromSQLServer
LoadFromStorageFile
To save a package, create a Package
or Package2
object and then invoke one of the following object methods:
SaveToSQLServer
SaveToSQLServerAs
SaveToRepository
SaveToRepositoryAs
SaveToStorageFile
SaveToStorageFileAs
If the package is run before being saved, you must call the UnInitialize
method first.
To save a VB .bas file scripted by DTS Designer or ScriptPkg to a SQL Server package, follow these steps:
At the end of Sub Main
, uncomment this line:
'objPackage.SaveToSQLServer ...
Comment out the following line:
objPackage.Execute
Run the project.
The Package
and Package2
objects provide the LoadFromRepository
method to load a DTS package from a Microsoft Meta Data Services Repository and the SaveToRepository
and SaveToRepositoryAs
methods to save a DTS package to a Microsoft Meta Data Services Repository. Listing 10.3 shows the method syntax.
Example 10.3. Package Load and Save Repository Methods
Package.LoadFromRepository _ RepositoryServerName, RepositoryDatabaseName, _ RepositoryUserNaem, RepositgoryUserPassword, PackageGUID, [VersionGUID], [PackageName], _ [Flags], [pVarPersistStgOfHost] Package.SaveToRepository _ RepositoryServerName, RepositoryDatabaseName, _ RepositoryUserName, RepositoryUserPassword, _ [Flags], [CategoryID], [pVarPersistStgOfHost] Package.SaveToRepositoryAS _ NewName, _ RepositoryServerName, RepositoryDatabaseName, _ RepositoryUserName, RepositoryUserPassword, _ [Flags], [CategoryID], [pVarPersistStgOfHost]
Table 10.1 shows parameter use.
Table 10.1. Repository Load and Save Parameters
Parameter |
Description |
---|---|
|
Meta Data Services server name. If empty or |
|
Meta Data Services database name or data source name (DSN). |
|
Meta Data Services username. |
|
Meta Data Services user password. |
|
Package identifier GUID string. |
|
Version identifier GUID string. |
|
Name of package to be loaded. |
|
Value from the
|
|
Screen layout information associated with a package (used only by DTS Designer) |
Listing 10.4 shows how to load and save a package to Microsoft Data Services Repository.
Example 10.4. VB Code to Load and Save a Package to a Repository
Private moPackage As DTS.Package2 Set moPackage = New DTS.Package2 'Load Repository Package moPackage.LoadFromRepository _ "Brians_LPTS2K", "", "", "", _ "", "", "LoadMftCustomersToTempDB", _ DTSReposFlag_UseTrustedConnection 'Save Repository Package moPackage.SaveToRepositoryAs _ "NewPackage", _ "Brians_LPTS2K", "", "", "", _
The Package
and Package2
objects provide the LoadFromSQLServer
method to load a DTS package from SQL Server local packages and the SaveToSQLServer
and SaveToSQLServerAs
methods to save a DTS package to SQL Server local packages. Listing 10.5 shows the method syntax.
Example 10.5. Package Load and Save to SQL Server Methods
Package.LoadFromSqlServer _ ServerName, [ServerUserName], [ServerPassword], [Flags], _ [PackageOwnerPassword], [PackageOperatorPassword], _ [PackageCategoryID], _ [pVarPersistStgOfHost], _ [bReusePasswords] Package.SaveToSqlServer _ ServerName, [ServerUserName], [ServerPassword], [Flags], _ [PackageOwnerPassword], [PackageOperatorPassword], _ [PackageCategoryID], _ [pVarPersistStgOfHost], _ [bReusePasswords] Package.SaveToSqlServer _ NewName, _ ServerName, [ServerUserName], [ServerPassword], [Flags], _ [PackageOwnerPassword], [PackageOperatorPassword], _ [PackageCategoryID], _ [pVarPersistStgOfHost], _ [bReusePasswords]
Table 10.2 shows parameter use.
Table 10.2. SQL Server Load and Save Parameters
Parameter |
Description |
---|---|
|
Server name |
|
Server username |
|
Server user password |
|
Value from the
|
|
Package password if the package is encrypted |
|
Package identifier GUID string |
|
Version identifier GUID string |
|
Package name |
|
Screen layout information associated with a package (used only by Package Designer) |
The code in Listing 10.6 shows you how to load and save a package to a SQL server.
Example 10.6. VB Code to Load and Save a Package to SQL Server
Private moPackage As DTS.Package2 Set moPackage = New DTS.Package 'Load SQL Package moPackage.LoadFromSQLServer _ "Brians_LPTS2K", _ "", "", DTSSQLStgFlag_UseTrustedConnection, _ "", "", "", "LoadMftCustomersToTempDB" 'Save SQL Package moPackage.SaveToSQLServerAs _ "NewName", _ "Brians_LPTS2K", _ "", "", DTSSQLStgFlag_UseTrustedConnection, _ "", "", ""
The Package
and Package2
objects provide the LoadFromStorageFile
method to load a DTS package from a .dts file and the SaveToStorageFile
and SaveToStorageFileAs
methods to save a .dts file. Listing 10.7 shows method syntax.
Example 10.7. Package Load and Save to Storage File Methods
Package.LoadFomStorageFile _ UNCFile, _ Password, [PackageGUID], [VersionGUID], [Name], _ [pVarPersistStgOfHost] Package.SaveToStorageFile _ [UNCFile], [OwnerPassword], [OperatorPassword], _ [pVarPersistStgOfHost], _ [bReusePasswords] Package.SaveToStorageFileAs _ NewName, _ [UNCFile], [OwnerPassword], [OperatorPassword], _ [pVarPersistStgOfHost], _ [bReusePasswords]
Table 10.3 shows parameter use.
Table 10.3. Storage File Load and Save Parameters
Parameter |
Description |
---|---|
|
File specification of DTS package storage file |
|
Package password if the package is encrypted |
|
Package identifier GUID string |
|
Version identifier GUID string |
|
Package name |
|
Screen layout information associated with a package (used only by Package Designer) |
The code in Listing 10.8 shows loading and saving a package to a DTS storage file.
The DTS package can raise various status and control events during package execution. If you want to receive these events, you must declare a package WithEvents
and provide event handlers in either a form or class module.
A Package2
object may not be declared WithEvents
. If you require Package2
functionality, it is a simple matter to create a Package
object and assign it to a Package2
object. Use the Package2
object to access Package2
functionality while the Package
event handlers respond to events.
After you declare a package WithEvents
, you must provide event handlers for all the package events. If you fail to do so, you will receive an access violation error when an unhandled event is raised. If a particular event is not of interest, provide a minimal handler consisting of the public Sub
and EndSub
statements.
DTS supports multiple threads that allow concurrent processing of package steps. Visual Basic, however, does not support multiple threads. To use event handlers, it is important that you set the ExecuteInMainThread
property to True
for each step object in the package.
Table 10.4 describes DTS package events and the information returned when specified conditions occur.
Table 10.4. DTS Package Events
Event |
Condition |
Information Returned |
---|---|---|
|
A step has started. |
Step name |
|
A step has completed. |
Step name |
|
This event occurs periodically during step execution. |
Step name, progress count (which is |
|
This event gives the application a chance to cancel a step. |
Step name |
|
A DTS error occurred during package execution. |
Step name, error code and description, help file and context, and interface ID |
The flow of DTS package events is as follows:
OnStart
once at the beginning of each step.
OnQueryCancel
if it is safe to terminate the step.
During step execution:
OnProgress
may occur one or more times to provide progress information.
OnQueryCancel
may occur at times when it is safe to terminate the step.
OnFinish
once at the end of each step.
OnQueryCancel
if it is safe to terminate the step.
OnError
may occur at any time.
Listing 10.9 shows how to set up package execution to respond to events. Code for the actual event handlers is included with the event descriptions.
Example 10.9. Code to Have a VB Component Respond to Package Events
Private moPackageOld as DTS.Package Private moPackage As DTS.Package2 Public moPackageEventHandler WithEvents as DTSPackage Private Sub form_Load() ' Establish Package and Event Handler Set moPackageOld = New DTS.Package Set moPackage = moPackageOld Set moPackageEventHandler = moPackageOld ' All Steps/Tasks MUST run on VB Main Thread Dim oStep As DTS.Step For Each oStep In moPackage.Steps oStep.ExecuteInMainThread = True Next ' Execute moPackage.Execute 'Cleanup moPackage.UnInitialize Exit Sub
The package raises an OnStart
event at the beginning of each DTS task and step. To handle the event, you must provide a handler of the form shown in Listing 10.10.
Example 10.10. Using the DTS Package OnStart Event
Private Sub moPackageEventHandler _OnStart(ByVal EventSource As String) MsgBox EventSource + " - " + "OnStart" End Sub
In this event and all package events, moPackageEventHandler
is the name of the pack-age declared WithEvents
. For all package events, tasks defined by DTS return the step name for EventSource
. Custom tasks may return whatever they want.
The package raises an OnFinish
event at the end of each DTS task and step.To handle the event, you must provide a handler of the form shown in Listing 10.11.
The package raises an OnProgress
event from time to time to provide task-progress information.
The ProgressRowCount
property of the DataDriveQuery Task or a DataPump Task can be used to configure the number of rows processed between events.
To handle the event, you must provide a handler of the form shown in Listing 10.12.
Example 10.12. Code to Respond to the DTS Package OnProgress Event
Private Sub moPackageEventHandler_OnProgress( _ ByVal EventSource As String, _ ByVal ProgressDescription As String, _ ByVal PercentComplete As Long, _ ByVal ProgressCountLow As Long, _ ByVal ProgressCountHigh As Long) MessageBox ("" & _ "ProgressDescription: " & ProgressDescription & vbCrLf & _ "PercentComplete: " & PercentComplete & vbCrLf & _ "ProgressCountLow: " & ProgressCountLow & vbCrLf & _ "ProgressCountHigh: " & ProgressCountHigh, EventSource ) End Sub
Table 10.5 defines OnProgress
event parameters.
Table 10.5. OnProgress Event Parameters
Parameter |
Description |
---|---|
|
Source of event being executed. Tasks defined by DTS return the step name for |
|
Description of task progress. |
|
Percentage of task completed. If the percentage completed cannot be reported, 0 is returned. |
|
Low 32 bits of units (such as rows) completed. |
|
High 32 bits of units (such as rows) completed. |
The package raises an OnQueryCancel
event only when it is safe to stop execution of the task. Use this event to terminate package execution gracefully.
To handle the event, you must provide a handler of the form shown in Listing 10.13.
Example 10.13. Code to Interact with the DTS Package OnQueryCancel Event
Private Sub moPackageEventHandler_OnQueryCancel( _ ByVal EventSource As String, _ pbCancel As Boolean) if MsgBox ("Continue to Execute",vbOKCancel) = OK then pbCancel = False Else pbCancel = True End If End Sub
If pbCancel
is set to True
by the event handler, DTS stops task execution and fails with an error. If a task completes quickly, the event may not occur.
The package raises an OnError
event to indicate and provide information about an error. The event also allows the event handler to cancel task execution. A step does not terminate after an error unless the MaximumErrorCount
is exceeded or the event handler explicitly sets pbCancel
to True
.
To handle the event, you must provide a handler of the form shown in Listing 10.14.
Example 10.14. Code to Respond to the DTS Package OnError Event
Private Sub moPackageEventHandler_OnError( _ ByVal EventSource As String, _ ByVal ErrorCode As Long, ByVal Source As String, _ ByVal Description As String, _ ByVal HelpFile As String, _ ByVal HelpContext As Long, _ ByVal IDofInterfaceWithError As String, _ pbCancel As Boolean) If MsgBox ( _ "ErrorCode: " & ErrorCode & vbCrLf & _ "Source: " & Source & vbCrLf & _ "Description: " & Description & vbCrLf & _ "HelpFile: " & HelpFile & vbCrLf & _ "IDofIFWErr: " & IdofInterfaceWithError, _ EventSource, _ VbOkCancel) <> vbOK then pbCancel = False Else pbCancel = True End If End Sub
Table 10.6 defines OnError
event parameters.
Table 10.6. OnError Event Parameters
Parameter |
Description |
---|---|
|
Source of event being executed. Tasks defined by DTS return the step name for |
|
Error code of the failure. |
|
Source of error message (an OLE DB provider description, for example). |
|
Description of the error. |
|
Help-file name. |
|
Help context ID. |
|
ID of the user interface returning the error, a globally unique identifier (GUID). |
|
If |
At the highest level, a DTS object property set, get, or method execution succeeds or fails. Errors that occur while you are creating DTS objects and setting their properties are to be dealt with using standard VB error handling.
The package FailOnError
property specifies whether package execution halts on any step error. A step FailPackageOnError
property determines whether package execution halts when an error occurs in that step. Unless one of these properties is set to True
, errors that occur during Package
or Package2
Execute are processed by DTS and are not propagated back to the caller.
When FailOnError
or FailPackageOnError
is True
, the description of the returned error often tells you only that the package failed because a (named) step failed. To determine why a step failed, you query each step’s GetExecutionErrorInfo
for more information. To determine which step(s) raised errors, the ExecutionStatus property of the step should be DTSStepExecStat_Completed
, and the ExecutionResult
property should have the value DTSStepExecResult_Failure
.
If FailOnError
is True
, an error may have occurred in only one step. Otherwise, depending on package workflow, multiple steps may have failed.
The error handler needs to iterate through all the objects in the Steps collection to find all errors.
The code in Listing 10.15 shows a typical package error handler.
Example 10.15. Code to Manage DTS Package Errors
Private Sub RunDTSPackage( ) Dim xoPackage As New DTS.Package On Error GoTo PackageError ... xoPackage.FailOnError = True xoPackage.Execute Unload me Exit Sub PackageError: MsgBox( "Package Error: " & Err.Number _ & Err.Description & vbCrLf _ & PackageErrors(moPackage), _ vbExclamation, App.EXEName Unload Me Exit Sub End Sub Private Function PackageErrors(ByVal aoPackage As DTS.Package2) As String Dim xoStep As DTS.Step Dim xsMessage As String Dim xlErrNum As Long Dim xsDescription As String Dim xsSource As String If xoPackage Is Not Nothing Then 'Find Steps that Failed 'Steps that did not execute are ignored For Each xoStep In aoPackage.Steps If oStep.ExecutionStatus = DTSStepExecStat_Completed _ And oStep.ExecutionResult = DTSStepExecResult_Failure Then 'Populate Error information oStep.GetExecutionErrorInfo xlErrNum, _ xsSource, xsDescription 'Build Error Message xsMessage = xsMessage & vbCrLf _ & "Step " & xoStep.Name & " failed,, error: " __ & xlErrNum & vbCrLf _ & xsDescription End If Next PackageErrors = sMessage End If End Function
Now that you have a better understanding of the issues and techniques for accessing the DTS object model in Visual Basic, let’s take a look at some real problems.
There are many situations when the data you require is located in many files or directories. DTS has no trouble loading data from a single file or even from all the files in a directory at the time the package is created. It is more complicated to set up a package to scan a directory and process all files.
Sparks Corporation is attempting to get a better handle on its shipping and receiving. To meet this objective, they purchased handheld scanners for every shipping clerk.
Each package is scanned as it is shipped or received. The scanners, which normally are disconnected from the network, are set in a docking station daily to recharge their batteries and download data to a network drive, C:DTS2000DataScanner.
[2]
Each scanner stores data in a separate file, using the name Scanner_SCANNERID_YYYYMMDD_HHMM.csv. SCANNERID is the scanner’s identifier. YYYYMMMDD_HHMM is the year, month, day, hour, and minute of the last entry in the file. Table 10.7 defines the file format.
Table 10.7. Scanner File Format
Field |
Description |
---|---|
|
Scanner ID |
|
Scanned date and time |
|
Scanner independent sequence number, used to ensure that no data is missing |
|
The bar code scanned |
|
‘S’ for shipments,‘R’ for items received |
You have been asked to create a program to load the data into SQL Server.
Build a package that can load a single file. Then use a VB program to scan the directory and call the package once for each file.
First, you create a package that can load a single scanner .csv file. Then use the DTS Import Wizard to create a package as follows: [3]
Data Source: text file C:DTS2000DataScannerScanner_hp1234z_20010209_1050.csv
.
File Format: Delimited, Skip=0. First row has column names; accept the default settings for all other fields.
Column Delimiter: comma.
Destination: your SQL Server tempdb.
Select Source Tables: Change [tempdb].[dbo].[Scanner_....]
to [tempdb].[dbo].[ScannerData].
Run:Yes.
Save: To SQL Server package LoadScannerToTempDb
.
Before running the package again, you need to modify the LoadScannerToTempDb Task so that it does not attempt to create the table if it exists. Follow these steps:
Open the package, using DTS Designer.
Select the Create Table [tempdb].[dbo].[ScannerData]
, and insert the following SQL statement before Create Table :
if not exists (select * from tempdb..sysobjects where id = object_id('[tempdb].[dbo].[ScannerData]'))
Save the package to C:DTS2000WorkSpaceLoadScannerToTempDb.dts.
Now all that remains is to create the VB program to process many files. Follow these steps:
Open VB, and begin a standard .exe.
Reference the DTS object model.
Set up the program to handle DTS package events and errors as described in “Handling DTS Package Events in Visual Basic” and “Handling DTS Package Errors in Visual Basic” earlier in this chapter.
Include code to find the scanner data files, load the package, and execute once for each file (see Listing 10.16).
Example 10.16. VB Program to Scan a Directory and Call a DTS Package Once for Each File
Private moPackage As DTS.Package2 Private moPackageOld As DTS.Package Private WithEvents moPackageEventHandler As DTS.Package Private Sub Form_Load() Dim xsDir As String 'Directory to Processes Dim xoFiles As New Collection 'List of Files to Process Dim xsFile As Variant 'File Name Dim xiFile As Integer 'File Handle Dim xsBuffer As String 'File I/O buffer Dim xsTry As String 'Tell ErrorHandler whatz up On Error GoTo PackageError 'Get Directory If UBound(Split(Command())) > 1 Then xsDir = Split(Command())(1) Else xsDir = mksScannerDataDir End If xsDir = xsDir & IIf(Right(xsDir, 1) <> "", "", "") 'Get File List xsFile = Dir(xsDir & mksScannerFiles, vbNormal) While xsFile <> "" xoFiles.Add xsFile xsFile = Dir() Wend 'Load DTS Package Set moPackageOld = New Package Set moPackage = moPackageOld Set moPackageOld = Nothing moPackage.LoadFromStorageFile mksPackageName, "" 'Process Files For Each xsFile In xoFiles 'Verify File Format ' Could not be done in file load loop ' ANY file I//O between dir calls ' may cause it to lose it's place xiFile = FreeFile() Open xsDir & xsFile For Input As xiFile Line Input #xiFile, xsBuffer Close xiFile If UCase(Trim(xsBuffer)) = mksScannerData Then 'Establish an Event Handler Set moPackageEventHandler = moPackage 'Set DataSource moPackage.Connections("Connection 1").DataSource = _ xsDir & xsFile 'All Steps/Tasks MUST run on VB Main Thread Dim oStep As DTS.Step For Each oStep In moPackage.Steps oStep.ExecuteInMainThread = True Next 'Let VB see Errors moPackage.FailOnError = True 'Execute moPackage.Execute txtDTSEvents = txtDTSEvents & vbCrLf & vbCrLf Else txtDTSEvents = txtDTSEvents & vbCrLf & _ " Error: Not a Scanner Data File" & vbCrLf & vbCrLf End If Next 'Cleanup moPackage.UnInitialize Set moPackage = Nothing Exit Sub PackageError: ... End Sub
As you can see, the most complicated part of the job is looking for the files and verifying the format. With that done, it is a simple matter to set the package connection information and execute the package. See C:DTS2000VBPackages ProcessManyFiles ProcessManyFiles.vbp for the complete code.
All that remains is to insert or update records from tempdb
into the MFT database.
DTS provides an interface for many OLE DB-compliant data sources. Excel is no exception. If you were to point the DTS Wizard at an Excel workbook, it would build a Create Table and Import Task for each spreadsheet.
This would be fine if each spreadsheet contained different data, but the spreadsheets may have similar data and are separated only to make editing and viewing more manageable. In this case, you could use a postload process to combine the data into a single table. However, people like to play with workbooks, so there is a good chance that one spreadsheet will have a slightly different format, making it difficult to postprocess.
Sparks Corporation does work in the United States, Mexico, and Canada. It is important that they have up-to-date accurate currency information. The accounting department has been maintaining a currency workbook. Each currency is maintained in its own spreadsheet. Each spreadsheet contains date, currency, and currency exchange information (see C:DTS2000DataCurrency.xls).
You have been asked to develop a program to load the data into the MFT database once a month.
Build a package that can load a single spreadsheet. Then build a VB program to load each spreadsheet.
First, create a package that can load a single Excel spreadsheet. Go to C: DTS2000PackagesSqlServer2000LoadCurrencyToTempDb.dts. Then use the DTS Import Wizard to create a package as follows:
Data Source: Excel 97-2000 (see C:DTS2000DataCurency.xls)
Destination: Your SQL Server tempdb
Select Source Tables: Select the USD table and set the destination to [tempdb].[dbo].[Currency]
Run:Yes
Save: To SQL Server package LoadCurrencyToTempDb
Before running the package again, you need to modify the LoadScannerToTempDb Task so that it does not attempt to create the table if it exists. Follow these steps:
Open the package, using DTS Designer.
Select the create table [tempdb].[dbo].[Currency]
, and insert the following at the top of the SQL statement:
if not exists (select * from tempdb..sysobjects where id = object_id('[tempdb].[dbo].[Currency]'))
Save the package to C:DTS2000WorkSpaceLoadCurrencyToTempDb.dts.
All that remains is to create the VB program to process many files. Follow these steps:
Open VB, and begin a standard .exe.
Reference the DTS object model.
Set up Form1
with the txtDtsEvent
, Event
, and Error process as shown in
Listing 10.16.
Include code to find the scanner data files, load the package, and execute once for each file (see Listing 10.17).
Example 10.17. VB Program to Find Scanner Data Files in an Excel Spreadsheet, Load a DTS Package, and Execute it Many Times by Changing the Connection
Private moPackage As DTS.Package2 Private moPackageOld As DTS.Package Private WithEvents moPackageEventHandler As DTS.Package Const mksPackageName = _ "C:DTS2000PackagesSQLServer2000LoadCurrencyToTempDb.dts" Const mksCurrencyFile = "C:DTS2000DataCurrency.xls" Option Explicit Private Sub Form_Load() On Error GoTo PackageError 'Define Excel Objects Dim xoXL As Excel.Application Dim xoWB As Excel.Workbook Dim xoSheet As Excel.Worksheet Dim xoRange As Excel.Range Dim xoSheets As New Collection Dim xsSheetName As Variant Dim xsSql As String ' Start Excel,, Get Application Object and Open Workbook Set xoXL = CreateObject("Excel.Application") xoXL.Visible = False ' Load DTS Package Set moPackageOld = New Package Set moPackage = moPackageOld Set moPackageOld = Nothing moPackage.LoadFromStorageFile mksPackageName, "" ' Validate SpreadSheets For Each xoSheet In xoWB.Sheets If UCase(xoSheet.Cells(1, 1)) = "DATE" __ And UCase(xoSheet.Cells(1, 2)) = "CURRENCY" __ And UCase(xoSheet.Cells(1, 3)) = "CONVERSIONFACTOR" __ Then xoSheets.Add xoSheet.Name Else Msgbox ("Validate: " & xoSheet.Name & _ "Not a Currency Spreadsheet" End If Next ' Release Excel Object References prior to Executing Package Set xoRange = Nothing Set xoSheet = Nothing xoWB.Close Set xoWB = Nothing xoXL.Visible = False xoXL.UserControl = False Set xoXL = Nothing ' Process SpreadSheets For Each xsSheetName In xoSheets 'Establish an Event Handler Set moPackageEventHandler = moPackage 'Set DataSource ' Only by Scripting the Package to VB and looking at the code ' will you have a hope of knowing where to set this ' Beware the Single Quote is a "`" not a "'" moPackage.Connections("Connection 1").DataSource = _ mksCurrencyFile If xsSql = "" Then xsSql = moPackage.Tasks( _ "Copy Data from USD$ to [tempdb].[dbo].[Currency] Task"). _ Properties( "SourceSQLStatement").Value xsSql = Left(xsSql, InStr(xsSql, "`USD$`") - 1) End If moPackage.Tasks( _ "Copy Data from USD$ to [tempdb].[dbo].[Currency] Task"). _ Properties("SourceSQLStatement").Value = _ xsSql & "`" & xsSheetName & "$`" 'All Steps/Tasks MUST run on VB Main Thread Dim oStep As DTS.Step For Each oStep In moPackage.Steps oStep.ExecuteInMainThread = True Next 'Execute moPackage.Execute Next 'Cleanup moPackage.UnInitialize Set moPackage = Nothing Exit Sub
Listing 10.17 is a little more complex. First, you create an Excel object and use it to verify spreadsheets.
Processing a spreadsheet requires a little more work. The DTS DataSource
does not change. Instead, you need to modify the connection SourceSqlStatement
. OLE DB exposes each spreadsheet in a workbook as SpreadSheetName$
. To get the package to access a different spreadsheet, you modify the SourceSQLStatement
, stripping off the USD$ generated for the original package and replacing it with SpreadSheet$
. See C:DTS2000VBPackagesProcessManyFilesProcessManySpeadSheets.vbp for the complete code.
You could have written code to select only currencies that interest you, but it’s less work to exclude them when you copy the data from tempdb
to the MFT database.
XML is a powerful platform-independent markup language used to represent data. It is the hottest new technology for moving data among computer applications on the same computer, different computers, or even different operating systems.
An XML document is created using an eXtensible markup language. In its simplest format, an XML document begins with the XML version tag, followed by a document begin tag (<DocumentName >), one or more elements, and a document end tag (<DocumentName >).
Each element consists of a element header ( <ElementTag>) and one or more Attribute = Value pairs, terminated by a element termination (>), as shown in Listing 10.18.
Currently, there is no OLE-DB interface for processing XML documents. As a result, DTS cannot process XML directly. Using VB, however, you can load an XML document and convert it to a .csv file that can be processed in the usual manner.
XML does not limit the number of elements in a document. Smaller documents can be processed without much effort. If the document is large (100,000 elements or more), the parser will attempt to load the entire file. At best, this may take a long time; at worst, it will consume all memory, causing the system to hang.
To deal with large XML documents, you need to read the XML stream and pass it into the parser in smaller chunks.
Sparks Corporation does work in the United States, Mexico, and Canada. It is important that they have up-to-date, accurate currency information. Previously, the accounting department maintained a currency workbook, but it recently began to store the information as an XML document, as shown in Listing 10.19 (see C:DTS2000DataCurrency.xml).
Example 10.19. XML Currency Data
<?xml version="1.0"?> <Currencies> <Currency Currency="USD: Date="1998-04-01T00:00:00" ConversionFactor="1.0000"/> <Currency Currency="USD" Date="1998-05-01T00:00:00" ConversionFactor="1.0000"/> ... </Currencies>
You have been asked to develop a program to load the XML data into the MFT database once a month.
Write a VB program to convert the XML document to .csv format. Then use the LoadCurrencyToTempdb
package to load the data into SQL Server.
Listing 10.20 shows the code for converting XML to .csv.
Example 10.20. VB Program to Convert XML Documents and Then Load Them with DTS
' DTS 2000 - Process Many Files Const mksCurrencyXmlFile = "C:DTS2000DataCurrency.xml" Const mksCurrency'.CSV'File = "C:DTS2000WorkSpaceCurrency.'.csv'" Const mksXMLVersion As String = "<?xml version=""1.0""?>" Const mksXMLStartTag As String = "<Currencies>" Const mksXMLEndTag As String = "</Currencies>" Const mksXMLTagName As String = "Currency" Const mksPackageName = _ "C:DTS2000PackagesSQLServer2000LoadXMLCurrencyToTempDb.dts" Private moPackage As DTS.Package2 Private moPackageOld As DTS.Package Private WithEvents moPackageEventHandler As DTS.Package Option Explicit Private Sub form_load() On Error GoTo PackageError ' Load XML and Convert to '.CSV' Dim xiXmlFile, xsXmlDoc As String, xsXmlText As String Dim xi'.csv'File xiXmlFile = FreeFile() Open mksCurrencyXmlFile For Input As xiXmlFile xi'.csv'File = FreeFile() Open mksCurrency'.CSV'File For Output As xi'.csv'File Print #xi'.csv'File, "Currency,Date,ConversionFactor" xsXmlDoc = "" While Not EOF(xiXmlFile) Line Input #xiXmlFile, xsXmlText 'Strip Out XML Version 'Document Begin and End Tags If InStr(xsXmlText, mksXMLVersion) = 0 _ And InStr(xsXmlText, mksXMLStartTag) = 0 Then xsXmlDoc = xsXmlDoc & xsXmlText End If 'When there is no more data 'Or the chunk is large enough, process If EOF(xiXmlFile) or Len(xsXmlDoc) > 4098 Then SaveXmlTo'.csv' xi'.csv'File, _ mksXMLVersion & mksXMLStartTag & _ xsXmlDoc & mksXMLEndTag xsXmlDoc = "" End If Wend Close xiXmlFile Close xi'.csv'File ' Load DTS Package Set moPackageOld = New Package Set moPackage = moPackageOld Set moPackageOld = Nothing moPackage.LoadFromStorageFile mksPackageName, "" 'Process File 'Point the Source Connection to the .'.csv' File moPackage.Connections("Connection 1").DataSource = _ mksCurrency'.CSV'File 'All Steps/Tasks MUST run on VB Main Thread Dim oStep As DTS.Step For Each oStep In moPackage.Steps oStep.ExecuteInMainThread = True Next 'Execute moPackage.Execute 'Cleanup moPackage.UnInitialize Set moPackage = Nothing Unload Me Exit Sub End Sub Private Sub SaveXmlTo'.csv'(ai'.csv'File, asXmlDoc) Dim aoXmlDoc As MSXML.DOMDocument Dim aoRecords As MSXML.IXMLDOMNodeList Dim aoRecord As MSXML.IXMLDOMNode Dim ai 'Declare a XML Document and Load it Set aoXmlDoc = New MSXML.DOMDocument aoXmlDoc.async = False aoXmlDoc.loadXML (asXmlDoc) 'Step Through XML Nodes and Export as '.CSV' Set aoRecords = aoXmlDoc.getElementsByTagName(mksXMLTagName) For ai = 0 To (aoRecords.Length - 1) Set aoRecord = aoRecords.nextNode Print #ai'.csv'File, _ aoRecord.Attributes.getNamedItem("Currency").Text & "," __ & aoRecord.Attributes.getNamedItem("Date").Text & "," __ & aoRecord.Attributes.getNamedItem("ConversionFactor").Text Next End Sub
In the first half of the code, you parse the input XML document and break it into manageable chunks—in this case, 4K, but larger chunks (64K or more) are possible.
While reading the input document, you strip off the XML, Document Start and Document End tags. However, Microsoft’s XMP Parser requires properly formatted XML documents. Prior to giving the chunk to the parser, you must wrap it in each chunk with XML Document Start, and Document End tags.
To process the chunk, set the parser’s async
property to False
and force synchronous loading. Use getElementsbyTagName
to retrieve a collection of records. Use nextNode
to step through every record, and use Attributes.getNamedItem("AtributeName")
to retrieve field data, which is formatted and written to a .csv file.
When the original file is consumed, you can pass the .csv file to the LoadCurrencyToTempdb DTS package for processing. See C:DTS2000 VBPackagesProcessXmlProcessXml.vbp for the complete code.
All that remains is to insert or update the records from tempdb
into your destination database.
As an alternative to the solution discussed in this section, you can use the XML Bulk Load component from Microsoft. The XML Bulk Load component gives you high-speed XML-based data loading directly into SQL Server tables. This component is distributed free from Microsoft at http://www.microsoft.com/xml. When it is installed, you can execute from any programming language that can work with ActiveX components, such as Visual Basic, VC++, and even VBScript in DTS.
To load an XML document using XML Bulk Load, you will need to provide a mapping of the XML elements to SQL Server tables and fields; this mapping is called XDR Schema. The XDR (XML Data Reduced) Schema is an XML-like document that describes the structure of an XML document and can contain the relational equivalence for the data. For example, the XML Document in Listing 10.19 will have an XDR Schema similar to the one in Listing 10.21.
Please note that the XDR Schema has a section that contains the XML element’s name and associated data type, followed by a section that maps each element to the DBMS sql:relation
(table in SQL Server) and sql:field
(field in the referenced SQL Server table). For this example, the data goes into the table Currency and loads data into three fields: Currency, Date, and ConversionFactor.
Example 10.21. XDR Document for Listing 10.19
<?xml version "1.0" ?> <Schema xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:xml:datatypes" xmlns:sql="urn:schmas-microsoft-com:xml-sql"> <ElementType name="Currency" dt::type="string"/> <ElementType name="Date" dt::type="datetime"/> <ElementType name="ConversionFactor dt:type="numeric"/> <ElementType name="Currencies" sql:relation="Currency"> <element type="Currency" sql:field="Currency"/> <element type="Date" sql:field="Date"/> <element type="ConversionFactor" sql:field="ConversionFactor"/> </Element type> </Schema>
To use the XML Bulk Load component in DTS, you need to use the ActiveX Script Task. First, create an instance of the XML Bulk Load object, connect it to SQL Server by setting the ConnectionString
property, and then execute it while referencing an XDR Schema and the XML document. Listing 10.22 shows the structure of the code. Please note that the XML Bulk Load has a full object model to support the work, and many more properties are available than the ones discussed in this example. The code in Listing 10.22 sets the ErrorLog
file location.
Example 10.22. ActiveX Script That Loads an XML Document called SampleXMLDocument, Using the SampleXDRSchema.xml for the Data Mappings, and Outputs All Errors into ErrorLog.txt
set oBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad") oBulkLoad.ConnectionString = "provider=SQLOLEDB.1;data source=ServerName;database=DatabaseName" oBulkLoad.ErrorLogFile = "c:ErrorLog.txt" oBulkLoad.Execute = "c:SampleXDRSchema.xml", "c:SampleXMLDocument.XML" set oBulkLoad = Nothing
The XML Bulk Load can load data into either single or multiple tables; you just have to make sure that the correct mappings are in the XDR Schema. For more information on the XML Bulk Load as well as full documentation, visit the XML site at Microsoft.com. For more information on XDR Schemas, see SQL Server Books Online.
Email is pervasive in modern business because it is one of the simplest methods of moving data among different computers or even operating systems. Many programs email data to subscribers automatically. What is missing is an automated method for loading email data into SQL Server.
MFT has relocated accounting to an office in another city. Each month, it emails Currency.xml to you. You need to develop a program to scan incoming email and process any messages that contain Currency.xml data.
The best solution to the preceeding scenario is to develop a program to scan email and feed any Currency.xml documents to the package developed in the previous solution.
When developing the application, you need an early-bound reference to a Microsoft Outlook-type library. Table 10.8 lists the filenames of the type libraries for the different versions of Microsoft Outlook.
Table 10.8. Outlook Object References
Outlook Version |
How Type Library Appears in References List |
Filename |
---|---|---|
Outlook 97 |
Microsoft Outlook 8.0 Object Library |
msoutl8.olb |
Msoutl8.olb |
Microsoft Outlook 98 Object Library |
msoutl85.olb |
Outlook 2000 |
Microsoft Outlook 9.0 Object Library |
msoutl9.olb |
The code in Listing 10.23 shows scanning and processing email.
Example 10.23. VB Program to Scan Emails and Process Them with DTS
Private moPackage As DTS.Package2 Private moPackageOld As DTS.Package Const mksPackageName = _ "C:DTS2000PackagesSQLServer2000LoadCurrencyToTempDb.dts" Const mksCurrencyFile = "C:DTS2000DataCurrency.xls" Option Explicit Private Sub Form_Load() Dim xoOutlook As Outlook.Application Dim xoNameSpace As Outlook.NameSpace Dim xoMail As Outlook.MailItem Dim xi'.csv'File Dim xsTry As String On Error GoTo PackageError ' Start Outlook.. Set xoOutlook = CreateObject("Outlook.Application") ' Logon. No affect if you are already running and logged on Set xoNameSpace = xoOutlook.GetNamespace("MAPI") xoNameSpace.Logon ' Open '.csv' File xi'.csv'File = FreeFile() Open mksCurrency'.CSV'File For Output As xi'.csv'File Print #xi'.csv'File, "Currency,Date,ConversionFactor" ' Cycle Through Inbox, looking for CurrencyXML Message(s) For Each xoMail In xoOutlook.Session.Folders( _ "Mailbox - Brian Sullivan").Folders("inbox").Items If UCase(olMail.Subject) = "CURRENCY XML" Then SaveXmlTo'.csv' xi'.csv'File, _ mksXMLVersion & mksXMLStartTag & _ xoMail.Body & _ mksXMLEndTag End If Next ' Clean up Outlook xoNameSpace.Logoff Set xoNameSpace = Nothing Set xoMail = Nothing Set xoOutlook = Nothing ' Close '.CSV' file Close xi'.csv'File ' Load DTS Package Set moPackageOld = New Package Set moPackage = moPackageOld Set moPackageOld = Nothing moPackage.LoadFromStorageFile mksPackageName, "" ... End Sub
The first portion of Listing 10.23 deals with scanning email and extracting XML data to a .csv file. The rest of the code is similar to previous examples.
Divide and conquer is a powerful tool. Separating data extraction from data loading allows you to use known tools to find the data and then allows DTS to transform and load it quickly. See C:DTS2000VBPackagesProcessOutlookProcessOutlook.vbp for the complete code.
In this chapter we have gone through several situations that showed you how to use Visual Basic to run, control, and respond to DTS packages. We covered event handling, error handling, and executing packages. We have used those basics to make DTS packages participate in a variety of solutions in which an external program can provide assistance. This is by no means all there is about writing applications to interact with DTS, but it is my hope that this chapter will provide you with a framework for many of your Visual Basic and ETL application needs.
[1] All data, scripts, and programs used in this chapter are available for download from http://www.magenic.com/publications or http://www.newriders.com. The self-extracting ZIP file will create a directory structure based on DTS2000
. TheVisual Basic programs discussed in this chapter are available in DTS2000VBProjectsBestPractices
. You can extract the files to any drive or directory, but all file references in this book are based on C:DTS2000
.
[2] All data files and scripts used in the book are available for download from http://www.magenic.com/publications or http://www.newriders.com. The self-extracting ZIP file will create a directory structure starting with DTS2000
. You can extract the files to any drive or directory, but all file references in this book are based on C:DTS2000
.
[3] All data files and scripts used in the book, including this package LoadScannertoTempDB. dts, are available for download from http://www.magenic.com/publications or http://www.newriders.com. The self-extracting ZIP file will create a directory structure starting with DTS2000
. All packages in this book are located in the Packages subdirectory of the DTS2000
directory.
18.217.206.10