If you've programmed with Data Transformation Services (DTS), you'll probably agree that error handling is one of the most confusing and challenging problems the DTS developer faces. In our July 1999 article, "The DTS Development Guide," we looked briefly at the built-in error-handling options the Package Designer offers. This month, we discuss in depth some programmatic opportunities for using the DTS Object Model to handle errors. We also examine DTS event handling.
The DTS Perspective on Errors
DTS doesn't consider an error to be a definitively right or wrong result, but rather the status of work accomplished. DTS leaves the interpretation of that status to the developer. Thus, developers can control the execution of a package, task, step, or transformation by monitoring the state or return value of the executed operation. For example, when transforming data via an ActiveX script, developers can control the processing by setting the return code in the script to any one of the values in the DTSTransformStatus enumeration list, which Table 1, page 46, shows. Table 2, page 46, lists other DTS status codes and result constants (their specific values are available in Books Online—BOL). Each of these lists of predefined constants lets developers determine whether, and in what manner, transformation processing will continue.
If you've previously developed only with the Package Designer, you haven't seen these values before because, aside from DTSTransformStatus, which is available to your ActiveX transformation scripts, the values are exposed only when you're writing directly to the DTS Object Model. Although the Package Designer lets you establish error-handling functionality in several places (we'll examine these shortly), it forces you to handle errors reactively. In other words, when the package encounters an error, it can only report or log the error. From then on, you have little control over the package execution. Depending on its type and source, an error generally causes the package execution to abruptly terminate. As a conscientious developer, you want to handle errors proactively—that is, you want to fully control how, when, and whether the error messages are communicated. You also want the flexibility to control the direction a package takes after it encounters an error.
Errors and the Package Designer
Let's review the error-handling capabilities the Package Designer affords within a package's logical control levels: package level, task level, and transformation level. At the highest level—the package—the designer offers little proactive control, as Screen 1, page 46, shows. The Designer lets developers designate an output file for logging errors or direct DTS to write the package's completion status to the Event Log. But both options are reactive. Developers can handle errors more proactively by enabling the Fail package on first error attribute, which tells DTS to halt execution after the first task failure. This specification might seem unnecessary, but even after a task fails, a package attempts to continue processing and uses precedence constraints to control its actions.
At the package's task level, precedence constraints offer the sole means of error control. In the Package Designer, this level provides the best proactive support. By using the Workflow Properties to manage workflow (right-click a constraint, and choose Properties, as Screen 2 shows), developers can designate which tasks happen and in what order, and can designate alternative tasks. For example, a common task in a data-warehousing scenario loads and scrubs raw data from various source systems before moving it to operational data tables. In this situation, as the arrows in Screen 3, page 48, illustrate, precedence constraints control the workflow from one task to the next (green arrows represent On Success constraints, and red arrows represent On Failure constraints).
The final level in the Package Designer, the transformation level, contains two types of error handling. The first type deals with the transformation in general. As with package-level errors, you use a simple dialog box to control the transformation settings, as you see in Screen 4. These settings are available from both the Data Driven Query task and the simple Data Pump task. Reactively, you can designate an output file and its format for logging transformation exceptions. You can proactively establish the maximum number of transformation exceptions before the transformation task aborts. As at the package level, you don't have much control at the transformation level.
The second type of transformation-level error handling offers you more control. An ActiveX transformation controls this type, which falls within the transformation. By using the various transformation and execution status codes, you can skip rows, reprocess them, and designate rows as exceptions without causing the task to fail. For example, suppose you need to process a large data file containing billing information. As part of the transformation process, you want to allow importing of only those rows that originate in Department 1234. Listing 1 contains a segment of control logic that you might use to perform this task. Notice how the logic uses the transformation status codes of DTSTransformStat_OK, DTSTransformStat_SkipRow, and DTSTransformStat_ExceptionRow to filter the data. By extending this logic, you can also perform any necessary cleansing operations, such as marking invalid amounts as exceptions.
The transformation level is as far as you can go in handling errors with the designer. Remember that the Package Designer simply acts as a wrapper, exposing some limited functionality of the underlying DTS Object Model. So, to reach the next level of control, you must peel away the wrapper and use the object model directly.
Errors and the Object Model
Developers who want to leverage the power of DTS quickly abandon the Package Designer and move on to designing custom packages by using the DTS Object Model natively. However, this approach poses a problem: How can you take advantage of the added functionality the object model exposes, while retaining the detailed error messages the package designer offers? After all, one nice feature of running packages through the Package Designer is the status dialog box, which details the tasks and their execution status. Most important, when something goes wrong, the Package Designer provides a dialog box containing a detailed error description. Because the Package Designer acts as a wrapper that exposes the underlying object model, the same errors the Package Designer reports must also be directly available in your Custom Packages, right? And they are! Before you examine how to obtain error information, you need to understand a few rules. The first rule is that package execution requires one or more steps. The second rule is that every task must be associated with at least one step. For simplicity, when you're developing packages graphically, the Package Designer hides both these rules. So, what do these rules have to do with errors? Put simply, packages and steps, not tasks, expose interfaces for examining error information. Now let's examine the two error-retrieval methods, beginning with the reactive approach the Step object offers.
The first and simplest way to access a package's error information is through a step's GetExecutionErrorInfo method. We classify this method as reactive because it can only report on errors; it offers no direct ability to control subsequent execution of the package. In fact, this interface is available only after package execution is complete. For example, suppose your requirements are to develop a package and report any steps that failed. To demonstrate this process in action, we used Visual Basic (VB) to develop a package that attempts to run a stored procedure. (Subscribers can download the complete VB projects and all the code associated with this article at http://www.sqlmag.com.) The package contains an Execute SQL task called SQLTask and a step called ExecuteSQLStep. Listing 2 contains a sample of the code. After execution, the application queries the package's Steps collection and checks each Step object's ExecutionResult property to determine whether an error occurred. If so, the application calls the step's GetExecutionErrorInfo method to retrieve the error details. A message box then presents the results to the user. On first execution, the application calls the stored procedure sp_who and completes successfully. For the second run, change the stored procedure name from sp_who to sp_whois (which doesn't exist) and run the package again. Screen 5 shows the resulting error message. By quickly looking at the message, you can tell which step contains the error (ExecuteSQLStep) and which component is in error (Microsoft OLE DB Provider for SQL Server). And you can get a good description of the error: Could not find stored procedure 'sp_whois'. Now that you've seen the reactive side of capturing error information, let's examine the proactive approach.
The DTS Package object lets applications respond through events to execution state changes. By default, DTS provides five events that developers can use to monitor the internal processing of a package. The OnFinish event executes at the completion of each task or step. The CustomTask uses the OnProgress event object to report progress. The OnQueryCancel event lets you terminate a running task, the OnStart event executes at the start of a task or step, and the OnError event provides information about an error. Developers can proactively control a package through these events. Let's look at how proactive error handling uses the OnError event.
Any time a DTS component or executing task raises an error, the OnError event fires. The event fires before a package returns from its Execute method. This distinction is important because it lets you choose which errors are terminal and which errors to handle. Besides reporting the error information, the OnError event also gives you a means to ignore errors and continue package execution. This option is helpful because by default, when a package encounters an error, it terminates execution.
To illustrate OnError error handling, we used a derivation of our VB application to serve as a model. This application implements the package as a class named cMyPackage, which subscribers can download at at the link to this article at http://www.sqlmag.com. This implementation is necessary for VB to handle the events.
To enable events within an application, VB uses the keyword WithEvents on an object variable declaration. Using WithEvents, however, limits how and where you can declare the variable. For example, you must declare within an object any variable that uses WithEvents (this variable might be a form or a class module). Also, you can't use WithEvents with the New keyword. The final gotcha in dealing with VB and events is the requirement that, to use events, an application must support all events the underlying object exposes. In this example, then, you need to supply the four remaining package events, although they won't contain any code.
To enable events within the class, use the WithEvents keyword and declare the package variable as follows:
Private WithEvents oPackage As DTS.Package
The remainder of the class contains the same functionality as the earlier example. The difference is that this time, the main program logic is encapsulated in the class' Execute method. You can see where we declared all the events at the end of the code.
Callout A in cMyPackage highlights the implementation of the OnError event. The event contains all the same error-reporting data as GetExecutionErrorInfo. In addition, by using the Cancel parameter, you can control whether package execution continues or stops. For this example, leave it set to the default value of TRUE, indicating that the package will terminate execution.
To use the new package, create the driver module in Listing 3. Using the package is as simple as declaring an object of cMyPackage type and firing its Execute method. On the first run, the stored procedure sp_who successfully executes. Because no errors occur, the OnError event doesn't fire. Next, change the stored procedure name to sp_whois. When the package executes this time, an error occurs, which in turn causes the OnError event to fire and display the message box in Screen 6.
So, How About Your Errors?
Using a combination of the OnError event and the GetExecutionErrorInfo method gives you many options for satisfying your error-reporting and flow-control needs. Also, by exploring other package events, you can quickly build a status console similar to the one the Package Designer provides. Although you might be able to deal with errors in other ways, understanding how DTS handles errors is the first step to building robust custom packages. As always, we're interested in hearing how you're handling errors, so email us about your experiences.