DATA TALES #12: The Case of the Code that Refused to Execute

Over recent months the articles that I’ve written have focussed on the database engine. I’ve recently rolled many of the tools that I’ve been building for working with the database engine into a set of free downloads. You’ll find them here:

http://www.sqldownunder.com/Resources/SDUTools

But today’s tale relates to a SQL Server Integration Services (SSIS) package. It used the same architecture as other packages and on the same server, and even though it said it was successful, nothing happened. Let’s discuss why.

Some Background Information

At the site, a separate SSIS server (using SQL Server 2014) had been deployed. The server was used to run all the organization’s SSIS packages. These packages connect to several data sources:

SQL Server 2012
SQL Server 2014
SQL Server 2016
Flat files
IBM AS400

Many packages have previously been deployed and were working fine. The biggest challenge with having a separate SSIS server is that it invariably leads to double-hop issues, unless the Kerberos configuration on all the servers involved has been carefully set up. At this site, a variety of Kerberos issues had happened in the past.

Getting Kerberos configuration correct still seems to be harder than it should be at most sites. The most common problems that I see with Kerberos configurations that don’t work at all are:

SPNs not correctly configured
Constrained delegation not correctly configured for the combination of the service account and service.

If Kerberos configuration is new to you, I’d suggest the following article as a good starting point:

https://blogs.msdn.microsoft.com/psssql/2010/06/23/my-kerberos-checklist/

For SQL Server specifically, there is now a tool for troubleshooting Kerberos configuration. The download is here:

https://www.microsoft.com/en-us/download/details.aspx?id=39046

A blog post that describes the tool is here:

https://blogs.msdn.microsoft.com/farukcelik/2013/05/21/new-tool-microsoft-kerberos-configuration-manager-for-sql-server-is-ready-to-resolve-your-kerberosconnectivity-issues/

Generally I find the tool ok for confirming that things are working, but not all that great when things aren’t working in the first place. For that, checking the items in the previous link should help.

More concerning is when I see intermittent Kerberos issues. The main causes that I see for that are:

Time sync between servers (you cannot have any real difference in time between any of the servers involved)
Server farms or replicas (one server works but another that you fail over to doesn’t so it’s random which one you hit)

The Symptom

The client told me that they had modified an existing SSIS package and deployed it to the server as a separate package. Since they did that, the SQL Server Agent job that ran the modified package (let’s call it package WontWork) completed without errors. However, none of the work that it was meant to do was performed. The previous version of the package (let’s call it package StillWorks) was still working exactly as expected.

The client was puzzled how a job that appeared to run correctly could actually do no work at all.

The Package Design

I looked at what the package did. It located several files from an Input folder, checked that they were accessible and moved them to a Staging folder for processing. For each file in the Staging folder, it then checked the file type and applied processing for each type of file. Mostly that involved loading the data into their SQL Server database that was used for staging data. Each file that was processed was copied to a Copy folder (overwriting any previous version of the same file), then if all files were processed correctly, the files were moved to a Processed folder. Finally a stored procedure on the database server was called to process the staged data.

The Investigation Part 1: Security

The first thing that concerns me with these types of packages is the For Each container that cycles through the files. I’ve seen these containers simply return an empty list of files when the service account that the job is running as did not have permissions on the target folder.

Packages like this should be configured in SQL Server Agent job configuration to run as an appropriate Proxy account. One common mistake that I see is packages that run as the SQL Server Agent service account. This should be avoided. If you do that, the Agent service account needs to have way too many permissions. Instead, an appropriate SQL Server credential should be set up for a separate service account that has permissions to run the package. If you haven’t configured credentials before, I suggest you investigate them. They appear in in Object Explorer. This option has been there since SQL Server 2005 and is a server-level credential:

Note that in SQL Server 2016, and Azure SQL DB, some of these security options are now available at the database level instead. You can read about database scoped credentials here:

https://msdn.microsoft.com/en-us/library/mt270260.aspx

A credential simply associates an external identity (in this case a Windows login) with a SQL Server object name.

To use a credential for executing an SSIS package, you need to also create a proxy. In SQL Server Agent, a proxy is basically an authorization for a given subsystem (like SSIS, PowerShell, etc.) to be configured to use a credential. Here is an example of a proxy being configured:

The proxy has a name, an associated credential, and a list of subsystems that can use that credential.

Once that is configured, a job step of that type can be configured to run as that proxy:

I checked the configuration in this area and found no issues. To be sure of the required access, I logged using the same credential, and was able to work with the files in those folders without issue. I also checked in the Agent Job History that the job step did in fact execute as the account that it should have.

So the problem lie further afield.

The Investigation Part 2: SSIS Logging and Reports

The next thing that I checked was the standard SSMS report that can be run for a package. It shows each execution, then lets you drill into each execution to see what occurred. There are different levels of logging available. Most times, I find the default basic logging to be quite sufficient to get to the bottom of situations, but occasionally I need more detail.

When configuring or altering a SQL Server Agent SSIS job step, you can configure the logging level. You can also configure it in SQL Server Data Tools when designing the package.

I set the logging level to Verbose, re-executed the package and carefully looked through the output.

The things that I discovered at this point were:

The package appeared to run successfully
Connections were being successfully opened
Folders were being accessed successfully
The first For Each container was looping through all the files
The first step in each loop was to find the file extension. That was working correctly.
The second step in each loop was to check if the file could be accessed. This also appeared to have executed.

However, the third step in the loop was to move the file. This was never occurring. It appeared that the code to detect if a file was accessible was always saying that the file wasn’t accessible. This meant that the rest of the package was totally ignoring the files.

So at least I now knew where the issue lied, or so I thought. I presumed there was a bug in that code.

The Investigation Part 3: IsFileAccessible Script Task

The package was using a script task to determine if the file could be accessed. This was done to avoid trying to move a file that was still being written by the source system. The code was as follows:

public void Main()
{
    Dts.Variables["IsFileAccessible"].Value 
    = IsFileAccessible(Dts.Variables["MoveFromLocation"].Value.ToString());
    Dts.TaskResult = (int)ScriptResults.Success;
}

bool IsFileAccessible(string FilePath)
{
    bool isAccessible = false;

try
    {
    FileStream fs = File.Open(FilePath, FileMode.Open, FileAccess.Write, FileShare.None);
    fs.Close();
    isAccessible = true;
    }
    catch
    {
    isAccessible = false;
    }

return isAccessible;
}

It was hard to see any issue with the code that would cause this problem. The approach taken was to try to open the file (based on the variable MoveFromLocation) in Write mode without sharing. If it was possible to do that, the file is clearly accessible. If it was opened, it was immediately closed again. Either way, the variable IsFileAccessible was being set.

I checked that the precedence constraint correctly checked the value of the variable, to determine whether or not to move the file.

The Investigation Part 4: The Plot Thickens

I added some code to output the values of the MoveFromLocation and IsFileAccessible values to a table and re-ran the package. No matter what the filename was, the IsFileAccessible variable returned false.

I was now really starting to get puzzled. I was left wondering how that script task could fail like that.

Finally, I modified the script task to just set the value to true every time. Always the variable was false after it ran.

I checked that the variable was correctly configured as a Read-Write variable in the script task. I’ve seen scenarios where variables were accidentally configured as Read Only. When this happens, they can be modified within the script but the value is never copied back to the package. That also wasn’t the case.

Finally, I was left wondering if the script task was running at all. I added a reference to System.Threading and added a wait via:

Thread.Sleep(10000)

I wanted the task to wait for 10 seconds. But when the package ran, the task returned success yet no wait occurred. I then realized that the script task really just wasn’t running.

Uncovering the Issue

Given the bizarre nature of the situation where a package runs ok but script tasks don’t execute, I started wondering about .NET versions on the server. But then I realized that other packages on the same server still ran and they used the same script task.

I knew there had to be an incompatibility.

Finally, I realized that a later version of SQL Server Data Tools (SSDT) had been used. These versions of SSDT can target multiple output server versions.

I checked the package properties and sure enough, there was the issue:

When the package was opened, it was configured for SQL Server 2016 and the target SSIS server was SQL Server 2014. When you create a new empty package (I’ve just used a default name here), in Solution Explorer, the package looks like this:

But if you change the target server version property, it looks like this:

In the end, once this change was made and the package was redeployed, everything worked as expected.

Summary

I have to say that this issue really surprized me. I would have expected that trying to deploy a SQL Server 2016 package to a SQL Server 2014 server would have failed outright. Instead, for this package, everything worked as expected, except that script tasks that said they worked, simply did nothing.

I hope this helps someone.

Comments

Plain text