SQL Server Integration Services (SSIS) is a powerful data transformation engine that ships with SQL Server. In other words, when you purchase a license of SQL Server, you get a “free” or included license of SQL Server Integration Services. This an obvious win for SQL Server users, but some organizations that don’t even really use SQL Server have been known to purchase SQL Server licenses just to gain access to SSIS. It's that powerful. However, while SSIS is an insanely beneficial tool, it does come with some hidden costs and expensive potential gotchas.
The Primary Purpose of SSIS: ETL and Data Warehousing
The primary benefit of SSIS is flexibility. With SSIS, developers drag and drop various heterogeneous data sources and destinations on to a designer where data can then be transformed or migrated via a powerful pipeline that enables an almost endless number of transformation options and capabilities. In simpler terms, SSIS makes it easy to “grab” data from one or more locations, “transform” or reformat it in memory while in the pipeline, and then “spit it out” into a totally different location.
But, while SSIS is suitable for all sorts of different kinds of tasks (like pulling data out of flat files from vendors and “loading” that data into your inventory or products database for sale), the reality is that SSIS is predominantly focused and used for data warehousing. In this context, SSIS does a great job of facilitating ETL (extract, transform load) operations--or, the idea of pulling data out of one source, flipping it around and tweaking it as needed, and dropping (or loading) it into a data warehouse destination.
In fact, it’s probably safe to say that the majority of the folks who use SSIS are, effectively, ETL ninjas who eat, sleep and drink SSIS and data transformation out of OLTP sources into OLAP destinations. And, frankly, for these ninja types, most of the caveats or complaints I’ll mention here today almost don’t even register because the problems I’ll describe are “hidden costs” that really tend to bite “casual” or almost “recreational” users of SSIS.
Problems and Hidden Costs for Casual SSIS Users
Because SSIS is powerful, it’s non-trivial to use. So, for those who don’t use SSIS day in and day out, there are also some potential long-term problems, hidden gotchas and additional costs that need to be addressed whenever SSIS is used to solve a problem. And while each of these concerns could probably be a post by themselves, in the interest of time I’ve boiled them all down to a quick bullet list of major problems.
- SSIS is brittle – by design. This is almost a non-concern in that SSIS’ brittleness is something you’ll bump into right out of the gate. And, to be fair, a solution that enables in-memory transformation of data can’t be “causal” with data types or afford to make “guesses” about what kinds of truncation or changes to numeric precision would be tolerable. As such, this means that SSIS packages can be a bit of a pain to build: You have to get casts and conversions perfect or your packages will simply crash and burn. But, again, that info is a given. Where this becomes problematic, however, is when casual users of SSIS build packages, let them run for months or years, and then need to come back in and change these packages to either upgrade to a newer version of SQL Server or to address schema changes or workflow modifications. In situations like this, working with SSIS can be a royal pain-- so much so that you’ll frequently find that it’s quicker and easier to destroy and recreate large sections of your package, rather than trying to get existing components to work with underlying schema changes. The problem, though, is that this is time consuming and adds to cost of ownership over time.
- Licensing requirements and options can be confusing. Nothing about SQL Server licensing is really very intuitive. However, when it comes to SSIS, it’s important to realize that SSIS is a SQL Server component or feature. In other words, SSIS is a server-level engine, and while you can set up specialized/dedicated instances of SQL Server to primarily serve as SSIS “servers” for running your packages, those servers are going to require full licenses of SQL Server. This isn’t much of a gotcha, unless you’re thinking you can just “offload” SSIS tasks on to a different server. (Doing so is fully supported, but it’s going to require that server to be fully licensed.)
- Migrating SSIS packages can be a nightmare. When developers create SSIS solutions, or packages (via Visual Studio), they can deploy them to a server and/or save them locally against the file system. Because these packages contain sensitive information (like database connection strings), the default way in which these packages are saved locally is to encrypt sensitive information against the current user’s Windows credentials. This is great from a security standpoint, but absolutely horrible from a source-code perspective. Why? Because it means that if your employees ever leave, or if substantial changes happen in Windows (that is, if you’re in a work group and just repave your workstation), then you’ll never be able to open those older packages again as they were because you won’t be able to un-encrypt the sensitive info. If you Google, you’ll find some potential work-arounds that can help you strip sensitive info and try to reload packages without those details, but I’ve sadly seen entire projects become un-recoverable. They had to be built completely from scratch when major “Windows user” changes happened and the work-arounds simply wouldn't work. As such, make sure that if you wish to keep packages long term, that you chose to save them by encrypting sensitive information with a password (that you keep safe elsewhere). Otherwise, you can be looking at a major loss to continuity when a package needed for important business operations needs to be changed after years of operation.
SSIS Tooling with SQL Server 2012 and above can be a pain to find. Tooling for SSIS has always been a very sore spot, given that it’s long been a half-assed "bolt-on" to Visual Studio. It has, typically, lagged in support of main-line versions of Visual Studio, and it's had substantial compatibility problems over the years. In the SQL Server 2012 timeframe, Microsoft tried to remedy this by “breaking” these tools out into a more stand-alone set of tools that are tied in “lock-step” to full releases of Visual Studio. By and large, this is and was a good thing. The ridiculous way in which they did it though, has caused--and continues to cause--no end of confusion among users. This blog post provides an overview of the problem. A bigger problem, though, is that this means if you now Install SQL Server 2014, there are absolutely no tools whatsoever available on the SQL Server 2014 installation media for managing and creating SSIS packages. For SSIS ninjas who use the tool day in and day out, this sadly makes sense. But for folks like me who aren’t idiots but who also don’t eat, sleep and drink SSIS, this is about as lame and confusing as it gets--especially when the SQL Server 2014 installer media shows the following:
Again, SSIS is an insanely powerful tool. And, again, most of the problems I’ve raised here are things that SSIS ninjas don’t really bat an eyelash at (though I know they’re as sick of versioning problems and naming confusion as anyone). But for non-ninja users, SSIS isn’t all rainbows and unicorns. It's a complex tool, so it’s going to come with some learning curves and caveats. The trick then, is to be as aware of as many of them as possible so that you’re better prepared. Hopefully, this post has helped outline some of the bigger concerns or problems for anyone thinking about using SSIS for any of their own needs.