It's no secret that developers can run afoul of scalability problems when creating solutions backed by SQL Server or other relational databases. In fact, earlier this month, I complained about developers who mistakenly assume that NoSQL is the solution to all scalability and performance problems they encounter when building applications based on relational databases. In my experience, however, one of the easiest ways developers cause themselves significant pain with scalability problems is by using triggers.
The Problems with Triggers
In theory, triggers serve as specialized code that fires when a triggering event occurs within a database. Within SQL Server, there are two primary types of triggers: Data Definition Language (DDL) triggers and Data Manipulation Language (DML) triggers. DDL triggers, as the name implies, 'fire' when data definition language events (like the creation of new code, structures, logins, etc.) are executed—and those triggers are not the subject of this article. DML triggers, the triggers that execute when data is modified or manipulated are, however, subject to a number of major concerns as follows:
- Obscurity. When debugging, troubleshooting, or trying to find the root cause of a bug or what's causing data modifications within complex systems, triggers are insanely easy for developers (and even DBAs) to forget about. Put another way, if developers know that a certain section of their app/code is tasked with handling, say, INSERTs against a particular table, and the developers keep encountering some values ending up slightly different than they were when sent, developers will commonly assume that there's a bug somewhere in the code—before they remember that a trigger deployed potentially years ago is silently watching all INSERTs and occasionally modifying them to force data to line up with business rules that might no longer apply. This same 'deployed but easily forgotten' issue with triggers can also cause additional headaches when troubleshooting performance issues.
- Complexity. Aside from the fact that there are good reasons to avoid putting business logic in the database (there are good reasons to argue for such a practice as well), the fact is that triggers are more complex than they initially appear. For example, one of the most common recurring problems with triggers is that the developers who create them regularly think in terms of a single row being modified and caught in the trigger they're coding. As such, it's very common to see situations where developers mistakenly create triggers with erroneous scalar semantics that cause problems when something like an UPDATE statement affects multiple rows. Then, there's also the issue of whether the DML trigger being created should fire before, after, or instead of the DML that would have normally occurred—all of which combine to make it that much harder to figure out what's going on when triggers are present.
- Performance / Scalability. Triggers, by definition, run every time a DML operation runs, using logic and optional filtering to determine whether or not the trigger should ignore, block, rollback, or modify the DML changes attempted. Likewise, because of how DELETED and INSERTED pseudo-tables work and enable joins against the true/source table being checked, it's typically quite common to see operations with triggers requiring more granular and expensive locks than would normally be the case. In other words, a table without triggers will perform and behave better than a table with triggers in almost any case—which is a key consideration when it comes to scalability—because while a trigger may work fine on smaller databases with smaller numbers of users, as the amount of data in a table increases and the amount of users increases, triggers grab longer and longer locks and impose greater and greater performance and scalability problems. In fact, in some of the worst scalability problems I've ever encountered, there have been a number of ugly and thoroughly ensconced triggers that were making things significantly worse than they should've been.
When to Use Triggers
Given that triggers are complex, easy to miss when troubleshooting, and cause major problems in larger and more heavily-concurrent systems, is there a place or role for them in any system? Personally, I argue against using them pretty much anywhere or for anything. By and large, most of the logic they enforce could be moved into stored procedures or into whatever code is being used to modify data already. In fact, I'd argue that the use of triggers typically signals a systemic or architectural problem, because developers frequently start using triggers as a shortcut to enforce business rules against a table—instead of the code that modifies it—as a catch all against the existing logic and functionality that's already in place. This would be analogous in some ways to trying to write pre- and post-filters against web requests to insert or modify HTML headers and footers, instead of finding the code creating these objects elsewhere and modifying that code as needed.
Those considerations and concerns aside, there is one place where triggers can make sense: simple auditing. Auditing that needs to be done for regulatory compliance is typically too extensive and invasive for triggers and should be tackled by a (typically expensive) third party auditing solution. But, for simple scenarios where you need to keep tabs on data in particular rows as they're modified, it's possible to set up simple triggers that will capture and direct INSERTED, DELETED, and UPDATED rows into a 'TableName_Audit' table where you can track the values and metadata about who performed the operation and when it was performed without too much overhead. This isn't to say that using triggers in this fashion ends up being free of the problems listed above—instead, it's just to say that there is one area where I feel that triggers can and do have a role. If, however, you end up building triggers to be used in this fashion, just make sure to account for the fact that INSERT, UPDATE, and DELETE, as well as MERGE operations can and will modify more than a single row at a time, and realize that, as your audit tables become larger and larger, the amount of time it will take to drop rows into them may eventually cause problems if indexing against your audit tables is excessive or not properly optimized for INSERTs. Otherwise, I strongly recommend against using triggers because of the problems they invariably end up imposing.