A big part of what I do as a SQL Server consultant is help organizations with their T-SQL code.
Sometimes that occurs when troubleshooting or performance tuning leads me to a poorly performing query. Other times that’s part of a more formalized code-review of existing code where I’m looking for the following problems or concerns:
- Potential for SQL Injection
- Improper Error Handling
- Improper use of Transactions
- Improper use of input/output return parameters
- Data-Type coercion
- Non-SARGable code
- Anything else that makes code hard to manage, maintain, and extend (i.e. BAD coding practices)
Ultimately, whenever I review code, I tend to analyze against the following priorities – listed in descending order of importance:
- Maintenance / Extensibility
Stated differently, code is almost always a trade-off of different concerns. One easy way to think of this is in terms of actual cost – or the amount of money it takes to both produce and maintain code. For example, ‘cheap’ developers without the necessary skill and background necessary to properly optimize code can easily create code that costs less in terms of capital expenditures when ramping up a new project or business venture.
Related: Working with Variables in T-SQL
But, the cost of maintaining that ‘cheaper’ code over time can be substantial – especially if that code is bug-ridden or results in performance overhead that unnecessarily bloats hardware (and potentially licensing) requirements. Similarly, if this same ‘chea'p’ code is poorly written against prevailing standards or ‘obtuse’ (i.e., to understand), that can mean that it’s much more expensive to modify and extend as needed – incurring ongoing, additional, cost in the form of technical debt.
On the other hand, developers that over-build and over-optimize fall prey to Donald Knuth’s statement that “premature optimization is the root of all evil” and can delay the adoption of new business initiatives to such a degree (or bloat capital expenditures such) that projects don’t ship on time or to the point where businesses can also lose significant money in terms of missed opportunity costs as well.
Coding is always about Trade-Offs
All of that said, while performance is a huge concern for most end-users, it makes sense that a given query or sproc might end up being just a BIT harder to maintain or extend because the developers (and/or DBAs) who wrote it decided to take a bit of a non-standard approach to coding the query at hand in order to make it 100x or even 1000x faster than it would have been if ‘ease of understanding’ and ‘ease of maintenance’ were the sole/only considerations. Likewise, as important as it might be to let overall code complexity take a bit of a ‘hit’ to ensure better performance in some cases, no performance benefit would ever be worth a security risk. Consequently, while there are trade-offs in ways that priorities can impact the overall quality of code, some things (security) should always be sacred.
So, while T-SQL code will always be complex (not just in terms of the technical complexity involved in correctly executing queries against non-trivial relational database management systems where a single screw-up can be fatal, but also in the sense that costs that might ruin one company are totally acceptable to another), I wanted to focus on just a single facet of ways to address complexity by addressing some basic guidelines and best-practices for using Parameters within SQL Server Stored Procedures.
Basic Guidelines and Best Practices for Parameters within Stored Procedures / Functions
So, without further ado, here’s a quick list (in no particular order) of suggestions and practices I’ve been promising myself forever that I would write based upon some of the things I’ve seen through the years when looking at “other people’s code”:
Start with Good/Specific Parameter and Variable names
Use good descriptive names for your parameters. @p1, @goat, and @x aren’t good names (and yes, I’ve seen all of these). Likewise, in some cases, @input or @value might be ambiguous enough in some procedures/functions that they’re problematic too. In other words, one great way to know if you’re naming your variables correctly (and the same thing goes with object-names) is that you don’t require any documentation to tell other developers or the DBA what the purpose of a parameter or variable is. Which is why @UserName is always going to be a better input parameter name than @input – or why @TemporaryTotal is always going to be a better variable name than @x.
Avoid Hungarian Notation
For all intents and purposes, my dislike of Hungarian notation is LARGELY personal – or stylistic. (I just hate it as a way to name objects or parameters/variables – as it hearkens back to a day before strong-typing was really an option.) That said, given that I just argued for using specific parameter names, it MIGHT be possible that something like @intTotalResultCount would be a good parameter name – as it’s more specific than @TotalResultCount. Only, I disagree – what I find is that useless crap (like usp, sp, tbl, prc, vw, int, vc, nvc, etc) just adds ‘noise’ to variable/parameter/object names and makes them MUCH less readable. That, and it also requires more TYPING – meaning that the chances of a typo are a tiny bit higher – or would be if it weren’t for IntelliSense. Only, IntelliSense also effectively renders Hungarian notation moot – as it always tells you the type of your parameter/variable anyhow. So, just don’t go there.
Declare Variables Late
For .NET and other application developers, there’s a tendency (based in best practices) to declare variables or new instances of objects as late as possible – to minimize the amount of time those objects are on the stack. With T-SQL that concern is actually moot – as all variables are scoped to the current batch – and take up the same amount of resources whether declared late or early. However, another big benefit of declaring variables late is that you keep them better contextualized within your code. Stated differently, one thing that I see very commonly in stored procedures is that developers will commonly declare ALL of their variables at the top of the stored procedure. I’m arguing that a better practice is to declare your variables BEFORE you use them – one by one – instead of ‘in bulk’ at the top of your sproc. Why? Because if you declare a bunch of variables at the top of a long/complex sproc, when it comes time to debug, you’ve got to pour over every line of code to make sure that @CriticalValue hasn’t been modified or tweaked through some different branch of code that you MIGHT not be thinking about when troubleshooting or debugging a stubborn logic problem. Whereas, on the other hand, if you only declare variables right before their used, you’re making problematic debugging much easier on yourself. (You will want/need to watch out for declarations within loops/cursors though – just to make sure you’re not accidentally re-setting these values back to their base/initial setting if that’s NOT what you want/intend to do.)
Don’t Overload Your Variables or Parameters
Recycling cans and bottles, printing less, and consuming less are all great and admirable traits. But when it comes to reduce, reuse, recycle – just don’t do it when it comes to parameters and variables. Doing so is one of the EASIEST ways to introduce expensive bugs and logic bombs into your code. So, don’t go there. Instead, live on the edge: declare a parameter for every purpose, and have a purpose for every parameter – but always keep that at a nice, clean, one-to-one ratio and don’t double-dip.
Don’t be Lazy with Data Types (Avoid Coercion)
While this tip is primarily intended for .NET developers and many ORMs, it also applies to straight T-SQL developers as well. Simply stated, don’t take the easy path when it comes to declaring variable types and sizes – always make sure that you’re as explicit as possible. Otherwise, if you take the easy approach, you may run into scenarios where SQL Server is NOT able to take advantage of perfectly viable indexes – causing ugly performance problems.
As an example, imagine a document management system, where there are 12 million rows of uploaded documents corresponding to about 8 million different accidents (i.e. assume that these are claims forms for car accidents) and that each ‘accident’ has an AccidentCode that’s a char(10). If you’ve got a stored procedure that takes an @AccidentCode nvarchar(30) parameter in as the input parameter that you want to be searching against, you’ll find that operations take significantly longer (and consume VASTLY more resources) than if @AccidentCode were a char(10) instead.
So, pay very careful attention to this – as it’s a well known SQL Server performance gotcha, and something that is easy enough to stumble into – especially when many ORMs or data-access-libraries for .NET apps commonly make it easy for developers to do something like:
Because, in that case – since the SqlDbType and Size parameters aren’t defined, many .NET ‘apps’ or libraries will either pass in @AccidentCode as either an nvarchar(30) or as an nvarchar() where the length is accident.Code.Length() – neither of which are what you want.
Don’t Reinvent the Wheel For Basic SQL Operations
One common, regular, problem that I see when auditing T-SQL code is a failure by many developers to grasp that while T-SQL has way fewer intrinsic features than say C#, it still covers all the basics. This is especially true when evaluating input parameters for whether or not they might be NULL. So, for example, I commonly see code like this way more than I should:
For starters, when it comes to working with Stored Procedure Parameters, many developers don’t realize that T-SQL lets you specify a default value for parameters when creating a stored procedure – such that your parameters will then be automatically set to a default value when no value has been specified:
Likewise, ISNULL is a great way to handle this same functionality when it comes to working with parameters or other values, and NULLIF is also a great way to check for NON-NULL strings (i.e. ‘’) in the form of:
So, stated differently, T-SQL may not have all the features and functions of other, more modern languages, but it still has all the bases covered to the point that there’s no need to re-invent the wheel for basic parameter evaluation and initialization concerns.
Related: T-SQL for Starters
If you’re looking for more guidance on basic coding practices and/or on best practices for working with T-SQL code, I can’t recommend the Code Complete and The Guru’s Guide to SQL Server Stored Procedures, Functions, XML, and HTML enough. Yes, they’re both GINORMOUS books (compared to this pithy blog post). But if you’re serious about coding foundations, you can’t go wrong with Code Complete and Ken Henderson’s book provides some great insights into how to apply industry-standard best practices for coding paradigms (such as patterns and idioms) to T-SQL coding.