Welcome to T-SQL 101. Over the course of the next 10 lessons, you’ll learn the basic principles to follow when writing queries with T-SQL. By the end of the course, you’ll be able to not only write query statements to view and manipulate data but also write custom stored procedures and functions. To start you on this journey, let’s look at what T-SQL is, the query tools you use with it, and how to write the SELECT statements that the query tools use to retrieve the data you need.
What Is T-SQL?
In the early 1970s, IBM developed SQL (short for Structured Query Language) for use in its original relational database product named System R. SQL has one purpose: to query and manipulate data.
T-SQL (short for Transact SQL) is an extension to SQL that adds extra programming functionality and control mechanisms. T-SQL is the query language used by not only Microsoft SQL Server but also other database products, such as Sybase’s Adaptive Server Enterprise (ASE).
The Query Tools
Depending on which version of SQL Server you’re running, you’ll have different query tools available. In SQL Server 2000, the tools are Query Analyzer and Enterprise Manager. Query Analyzer’s primary purpose is to execute the T-SQL commands that you write. Enterprise Manager is more of an administrative tool. It has excellent built-in query-building tools, but they can be overwhelming for uninitiated users.
In SQL Server 2005, SQL Server Management Studio (SSMS) replaces Query Analyzer and Enterprise Manager. The component of SSMS that you use to execute queries is called the Query Editor. All the T-SQL commands presented in this course will work in both SQL Server 2005’s Query Editor and SQL Server 2000’s Query Analyzer.
Retrieving Data with SELECT
A database stores data in tables. Data in a table is broken into columns and rows. An Employee table, for example, would contain information about employees. The columns in the Employee table might include EmployeeNumber, FirstName, LastName, Age, and Salary. A row of data in the Employee table would contain data pertaining to a specific employee.
The data stored in a database is useless without the ability to retrieve and view it. T-SQL’s SELECT statement is what you use to get data out of the database and onto your screen. Querying databases with a SELECT statement is like shopping for groceries with a grocery list. Think of the database as a grocery store in which the database tables are the shelves and the data is the items on the shelves. Your assistant (i.e., Query Editor or Query Analyzer) will be doing the grocery shopping for you, so you write a grocery list (i.e., a SELECT statement) that describes the items you need and contains instructions on where to find those items on the shelves and how you want the items delivered. Your assistant goes to the grocery store and diligently travels up and down the aisles, searching the shelves for the items that you specified, carefully following your instructions. The assistant also follows your instructions on how to pack the items in a bag (i.e., a result set) before delivering that bag to you. If all the items were in stock, your bag will contain everything you need.
Now that you know how the SELECT statement is used, let’s take a look at what it contains. A basic SELECT statement consists of two parts and looks like
SELECT Part1 FROM Part2
Part1 is where you specify the columns that contain the data you need. Part2 is where you specify the table or view containing those columns. The columns that you select must exist in specified table or view; otherwise, an error will result.
The simplest form of the SELECT statement tells the Query Editor or Query Analyzer to give you the data in all the columns in the specified table or view. For example, if you type
SELECT * FROM sysobjects
in the Query Editor or Query Analyzer, highlight the code, then press CTRL-E, you’ll see all the data stored in the sysobjects table in the result pane. The asterisk (*) is a wildcard that tells the Query Editor or Query Analyzer to retrieve the data from all the columns in the table.
In the instructions I just gave, note that I had you highlight the code prior to executing it. Highlighting a code snippet ensures that only that code is executed. Otherwise, all the code in the query window will be executed. Also note that I told you to press CTRL-E. Alternatively, you can press F5 or click the Execute button on the toolbar to execute the code. I prefer using CTRL-E, though, because it goes hand-in-hand with CTRL-D and CTRL-T, which let you put the results in a grid format or text format, respectively. Sysobjects is a system table that exists in every SQL Server database and contains entries for each database object within that database. If you don’t specify a database when establishing a connection to SQL Server, you’re automatically connected to the default database, which is the master database (unless you changed this default). For the purposes here, I’m assuming you’re connected to the master database. Typically, you wouldn’t spend much time querying tables in the master database. However, since all instances of SQL Server have a master database, it serves as a good place to run the sample queries.
Ordering and Renaming Columns
If you don’t want data from all the columns in a table, you can specify the column or columns you want. When you have multiple columns, you need to separate them with commas. The order of the columns in the SELECT statement determines the order of the columns in the result set. For example, when I run the query
SELECT name, type, crdate FROM sysobjects
I get the result set that Figure 1 shows. For space reasons, I included only three of the returned rows and trimmed the length of the columns. (I also shortened and trimmed the data in the other sample result sets shown here.) If you try this query, don’t be surprised if you get different results. The data returned depends on which SQL Server version you’re using and what your tables contain. Having different data doesn’t matter at this point because the focus is on the mechanics of the query itself and not on the details of the data returned.
If you want the result set to display different names for the columns, you can use the AS clause in a SELECT statement. For example, if you want to retrieve the name, type, and crdate columns but want to name them ObjectName, ObjectType, and Creation- Date, respectively, you can run the query
SELECT name AS 'ObjectName', type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects
Figure 2 shows the results from this query.
Filtering Results with WHERE
If you don’t want all the data from a column returned, you can use a WHERE clause with operators to filter the result set to get exactly what you need. For example, in the last two queries, all the data (which in this case are objects) in the three columns are being returned. If you want only objects of type U (which stands for user table) returned, you’d run the query
SELECT name AS 'ObjectName', type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects WHERE type = 'U'
In this query, the = operator is being used to see whether each value in the type column is equal to U. If so, that object is returned in a result set that will look something like the one in Figure 3. The = operator is one of several comparison operators you can use in a WHERE clause, as Table 1 shows.
You can combine comparison operators with logical operators to further filter result sets. Table 2 outlines the logical operators that you can use in a WHERE clause. For example, suppose you want to see all objects except stored procedures (which have the object type of P) and system tables (which have the object type of S). You can use the and AND operators in the query
SELECT name AS 'ObjectName', type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects WHERE type = 'P' AND type = 'S'
As Figure 4 shows, the returned result set contains only those objects that meet the criteria.
Let’s look at a more complex example that combines multiple comparison and logical operators. Suppose you want to see all the stored procedure objects whose names begin with sp. You also want to see all the stored procedure objects that were created in the year 2000. To get this data, you can run the query
SELECT name AS 'ObjectName', type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects WHERE (type = 'P' AND name LIKE 'sp%') OR (crdate BETWEEN '2000-01-01' AND '2000-12-31')
The first component—(type = ‘P’ AND name LIKE ‘sp%’)—selects only those objects that are stored procedures (type = ‘P’) and have names beginning with sp (name LIKE ‘sp%’). The second component—(crdate BETWEEN ‘2000-01-01’ AND ‘2000-12-31’)—selects any object that was created between January 1, 2000, and December 31, 2000. Note that parentheses separate these two components, which are part of a large OR operation. The parentheses indicate which logical operators to evaluate first. Figure 5 shows sample results from this query. If you execute this query, consider running it against one of your own databases so that the results will be more meaningful to you.
Sorting Results with ORDER BY
Besides using a WHERE clause in a SELECT statement to filter results, you can use an ORDER BY clause to sort results. By default, results are sorted in ascending order (i.e., lowest value to highest value). If you want the results sorted in descending order (i.e., highest value to lowest value), you must specify the keyword DESC. Optionally, you can use the keyword ASC to explicitly specify that you want the results sorted in ascending order.
For example, suppose you want a query’s returned objects to be sorted alphabetically by object type, then by their creation date, with the most recent date first. The query would look like
SELECT name AS 'ObjectName', type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects ORDER BY type ASC, crdate DESC
and the result set would look like that in Figure 6.
You shouldn’t use the ORDER BY clause in views, derived tables, inline functions, and subqueries, unless you also use the TOP command. Otherwise, you might get an error message. (If you’d like to know why the sometimes error occurs, see the Microsoft article at support.microsoft.com/kb/841845.)
Limiting Results with TOP
To reduce strain on the server and network resources (not to mention your own workstation), you should limit the number of rows returned in a result set, especially if you’re querying a large database just to explore it. You can use the TOP command to limit result sets. For example, the query
SELECT TOP (5) type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects ORDER BY crdate DESC
displays the first 5 rows returned by the ORDER BY operation. Alternatively, you can have the TOP command return a percentage of rows. For example, if you want to display the top 5 percent of the rows returned by the ORDER BY operation, you’d use the query
SELECT TOP (5) PERCENT type AS 'ObjectType', crdate AS 'CreationDate' FROM sysobjects ORDER BY crdate DESC
The Basics Revealed
In this lesson, I covered the basic components in a simple SELECT statement. You learned how to specify the columns that contain the data you need and how to specify the table containing those columns. You also learned how to order and rename columns and how to filter and sort results. Armed with this information, you’re ready to write your own queries and explore your own databases. Be careful, though, when querying that table with 100 million+ rows. If you don’t remember to limit the number of returned rows like you learned to do, you’ll find yourself under the watchful eye of the friendly neighborhood DBA.