Skip navigation
Sharpen Your Basic SQL Server Skills - 24 Apr 2008

Sharpen Your Basic SQL Server Skills - 24 Apr 2008

Learn how to join a table to itself using a self-join

Q: What is a self-join?
A: A self-join is a standard SQL Server join that joins a table to itself. You can perform a self-join when all the data you require is in a single table, and the data you need to extract relates to other data in the same table. You join a table to itself using a field or combination of fields that have matching data in different records. To perform a self-join, you must use a table name for each instance of the table, so that each instance has a separate name. Use a self-join when you want to compare values in a column to other values in the same column. The data type of the columns on which the join is made must be the same or you’ll need to cast them to the same type.

For example, in employee information records, each record might have a field for an employee’s ID number and also a field for the supervisor’s ID number. To retrieve a list of employees and their managers you need to join the employee information table to itself.

Q: Is a self-join an inner join or an outer join?
A: A self-join can be either an inner join or an outer join. Let’s look at some examples that show how using either an inner join or an outer join affects the self-join.

First, run the following code to create a table called SaleOnDate with two columns, DateOfMonth and Sold- Qty, and populate it with sample data:

USE TestData
  GO
  CREATE TABLE SaleOnDate (DateOfMonth INT,
  SoldQty INT)
  GO
  INSERT INTO SaleOnDate (DateOfMonth,
  SoldQty)
  SELECT 1, 100
  UNION ALL
  SELECT 2, 150
  UNION ALL
  SELECT 3, 200
  UNION ALL
  SELECT 4, 225
  UNION ALL
  SELECT 6, 250
  UNION ALL
  SELECT 7, 280
  UNION ALL
  SELECT 8, 300
  GO

Figure 1: SaleOnDate table

The SaleOnDate table contains information about the day of the month and the quantity of items sold on that day. Figure 1 shows the contents of this table, which you can see by running the following query:

SELECT * FROM SaleOnDate

Next, run the following query to find the difference between a named day and the next day:

SELECT sd1.DateOfMonth, (sd2.SoldQty -
    sd1.SoldQty) DifferenceInSale
    FROM SaleOnDate sd1
    INNER JOIN SaleOnDate sd2 ON sd1.
    DateOfMonth + 1= sd2.DateOfMonth
    GO

This query demonstrates the use of an inner join when joining the SaleOnDate table to itself. As you can see in Figure 2, the result set doesn’t contain information about days 4 and 8, because days 5 and 9 don’t exist in the database. This is because the conditions specified in the join clause haven’t been met.

Figure 2: Result set for inner join of SaleOnDate

Next, let’s look at the use of a left outer join, also called a left join, to join the SaleOnDate table to itself. The following code demonstrates the use of a left join when joining the SaleOnDate table to itself:

SELECT sd1.DateOfMonth,
    ISNULL(CAST((sd2.SoldQty - sd1.
    SoldQty) AS VARCHAR(10)), ‘No Data’)
    DifferenceInSale
    FROM SaleOnDate sd1
    LEFT JOIN SaleOnDate sd2 ON sd1.
    DateOfMonth + 1= sd2.DateOfMonth
    GO

The result set, shown in Figure 3, now contains information about days 4 and 8. Because days 5 and 9 don’t exist in the database, the result set contains NULL for days 4 and 8. NULL values are displayed as No Data because the ISNULL function was used in the SELECT clause.

Figure 3: Result set for left outer join of SaleOnDate

Hide comments

Comments

  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.
Publish