How to Return All Columns from a Table's Row When Calculating Aggregations


I have a table containing student exam grades. To create the Grades table and populate it with sample data, I can run the script that Listing 1 shows. Then, to get the highest exam grades for each student in each course, I can run the script that Listing 2 shows. Figure 1 contains the data that Listing 2 returns. However, I want to go one step further and retrieve whole rows from the Grades table, including the exam number (1st, 2nd, 3rd, and so on) and extra Grades table columns. (For the sake of simplicity, I didn't include the extra columns in the CREATE TABLE statement or output figures.) If a student has more than one exam in the same course with the highest grade, I want to use the exam number as a tiebreaker. In other words, I want to return the row with the most recent exam (highest exam number) out of the exams with the highest grade per student per course. How can I generate that output?

You can approach this problem in several ways. I discuss two methods: using correlated subqueries and using joins. Let's start with the correlated-subqueries approach. First, you need to retrieve the highest grades in all exams for each student in each course. For that task, you can use a filter in the WHERE clause that checks that a grade is equal to the highest grade of all the Grades table's rows that have the same course and student that the current row contains. This filter uses a correlated subquery, which Listing 3 shows. Figure 2 contains Listing 3's output. Note that in Figure 2, both Michael and Ina have two exams with the highest grades in the same course. To further filter the output so that out of the exams with the highest grades for the same student in the same course, only the grade with the highest exam number will be returned, you can add another correlated subquery, as Listing 4 shows. The code in Listing 4 returns the output that Figure 3 shows. By combining both subqueries in the WHERE clause, you return the desired results.

The second approach to solving this problem is to use joins instead of subqueries. As you'll recall, Listing 2 shows how to retrieve the highest grade for each combination of student and course. You can return the highest grade by turning Listing 3's query into a derived table and joining the resulting query to the original Grades table, as Listing 5 shows. Listing 5 returns more than one row for each student in each course when a course contains more than one row with a student's highest grade. The output is the same as in Figure 2. Listing 6 shows the query that returns the highest exam number for each combination of student, course, and grade. By turning the query that Listing 6 shows into a derived table and adding it to Listing 5's query, you produce Web Listing 1. Web Listing 1 joins the Grades table and the two derived tables to return the result that the reader seeks: the highest grades for the highest exam numbers per student per course. The output is the same as in Figure 3.

After applying the appropriate indexes, I compared the performance of the correlated subqueries and the joins solutions and found that the subqueries solution performed a bit better than the joins solution. Because the difference isn't significant, you can implement the solution that you feel more comfortable with.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.