Benchmarking Relational-Division Solutions


Let's compare the performance of the various solutions I discuss in the main article. First, use the script that Listing A shows to populate the tables with test data before you measure each solution's performance. Next, set the @numcourses value, which determines the number of courses to populate. Then, set the @nummaterials value, which determines the number of distinct materials. I used the same values for @numcourses and @nummaterials because I felt this distribution closely resembles reality. I chose six materials per course because the script populates a random number of materials between 1 and 10 for each course—an average of 5.5 materials. I ran six tests that used the following values in @numcourses and @nummaterials: 100, 1000, 2000, 3000, 10,000, and 100,000. Multiplying those values by 5.5 gives you an estimate of the number of rows in CourseMaterials. Table A shows the results in seconds for each solution by number of courses.

Note that the pure set-based solutions don't scale well, and they perform reasonably well only when the number of courses is very small. The array-based solutions scale well with and without temporary tables. When the population tops 10,000 courses, only the array-based solutions that use temporary tables provide reasonable performance.

Hide comments


  • Allowed HTML tags: <em> <strong> <blockquote> <br> <p>

Plain text

  • No HTML tags allowed.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Lines and paragraphs break automatically.