Isolating Performance Problems

Downloads
9023.zip

Use these simple and readily available resources for pinpointing performance bottlenecks

Solving performance problems can be frustrating and time-consuming—especially when you can't pinpoint the problems. However, by using simple and readily available tools, you can quickly isolate many common performance problems. In this article, I outline several debugging techniques you can use to isolate performance problems in a SQL Server 7.0 database running on Windows NT. You can also use these debugging techniques in routine application performance tuning.

This article doesn't cover all performance-tuning methods or give solutions to all performance problems, but it does discuss the most common problems associated with application and database performance and how to solve those problems. These techniques have helped identify the source of a variety of application and database performance issues in large-scale production environments. For example, large banking and retail organizations have used these techniques to identify problems with memory constraints, missing or poorly designed indexing structures, faulty software design, and even SQL Server itself. Quickly identifying the source of an application problem lets you focus your efforts on solving the defects. These debugging techniques can significantly reduce customer support costs and problem resolution time, from days to hours in many cases.

Common Problem Areas

You can attribute most performance issues to either problems with application implementation or design or problems with the database server configuration. Let's look at how to isolate performance problems in the most common areas. First, you need to determine whether it's an application or a database problem. You then need to explore whether you have any CPU bottlenecks, memory constraints, table blocks or locks, or poor indexing.

To help analyze the problem, you need two main tools. First, you need a SQL Server query tool; I use SQL Server Query Analyzer for all the profiling code in this article. Second, you need access to the NT Performance Monitor, as Figure 1 shows. Performance Monitor is powerful and available for free as part of NT. Note that you should monitor a database server from an NT client workstation so that you don't effect the database server's performance by running Performance Monitor on it. If a client workstation isn't available, profiling from the server will skew the results slightly but will still give you valuable information. This article uses the following syntax to specify the counters you need to add to the Performance Monitor: Object Name>>Counter value.

Quick SQL Server Profile

When profiling any system, a quick analysis of the system's general health can give you a feel for whether problems are application related or database related. In addition, a quick checkup can help confirm reported performance problems. Listing 1's query provides this quick analysis by listing all processes currently executing commands in the database. Using a SQL query tool, execute Listing 1's query for the database that you're analyzing.

The query eliminates all activity performed by sa so that system processing doesn't cloud the results. The query also excludes the process that is running the query. To add back sa processing, remove the p.suid <> 1 statement from the WHERE clause. The results from this query should average fewer than two or three active processes. After executing this query multiple times, analyze the results. If the query consistently returns more than two or three processes, you're experiencing bottlenecks at the database. Such results suggest poor performance somewhere in the database because the database isn't keeping up with application requests. However, you can also trace such results back to improper application design—maybe excessive database queries—or even poor index design. Still, gathering database performance information is an important place to start.

CPU Monitoring

Not having enough or fast enough processors can also cause performance degradation. Although we're quick to blame CPU utilization for CPU problems, CPU utilization usually isn't the culprit. Eliminating the CPU up-front lets you focus on more likely performance problems. To analyze CPU utilization, add the following counters to the NT Performance Monitor chart:

Processor>>% Processor Time. % Processor Time is the percentage of time that the processor is executing a non-idle thread. This counter is a primary indicator of processor activity.
Process>>% Processor Time (SQL Server Instance). % Processor Time is the percentage of time that the selected process is executing a non-idle thread. This counter can be compared to the Processor %Processor Time to see whether this process is causing a bottleneck.
Server Work Queues>>Queue Length. Queue Length is the current length of the server work queue for this CPU. Consistent Queue Length greater than 2 can indicate processor congestion. On a multi-CPU machine, check this counter for each processor. If all processors indicate congestion, consider the CPU to be a bottleneck; otherwise, a process isn't multithreading. Verify the SMP settings, as I discuss in a moment, to further investigate the process causing the bottleneck.

Comparing these three performance statistics will give you a good overview of your CPUs' health. Processor percentages consistently over 90 percent point to bottlenecked CPUs. Ideally, processor percentages should be less than 75 percent. If you have high CPU values for the system and low values for the SQL Server process, you need to find the process causing the high CPU utilization. Many times, an application is causing the CPU bottlenecks. (Although increasing the number or speed of CPUs could improve performance, the CPU isn't necessarily the cause of the problem.) To isolate the offending application, check the Task Manager and continue to use the Performance Monitor.

If you're analyzing a multi-CPU machine, check whether all CPUs are in use by looking at the SMP concurrency value returned by sp_configure. The concurrency value reports the number of CPUs dedicated to SQL Server. You specify these settings by using SQL Server Enterprise Manager's database property screen. On a machine dedicated to SQL Server, you can use all processors without causing a server bottleneck. Note that if you need to change these settings, do so with caution on nondedicated machines and profile between changes to ensure that you don't negatively affect SQL Server or other processing on that machine. See SQL Server Books Online (BOL) for more information about SMP concurrency.

Memory Configuration

You also need to investigate memory utilization on your server. If you don't allocate enough memory to NT or to SQL Server, a paging situation can result. A paging situation occurs when not enough memory is available for the process, which must then rely on the swap file for additional memory storage. This situation increases disk I/O and severely degrades performance. You use SQL Server 7.0's dynamic memory configuration to configure memory. In addition, you can use several techniques to determine whether inadequate memory is causing a performance problem.

First, you need to use the sp_configure command to determine current memory settings for SQL Server. Verify memory on the machine by checking the System Settings on the Control Panel. In my experience, you need to allocate 25MB to 50MB of memory for NT. Values less than that can starve NT for memory and lead to excessive page swapping. Additional indicators that NT is starved for memory are when you receive Out of Virtual Memory pop-up error messages, SQL Server doesn't start, or SQL Server doesn't accept additional connections. On a dedicated SQL Server machine, you should give SQL Server all memory not assigned to NT. If NT memory settings are low, consider NT as a possible bottleneck, and try to decrease SQL Server memory.

You can track the following Performance Monitor counters to gather page-swapping data for the SQL Server machine:

SQL Server Memory Manager >>Total Server Memory (KB). This counter shows the total amount of dynamic memory the server is currently consuming.
SQL Server Buffer Manager>>Page Writes/sec. This counter shows the number of physical database page writes the server issued.
SQL Server Buffer Manager>>Page Reads/sec. This counter reports the number of physical database page reads the server issued.
Process>>Page File Bytes>>SQL Server Process. Page File Bytes is the current number of bytes this process has used in the paging files. Paging files store pages of memory that the process used and that aren't in other files. All processes share paging files, and lack of space in paging files can prevent other processes from allocating memory.
SQL Server Buffer Manager >>Checkpoint Writes/sec. Checkpoint Writes are the number of pages flushed by the checkpoint or by other operations that require all dirty pages to be flushed.

These statistics help identify page-swapping problems. Although you'll always have page writes during a checkpoint, monitoring the checkpoint writes statistics helps identify any anomalies. If you do have excessive page-swapping, adding more memory to the server could help performance.

Inadequate memory is the most common cause of severe performance degradation for customers I work with; and removing memory constraints can be the single most effective way to get better performance. Table 1 shows the results from a performance test that compared the data access performance of a system with 64MB of RAM versus a system with 128MB of RAM. I performed these tests on a 500MHz Pentium II running NT and SQL Server 7.0 Service Pack 1 (SP1). I loaded a table with 1 million rows of random data. Two columns were foreign keys, and I indexed one of these columns. The test selected one row in the table by the indexed column and one row by the nonindexed column. Table 1 shows the averaged results of 50 executions of the test, with the 128MB of RAM dramatically improving performance for selecting by both the indexed and nonindexed columns.

Table Blocks and Locks

Table locking and blocking are typical effects of concurrent data access implementations in databases. However, excessive table locking can cause performance bottlenecks. To analyze blocking and locking in the database, create the LOCKPROFILE stored procedure that Listing 2 shows, then execute the stored procedure. This procedure returns the processes being blocked, the processes causing the block, the locked objects causing the blocks, the statement that the blocked process is executing, and the statement that the blocking process is executing.

You can also monitor table-lock escalations from the Performance Monitor. To profile lock escalations, you can use the Performance Monitor's SQL Server Access Methods>>Table Lock Escalations/sec counter. This counter reports the number of times the database escalated locks on a table; an excessive number could signal an application design problem. As a general rule, we consider a Table Lock Escalations/sec. value of 25 percent of the table accesses as excessive. Excessive table-locking causes semaphore-like single-threading of application processes through the locked database table. A semaphore intentionally blocks all processes, letting the system execute only one process at a time. If you have excessive table-locking, you should probably consider tuning your application. However, keep in mind that lock escalation in SQL Server is a complex algorithm that is determined by many factors, such as isolation level, available memory, and active users. However, the results from the LOCKPROFILE stored procedure can get you started by helping you isolate the offending application code.

Investigating Indexes

Inefficient indexing on frequently used tables also causes performance bottlenecks. In Table 1, you can see that selecting data by an indexed column is much faster than selecting by a nonindexed column. To get a profile of how your applications are accessing data, you can monitor the following Performance Monitor counters:

SQL Server Access Methods >>Full Scans/sec. This counter shows the number of unrestricted full scans. These can be either base-table or full-index scans.
SQL Server Access Methods>>Index Searches/sec. This counter shows the number of index searches SQL Server is performing. SQL Server uses index searches to start range scans, to retrieve a specific record by using one index, and to reposition within an index.

A large number of table scans can point to a cause of your performance problems. You can reduce the number of table scans by first identifying the tables SQL Server is scanning (using procedures I'll cover in a moment) and then modifying the database to use indexes. For the applications I work with, optimal indexing performance occurs when you're accessing data about 90 percent through indexes and 10 percent through table scans.

If you suspect inefficient indexing is causing performance problems, you can use the code in Listing 3 to display the SQL statements currently executing. Execute long-running select statements in the Query Analyzer, making sure that the Show Query Plan option is turned on or that the graphical execution plan utility is running. Running a query with an execution plan on doesn't execute the query but instead displays how the query will be resolved when executed. As you look at the executive plan, ask yourself the following questions: Are the keys in the WHERE clauses indexed and is SQL Server using them, or is the system performing table scans? Are the frequently used key columns indexed? If not, consider testing performance with key columns indexed. Is SQL Server not using the indexes? You can run the command

DBCC SHOW_STATISTICS (table name, index name)

to see statistical information for the selected index. Also, run the UPDATE STATISTICS table command, which updates the distribution information that SQL Server uses to determine whether to use an index. After the update, display the execution plan again to note any improvements. You can use SQL Profiler to perform additional index and query profiling, and you should consult the SQL Server Index Tuning Wizard to help in further index design.

Other Tools

In addition to the counters I mentioned earlier, you might also want to check out the Performance Monitor's SQL Server Buffer Manager>>Free Buffers counter, which shows when SQL Server isn't finding data pages in memory and is having to access them from disk, and many of the Physical Disk object counters, which also monitor the amount of memory versus disk activity. To monitor the disk counters, you must first enable them by executing the diskperf –y command at a command prompt.

Comments

Plain text