A dual-core processor is a single chip that contains two processor cores. It's the equivalent of having multiple processors on your motherboard, except that a dual-core processor lets you have two processors even if your motherboard has only one socket. Windows Task Manager will display two processors, and applications will truly execute in parallel instead of forcing the OS to rapidly switch between tasks.
Dual-core technology is great if you want to upgrade to multiple processors without replacing your entire system. But what if you're choosing between dual-core processors and a two- or four-socket system? Many factors (e.g., memory bandwidth, latency) affect processor performance. Do dual-core processors really perform as well as two- or four-way single-core machines with an equivalent number of cores running at the same clock speed? The answer, of course, depends on your application. I tried a few common applications on equivalent single-core and dual-core configurations and found that AMD's dual-core processors performed somewhat better than their single-core counterparts.
A Complicated Problem
When considering a dual-core system, remember that AMD dual-core processors are priced to match their higher-clock-speed single-core counterparts. For example, both the 2.0GHz Opteron 870 and the 2.8GHz Opteron 854 cost $1514. The idea is that because multithreaded applications use both processor cores simultaneously, multiple cores will more than make up for the lower clock speeds. But processor technology is more complicated than that. For example, Opterons all have built-in memory controllers, and each socket on a multiway motherboard can have its own memory. Therefore, the two cores in a dual-core processor share one memory controller and its attached memory, whereas two single-core processors each have their own. Dual-core processors therefore have lower memory latency but also lower memory bandwidth.
To test the effects of two cores on a single chip, I compared two four-core configurations. I used a NEWISYS 4300-E server on loan from AMD. The 4300-E is a four-way system, and each socket has its own memory. I tested four single-core Opteron 848 processors against two dual-core Opteron 875 processors. Both configurations had four processor cores, all running at 2.2GHz.
Synthetic vs. Real
By using synthetic benchmarks, you can easily see that dual-core processors have a memory-bandwidth disadvantage. I tried a few synthetic memory-bandwidth benchmarks and found that my dual-core setup had anywhere from 50 percent to 75 percent of the memory bandwidth of the equivalent single-core setup, depending on the tool and test used. To determine how this and other factors affected the performance of real-world applications, I tried a few real-world processor-intensive tasks: encoding an MP3 file, encrypting a file, calculating Pi, and performing a join in Microsoft SQL Server.
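The synthetic tools I used aren't reproduced here, but the basic idea behind a memory-bandwidth test can be sketched in a few lines of Python. This is only a rough illustration, not one of the benchmarks I ran: the function name copy_bandwidth_mb_per_s and the 64MB buffer size are my own assumptions, and a single-threaded copy like this measures what one core can move, not what a whole system can sustain.

```python
import time


def copy_bandwidth_mb_per_s(size_mb=64, repeats=5):
    """Return the best observed copy rate, in MB of data copied per second.

    Hypothetical helper for illustration; not a calibrated benchmark.
    """
    size = size_mb * 1024 * 1024
    src = bytearray(size)  # zero-filled source buffer
    dst = bytearray(size)  # destination buffer of the same size
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src  # bulk copy: each byte is read once and written once
        best = min(best, time.perf_counter() - start)
    return size_mb / best


if __name__ == "__main__":
    print(f"approximate copy rate: {copy_bandwidth_mb_per_s():.0f} MB/s")
```

To see the effect of cores sharing a memory controller, you would run one copy of such a test per core at the same time and compare the per-process rates between configurations. The buffer should be much larger than the processor cache, or the test measures cache speed instead of memory speed.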
My dual-core configuration performed as well as or slightly better than my single-core configuration for all four of the processor-intensive applications that I tried. To ensure that the processor was the bottleneck for each application, I ran Performance Monitor and verified that all applications held my CPUs at 100 percent utilization and kept the processor queue length greater than 10 for their duration. I also verified that none of the applications caused % Disk Time to spike or used a significant portion of available memory.
I tested each application on all four processors simultaneously. SQL Server is multithreaded, and because the nested-loop join I performed is a parallelizable operation, it used all four processors. (I've simplified this matter for the sake of clarity. The operation may or may not use all the processors; it depends on the query.) For the other applications, I used VBScript to kick off four instances simultaneously so that all four processor cores were in use. I ran the SQL Server join four times and took the average. For each of the other applications, I took the average of the time each process took to complete across the four CPUs.
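The launch-and-average approach described above can be sketched as follows. My actual harness was a VBScript that isn't shown here, so treat this Python version as a stand-in under stated assumptions: the function names time_command and average_parallel_time are hypothetical, and the sample workload is a small CPU-bound loop rather than any of the applications I tested.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def time_command(cmd):
    """Run one command to completion and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def average_parallel_time(cmd, instances=4):
    """Start `instances` copies of cmd at once; return (mean, per-process times)."""
    # One thread per instance, so every process is launched without waiting
    # for the others to finish.
    with ThreadPoolExecutor(max_workers=instances) as pool:
        elapsed = list(pool.map(time_command, [cmd] * instances))
    return sum(elapsed) / len(elapsed), elapsed


if __name__ == "__main__":
    # Hypothetical stand-in workload: a short CPU-bound loop in a child process.
    workload = [sys.executable, "-c", "sum(i * i for i in range(2_000_000))"]
    mean, times = average_parallel_time(workload, instances=4)
    print("per-process seconds:", [round(t, 2) for t in times])
    print(f"average: {mean:.2f} s")
```

Replacing the workload list with the command line for an encoder or encryption tool would time that program instead. The threads here only wait on child processes, so they add little overhead of their own; the operating system decides which core runs each process.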
For the MP3 file encoding, I encoded a 50MB WAV file to MP3 format by using the LAME MP3 encoder. For the encryption operation, I encrypted a 50MB file by using GNU Privacy Guard (gpg) with a 1024-bit DSA encryption key. I calculated Pi to 4 million digits by using a free program that implemented the Chudnovsky formula. And for the SQL Server join operation, I performed a join between a 100,000-row table and a 10,000-row table filled with randomly generated data. The join was on an integer field and a float field in each table, combined with an Or clause in the join condition. The Or clause forced SQL Server to choose a processor-limited nested-loop execution plan. See Web Table 1 for my results.
Dual Core: A Good Option
Choosing new hardware for an application that's running slowly is no simple matter. Before buying anything, make sure you've thoroughly tested the application to discover whether disk, memory, network, CPU, or other factors are the bottleneck. If CPU is your limiting factor, dual-core CPUs might be a good option. In my testing, the dual-core configuration matched or slightly outperformed the single-core configuration for CPU-intensive tasks, but that might not be the case for all applications.