Choosing the right storage system is critical for a successful Exchange Server 2007 deployment. Exchange Server supports three primary storage technologies: direct-attached storage (DAS), storage area networks (SANs), and the iSCSI protocol. There are advantages and disadvantages to each of these storage management options, but all three options are based on simple spinning disks.
Microsoft offers several tools that can help you quantify your storage needs, such as the Exchange Storage Calculator and Jetstress. Other free tools are also available, such as Iometer. Properly identifying how you'll use your storage before you commit to an option can save you time and money.
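As a rough illustration of what quantifying your needs involves, the following Python sketch estimates the aggregate IOPS and capacity that a set of mailboxes would demand. The function name, the per-mailbox IOPS figure, the quota, and the headroom factors are all hypothetical placeholders rather than Microsoft guidance; the Exchange Storage Calculator and a Jetstress run should supply the real numbers.

```python
from dataclasses import dataclass


@dataclass
class StorageEstimate:
    required_iops: float
    required_capacity_gb: float


def estimate_storage_needs(mailboxes, iops_per_mailbox, quota_gb,
                           iops_headroom=0.2, capacity_headroom=0.2):
    """First-pass sizing. Every input is an assumption to be measured."""
    if mailboxes < 0:
        raise ValueError("mailboxes must be non-negative")
    # I/O demand: per-mailbox load times mailbox count, plus headroom.
    required_iops = mailboxes * iops_per_mailbox * (1 + iops_headroom)
    # Space demand: per-mailbox quota times mailbox count, plus headroom.
    required_capacity_gb = mailboxes * quota_gb * (1 + capacity_headroom)
    return StorageEstimate(required_iops, required_capacity_gb)


# Hypothetical inputs for illustration only.
estimate = estimate_storage_needs(mailboxes=1000, iops_per_mailbox=0.5, quota_gb=1.0)
print(f"{estimate.required_iops:.0f} IOPS, {estimate.required_capacity_gb:.0f} GB")
```

A sketch like this is only useful for framing the conversation with a vendor. It ignores transaction logs, content indexing, and backup windows, all of which the dedicated tools account for.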
DAS has been around for decades. It's a common choice for Exchange Server storage in small-to-midsized businesses and is increasingly common in enterprises as well. When considering DAS for your Exchange environment, you should investigate the various RAID array options and the importance of multiple disks.
SANs are dependable and scalable centralized data storage resources. SANs operate on their own independent networks and generally use connections based on Fibre Channel (FC) to communicate between various disks and connected hosts. Their drawback is that they're more expensive and more complicated than other options. Software for SANs tends to cost more than that for DAS arrays and is usually packaged separately from the hardware, while DAS arrays often include a set of utilities.
iSCSI is a storage protocol used to connect to a network device that moves storage-related data. It allows clients to send SCSI commands to remote, consolidated storage targets (or disk arrays) in the same way the client can interact with a locally attached disk. A common misconception is that you can connect iSCSI over your existing LAN infrastructure. Although this is technically possible, it isn't recommended. iSCSI devices are less expensive than FC devices, but you should still use dedicated hardware and cables. At the very least, you should have a dedicated virtual LAN and keep your devices relatively close together—running iSCSI over a WAN isn't a good idea.
NAS vs. iSCSI
iSCSI is sometimes incorrectly referred to as network-attached storage (NAS). Although iSCSI storage systems are connected to a TCP/IP network like NAS, iSCSI isn't the same as traditional NAS. NAS is a type of device rather than a protocol. A NAS device uses standard file-sharing protocols, such as Server Message Block (SMB), to present storage over the network. iSCSI is a true block-level storage protocol and is supported for Exchange deployments.
Traditional NAS is no longer supported in Exchange 2007, as stated in the Exchange team's blog. Even in Exchange Server 2003 installations, Microsoft supports only NAS storage devices that are qualified by Windows Hardware Quality Labs (WHQL). I haven't personally had good experiences with any NAS products; if you're determined to choose NAS, check with your vendor to ensure the end-to-end solution is designed for use with Exchange 2003.
DAS: Cheap and Easy
There's a good reason DAS is so common for Exchange Server storage—it's the cheapest and offers the best performance of the three approaches presented here. DAS uses a single host and is best for small to mid-sized companies, but poses significant challenges when scaling to large numbers of users. If you need to add spindles for extra space and performance, DAS solutions may not expand as easily as SAN solutions.
How the DAS solution will be managed long term should be another key part of your decision process. Keep in mind that there are several factors involved in calculating the cost of storage, and capital expenditure is just one—DAS may seem cheap up front, but remember to take into account the cost of managing it long term.
In a DAS environment, more spindles mean more I/O operations per second (IOPS). The size of the individual disks determines how much data you can store, but your I/O rate will be significantly better with a RAID array of ten 180GB drives than with an array of six 300GB drives, even though both hold the same 1,800GB of raw capacity. Capacity and I/O throughput don't increase at the same rate.
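The two arrays in that comparison can be sketched in a few lines of Python. The per-disk IOPS value is an illustrative assumption (real figures depend on rotational speed, interface, and workload), and the helper name is hypothetical; RAID overhead is ignored here.

```python
def array_profile(disk_count, disk_size_gb, iops_per_disk):
    """Return (raw capacity in GB, raw aggregate IOPS) for identical disks."""
    raw_capacity_gb = disk_count * disk_size_gb
    raw_iops = disk_count * iops_per_disk
    return raw_capacity_gb, raw_iops


# ASSUMPTION: 150 IOPS per spindle is a placeholder, not a measured value.
ASSUMED_IOPS_PER_DISK = 150

many_small = array_profile(10, 180, ASSUMED_IOPS_PER_DISK)
few_large = array_profile(6, 300, ASSUMED_IOPS_PER_DISK)

print("ten 180GB drives:", many_small)
print("six 300GB drives:", few_large)
```

Whatever per-disk figure you plug in, the raw capacity is identical while the ten-disk array has two-thirds more spindles sharing the I/O load.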
Using a RAID array with fault tolerance can increase your costs and decrease performance, but you can add disks to the array to speed up your I/O while keeping the added fault tolerance. Using such an array is well worth the investment if you want to avoid permanent data loss and potential downtime for your end users.
SAN: Costly, Complex, and Extremely Capable
SANs are the best centralized data storage option if cost and complexity aren't issues for you. SANs have high data throughput and excellent fault tolerance—not only are the disk arrays fault tolerant, the connections themselves are fault tolerant as well. Multiple data paths ensure that there's always a route to the required storage. SANs are also very scalable—they can have multiple hosts and you can add new volumes or expand existing storage as you need it.
SANs have traditionally been used with FC connections, but FC is usually a problem for smaller businesses because it's expensive and requires complex configuration. In my experience, well-trained SAN engineers are scarce. Employing them full-time to help with your storage infrastructure may be costly, and the learning curve for deployment and maintenance is steep. SAN vendors usually have services to assist in configuration and initial setup, but these services can be costly, especially if you have a changing environment.
Unfortunately, there aren't any comprehensive, cross-platform tools that will manage all brands of SAN devices, so to minimize the probability of complications, it's best to stick with a specific product set and a single vendor. This will make life easier on you and your Exchange team, because each SAN vendor implements slightly different configurations for creating and presenting storage to hosts. Plus, the SAN itself will likely have to deal with various servers running different OSs.
Many companies use a SAN for other applications, such as SQL Server or SharePoint, in addition to Exchange. Using a SAN for multiple applications can reduce the cost of SAN storage for Exchange by spreading the costs around, but in my experience, most Exchange deployments should use disks that are separate from those used by other applications. I've seen plenty of customers try "sharing spindles" only to find that their SANs couldn't handle Exchange's I/O plus the other I/O on the same set of disks.
SAN with iSCSI: An Alternative to FC
iSCSI SANs are a less expensive alternative to FC, and they work for both large and small companies. iSCSI SANs are configured from the host or device itself, so they're simple to configure, too. iSCSI SANs are comparable to FC with regard to data transport security, because connections can be authenticated or encrypted as long as both the initiator and target systems support the required protocols. The drawback is that iSCSI isn't always as fast as FC, but despite this, iSCSI's versatility and lower price tag are making it more popular.
You'll probably want to use a hardware-based iSCSI or TCP/IP Offload Engine (TOE) adapter to optimize performance. Using a TOE iSCSI adapter allows much of the communication process to be handled by the processor and memory in the adapter itself, unlike software-only iSCSI solutions, which can get bogged down in heavy-load environments.
RAID is an all-encompassing term for data storage schemes that divide and duplicate data across multiple disk drives. The data is spread across the array of disks, but users and the OS see the array as one entity.
When selecting and designing a RAID solution for your Exchange server, keep in mind the amount of disk space you'll need and the amount of rebuild time your company can withstand if something happens to your array. A RAID array with larger disks will take longer to rebuild than one with smaller disks, and one with Serial ATA (SATA) disks will take longer to rebuild than one with Serial Attached SCSI (SAS) disks. Adding more disks to the array will increase rebuild times as well. There are several levels of RAID that can be used for Exchange Server.
RAID 0 arrays split and distribute, or "stripe," data across several disks. In a RAID 0 array, you can use the full storage capacity of all of your disks—two 300GB drives will give you a total storage capacity of 600GB. RAID 0 arrays offer good performance because they can write simultaneously to each drive, but their drawback is that they don't offer fault tolerance. If one disk in a RAID 0 array fails, the entire array is destroyed and you'll lose all data on all disks. Because of this risk, RAID 0 isn't recommended for use in any business-critical capacity.
RAID 1 is a mirrored, or duplicated, set of disks. RAID 1 setups are composed of an even number of disks, and all data written to one drive is also written to another, so if one disk fails, you won't lose any data. RAID 1 is a good choice for OS partitions and Exchange database logs, and is the most common RAID level used for transaction logs. A drawback of RAID 1 is that although some RAID 1 configurations can read from both disks simultaneously for improved read times, mirrored arrays are no faster at writing than a single disk. Also, in RAID 1 configurations you can use only half the capacity of your disks, because all data is written twice.
RAID 0+1, as the name implies, is a combination of both RAID 0 and RAID 1 arrays, where data is both striped and mirrored to provide fault tolerance and performance improvements. RAID 0+1 setups are fairly expensive because of the duplicate disks (they require at least four disks: two striped disks and their mirrored duplicates), but the improved fault tolerance and increased speed are usually worth the additional cost and complexity. RAID 0+1 is the most common RAID level for Exchange databases.
RAID 5 is another fairly common option for use in Exchange Server 2007. RAID 5 is a set of striped disks that relies on parity to protect you from data loss. Data is spread across all the disks in a RAID 5 array, and if one disk fails no data is lost, but no disk is a duplicate of another in the array. The write performance of RAID 5 is well below that of RAID 1 and 0+1, because each small write from the OS turns into several disk operations: the controller reads the old data and old parity, then writes the new data and new parity. The storage capacity of a RAID 5 array is also reduced: the total capacity equals the capacity of all the disks in the array minus the capacity of one of those disks. That reduced capacity combined with its slower performance may make another configuration preferable to RAID 5.
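The capacity rules for the four RAID levels described above can be collected into one small Python helper. This is a minimal sketch that assumes identical disks; the function name and the minimum disk counts it enforces for RAID 0 and RAID 5 are my own additions for illustration.

```python
def usable_capacity_gb(raid_level, disk_count, disk_size_gb):
    """Usable capacity for an array of identical disks at a given RAID level."""
    level = raid_level.upper().replace("RAID", "").strip()
    if level == "0":
        # Striping only: every disk contributes its full capacity.
        if disk_count < 2:
            raise ValueError("RAID 0 needs at least two disks")
        return disk_count * disk_size_gb
    if level in ("1", "0+1"):
        # Mirroring: all data is written twice, so half the disks are usable.
        minimum = 2 if level == "1" else 4
        if disk_count < minimum or disk_count % 2 != 0:
            raise ValueError(f"RAID {level} needs an even count of at least {minimum} disks")
        return (disk_count // 2) * disk_size_gb
    if level == "5":
        # Parity: the equivalent of one disk's capacity is given up.
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return (disk_count - 1) * disk_size_gb
    raise ValueError(f"unsupported RAID level: {raid_level!r}")


for level, disks in (("RAID 0", 2), ("RAID 1", 2), ("RAID 0+1", 4), ("RAID 5", 4)):
    print(level, disks, "x 300GB ->", usable_capacity_gb(level, disks, 300), "GB usable")
```

With 300GB drives, the two-disk RAID 0 case reproduces the 600GB figure given earlier, while the same two disks mirrored as RAID 1 yield 300GB.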
To determine which RAID level is best for your environment, you should assess what the impact of RAID will be on the overall IOPS capacity of your drive system. RAID 1 and 0+1 have little effect on read IOPS and only a modest write cost, because each write goes to two disks, but RAID 5 will deliver only a fraction of the write performance of the same drives in a non-RAID 5 configuration. During the design phase, it's important to evaluate all of the RAID options or work with someone who can guide you through the complexities of storage architecture design. Be sure to take advantage of tools from storage vendors and the tools I mentioned in the introduction to test your storage systems and make sure they're meeting your requirements.
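One common way to reason about RAID's effect on IOPS is a write-penalty model, sketched below in Python. The penalty values are widely cited rules of thumb and are treated here as assumptions; controller caching and workload shape change real results, so substitute figures from your vendor or from Jetstress testing. The function and dictionary names are hypothetical.

```python
# ASSUMPTION: rule-of-thumb disk I/Os generated per host write.
ASSUMED_WRITE_PENALTY = {
    "RAID 0": 1,    # no redundancy
    "RAID 1": 2,    # each write goes to both mirrors
    "RAID 0+1": 2,  # striped and mirrored
    "RAID 5": 4,    # read data, read parity, write data, write parity
}


def effective_iops(raw_iops, write_fraction, write_penalty):
    """Host-visible IOPS when each host write costs `write_penalty` disk I/Os."""
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    if write_penalty < 1:
        raise ValueError("write_penalty must be at least 1")
    read_fraction = 1.0 - write_fraction
    # Each host read costs one disk I/O; each host write costs write_penalty.
    disk_ios_per_host_io = read_fraction + write_fraction * write_penalty
    return raw_iops / disk_ios_per_host_io


# Hypothetical array: 1,500 raw IOPS with an even read/write mix.
for level, penalty in ASSUMED_WRITE_PENALTY.items():
    print(f"{level}: {effective_iops(1500, 0.5, penalty):.0f} host IOPS")
```

The model makes the trade-off explicit: the more write-heavy the workload, the wider the gap between the parity and mirrored levels becomes.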
Configurations to Fit Your Business Needs
The most important consideration when selecting an Exchange Server storage configuration is to make sure it will fit your business needs. Each option has its benefits and drawbacks, and a few rules of thumb apply. Don't use an unsupported version of NAS. If you have a limited storage budget, don't opt for an FC SAN. Don't forget to test everything before putting your new disk system into a production environment. I recommend using Jetstress to make sure everything works before you take the leap and make your system live.
You're likely to find that one of the options discussed will provide a workable solution to fit the performance, scalability, and budget requirements of your environment, regardless of what version of Exchange you're running. I'd like to hear how you've managed to circumvent tricky environments to improve your Exchange storage infrastructure, so feel free to drop me an email.
Sidebar: Exchange 2010 Considerations
Exchange Server 2010 has major improvements in the way it utilizes the storage system. In Exchange Server 2007, it was common for customers to use high-speed FC or SAS disks because of the IOPS capacity they provided. Exchange 2010 has a significantly lower IOPS profile than previous versions, so you may not require FC or SAS disks for performance any longer.
Database Availability Group (DAG), a new feature in Exchange 2010, allows for configurations that don't require RAID on the server. A configuration known as Just a Bunch Of Disks (JBOD) has been added to give customers an additional choice. JBOD is made possible by the application-level database replication that is part of DAG's functionality.
More than ever, it's important when using Exchange 2010 to take the time to estimate the correct amount of memory and IOPS load/capacity needed for deployments. Exchange 2010 has much higher memory requirements for large mailbox support than past versions, and all Exchange 2010 designs should use the Microsoft best practice guidelines and test tools for final validation.
As more and more applications take advantage of Exchange, it's common to see Exchange deployments that are undersized because third-party use hasn't been factored into their designs. When designing an Exchange 2010 system, ensure that all third-party applications are accounted for. Wireless devices, third-party add-ins for Outlook, and others all have an effect on the overall performance of your Exchange server.