How much space to allocate to user mailboxes is a vexing subject for many systems administrators. Allocate too little storage, and users will be unhappy as they perpetually hit limits and can't send mail. Allocate too much space, and the server will groan under the weight of thousands of messages kept well past their useful life. Walking the middle line is the trick: Allocate enough space to keep users productive while keeping the size of the Information Store under control. In this article, I'll discuss calculating the size of the Private Information Store to accommodate mailboxes, allocating and administering quotas, controlling message storage, analyzing storage patterns, and helping users stay within their quota.
Calculating the Size of the Information Store
When sizing hardware for Exchange Server, you need to know how much the databases will grow. That way, you can leave enough room for growth when you buy and configure disks. The size of the Information Store is also important because it determines the time required for backups.
For years, people have used the following formula to predict the size of the Private Information Store:
Mailbox Quota x Number of Mailboxes = Size of Private Information Store
For example, if you allocate 50MB to every mailbox and a server supports 1000 mailboxes, the Private Information Store will grow to 50GB (50MB x 1000 = 50GB).
Because Exchange couldn't support databases larger than 16GB before Exchange 5.5, administrators used the store size formula to determine how many users a server could support. But the formula doesn't take into account two important factors that influence the overall size of an Information Store: the single-instance storage model and the deleted-items cache.
The single-instance storage model. In the classic LAN-based design for email, each recipient receives a separate copy of a message. In a single-instance storage model, Exchange stores one copy of a message's content (plus any attachments) in a central repository (e.g., the Private Information Store). Thus, the single-instance model uses less storage space than the classic model. In the single-instance model, users access message content via pointers to one copy of the data, and Exchange keeps a reference count for each piece of content. The reference count tracks the number of users who maintain a pointer to the content. As people delete their pointers to a message (i.e., delete the message from their mailbox), Exchange decrements that message's reference count. When the count reaches zero, Exchange removes the content from the Information Store.
A Windows NT Performance Monitor counter (MSExchangeIS Private : Single Instance Ratio) gives a snapshot of how effective the single-instance storage model is on a server. This counter gives the ratio between the total number of message references and the total number of messages stored in the Information Store. This ratio shows the amount of storage that single-instance storage saves as a result of message sharing; thus, we can consider the ratio a sharing ratio.
My observations show that the value of this counter varies from close to 1 (very bad) to 3 or higher (good), depending on user communities' size and work habits. For example, a server that hosts large user communities will attain a higher ratio of message references to total number of messages. In a large community, Exchange is more likely to deliver messages to a local mailbox, which means that sharing can occur. Servers that host small user communities or groups that send a high percentage of messages to external recipients (on other servers or mail systems) will see low ratios. The value of the ratio on my server, which hosts a small group of consultants, is currently 1.86.
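The reference-count mechanism and the sharing ratio described above can be illustrated with a toy model. This is only a sketch of the idea, not Exchange's actual implementation; the class name, method names, and in-memory dictionaries are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SingleInstanceStore:
    """Toy model: one stored copy per message, shared through reference counts."""
    content: dict = field(default_factory=dict)     # message id -> body size in KB
    ref_counts: dict = field(default_factory=dict)  # message id -> number of mailbox pointers
    mailboxes: dict = field(default_factory=dict)   # mailbox name -> set of message ids

    def deliver(self, msg_id, size_kb, recipients):
        # Store the body once, then give each distinct local recipient a pointer.
        unique = set(recipients)
        self.content[msg_id] = size_kb
        self.ref_counts[msg_id] = len(unique)
        for name in unique:
            self.mailboxes.setdefault(name, set()).add(msg_id)

    def delete(self, name, msg_id):
        # Drop one mailbox's pointer; remove the body when no pointers remain.
        self.mailboxes[name].remove(msg_id)
        self.ref_counts[msg_id] -= 1
        if self.ref_counts[msg_id] == 0:
            del self.content[msg_id]
            del self.ref_counts[msg_id]

    def sharing_ratio(self):
        # Total message references divided by total messages stored.
        if not self.content:
            return 0.0
        return sum(self.ref_counts.values()) / len(self.content)


store = SingleInstanceStore()
store.deliver("m1", 10, ["ann", "bob", "cat"])  # one body, three pointers
store.deliver("m2", 5, ["ann"])                 # one body, one pointer
print(store.sharing_ratio())                    # 4 references / 2 messages
```

In this model, a message sent mostly to external recipients adds a stored body with few local pointers, which is why such servers see ratios close to 1.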
Exchange charges each recipient's quota with the full size of a message, rather than dividing the message size by the number of recipients and charging each mailbox on a pro rata basis. Thus, if a user sends a 10KB message to nine recipients on a server, Exchange charges 10KB against 10 mailboxes (the sender plus each recipient). The sharing ratio is very high immediately after a message arrives, and the ratio begins to decrease when recipients start deleting their copies.
If you consider both the sharing ratio and the way Exchange manages mailbox quotas, you see that the simple quota-times-the-number-of-users calculation is valid only if the sharing ratio value is 1. Because most servers enjoy a ratio value that's considerably higher than 1, Exchange can make much more logical space available within the Information Store than is physically available in disk space. For further information about single-instance storage, see my article "Inside the Exchange Information Store," Windows NT Magazine, April 1998.
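The gap between the space charged to quotas and the space physically stored can be shown with a little arithmetic. The sketch below uses hypothetical function names and, like the discussion above, treats the sharing ratio (which counts messages, not bytes) as a rough divisor for size.

```python
def delivery_accounting(message_kb, local_mailboxes):
    """Return (KB charged across all quotas, KB physically stored) for one message."""
    charged_kb = message_kb * local_mailboxes  # every mailbox is charged the full size
    stored_kb = message_kb                     # the store keeps a single copy
    return charged_kb, stored_kb


def physical_size_mb(total_charged_mb, sharing_ratio):
    """Approximate physical store size from the logical space charged to quotas."""
    if sharing_ratio <= 0:
        raise ValueError("sharing ratio must be positive")
    return total_charged_mb / sharing_ratio


# A 10KB message in the sender's mailbox plus nine local recipients' mailboxes.
charged, stored = delivery_accounting(10, 10)
print(charged, stored)

# 1,000 full 50MB mailboxes on a server with a sharing ratio of 1.8.
print(round(physical_size_mb(50 * 1000, 1.8)))
```

With a ratio of exactly 1, the logical and physical figures coincide, which is the only case in which the simple formula holds.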
Deleted-items cache. Exchange 5.5 lets servers maintain a deleted-items cache to retain deleted messages for an administrator-defined number of days after users delete the messages from a mailbox. The rationale is to let users quickly recover messages they've deleted in error. This process works well, but the deleted-items cache takes up room in the Information Store. (Jerry Cochran answers some questions about deleted-item recovery in "The Exchange Server Troubleshooter," page 13.)
The size of the deleted-items cache depends on the number of days Exchange retains messages after deletion. Generally, you set the period to between 7 days and 10 days, because people usually realize quickly that they've deleted a message in error. You can use two Performance Monitor counters (MSExchangeIS Private : Total Size of Recoverable Items and MSExchangeIS Private : Total Count of Recoverable Items) to monitor the size of the deleted-items cache and the number of items in the cache. You can assume that the cache will occupy between 5 percent and 10 percent of the Information Store, but be aware that this figure varies greatly.
If you take the single-instance storage model and the deleted-items cache into account, a better formula for predicting the size of the Private Information Store is
((Mailbox Quota x Number of Mailboxes) ÷ Sharing Ratio Value) + % of the Information Store Allocated to Deleted Items Cache = Size of Private Information Store
(In the formula, mailbox quota x number of mailboxes ÷ sharing ratio value equals the size of the Information Store before you increase it by a certain percentage to allow for the deleted-items cache.) If you use a sharing ratio value of 1.8 and allow 7 percent for the deleted-items cache, the predicted size of the Private Information Store is approximately 29.7GB.
((50MB x 1,000) ÷ 1.8) + (0.07 x 27,778MB) = 27,778MB + 1,944MB = 29,722MB, or 29.7GB (rounded)
To arrive at the final size, allow a contingency for internal database structures, a worse-than-expected sharing ratio, a surge in the deleted-items cache, or some amount of inefficiency in the database. I usually increase the result of the calculation by 20 percent for contingency, so the adjusted predicted size becomes 35.6GB.
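The whole calculation, including the contingency, fits in a few lines of code. The following is a minimal sketch; the function and parameter names are hypothetical, and it follows the article's convention of treating 1,000MB as 1GB.

```python
def predict_store_size_mb(quota_mb, mailboxes, sharing_ratio=1.0,
                          deleted_cache_fraction=0.0, contingency_fraction=0.0):
    """Predict the size of the Private Information Store in MB."""
    if sharing_ratio <= 0:
        raise ValueError("sharing ratio must be positive")
    # Logical space divided by the sharing ratio gives the base size.
    base_mb = quota_mb * mailboxes / sharing_ratio
    # Add the deleted-items cache as a percentage of the base size.
    with_cache_mb = base_mb * (1 + deleted_cache_fraction)
    # Add the contingency on top of everything else.
    return with_cache_mb * (1 + contingency_fraction)


simple = predict_store_size_mb(50, 1000)
adjusted = predict_store_size_mb(50, 1000, sharing_ratio=1.8,
                                 deleted_cache_fraction=0.07)
final = predict_store_size_mb(50, 1000, sharing_ratio=1.8,
                              deleted_cache_fraction=0.07,
                              contingency_fraction=0.20)
for label, mb in (("simple", simple), ("adjusted", adjusted), ("final", final)):
    print(f"{label}: {mb / 1000:.1f}GB")
```

Carrying full precision through the calculation gives roughly 35.7GB for the final figure; the 35.6GB quoted above comes from applying the 20 percent to the already rounded 29.7GB.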
You must make two further adjustments before you can decide on the size and number of physical disks you need to hold the store. First, you must decide the level of RAID protection you want to use. RAID5 requires the capacity of one extra disk for parity information, and RAID0+1 (striping and mirroring) requires double the disk capacity because the array writes all data twice. Both configurations protect the data in the Information Store, and you can use either configuration in a production environment. RAID0+1 provides better I/O throughput at the expense of additional disks. Second, completely filling disks is a bad practice, so estimate capacity on the assumption that you will use only 80 percent of a disk for storage.
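These two adjustments translate into a simple disk count. The sketch below is illustrative only: the function name is hypothetical, and the 9GB disk size is an assumed value chosen for the example, not a figure from this article.

```python
import math


def disks_needed(store_gb, disk_gb, raid="raid5", max_fill=0.80):
    """Estimate how many physical disks a store of store_gb needs."""
    # Plan to fill each disk only to max_fill (80 percent by default).
    capacity_needed_gb = store_gb / max_fill
    data_disks = math.ceil(capacity_needed_gb / disk_gb)
    if raid == "raid5":
        # One extra disk's worth of capacity holds parity;
        # RAID5 needs at least two data disks.
        return max(data_disks, 2) + 1
    if raid == "raid0+1":
        # Mirroring doubles the number of disks.
        return data_disks * 2
    raise ValueError(f"unsupported RAID level: {raid}")


# The 35.6GB estimate on assumed 9GB disks.
print(disks_needed(35.6, 9, raid="raid5"))
print(disks_needed(35.6, 9, raid="raid0+1"))
```

Under these assumptions the store needs five data disks, so RAID5 takes six disks and RAID0+1 takes ten.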
Because many details in the adjusted calculation are imprecise, many systems administrators continue to use the original, simplistic calculation. Using the original formula has some value because this formula usually calculates too much disk capacity, and too much capacity is better than too little.
Allocating Mailbox Quotas
Most installations I see let users store between 40MB and 60MB in their mailboxes. In 1996, when Microsoft introduced Exchange, installations often set quotas to between 20MB and 30MB. Today's larger quotas reflect the availability of cheaper and larger disks, the elimination of the 16GB limit for an Exchange database (for Exchange Enterprise Edition), and users' inclination to store large documents in their mailboxes. In addition, systems administrators are wiser about using the single-instance storage model to its fullest extent; fewer administrators now use Personal Stores (PST files) as the primary message stores. Data-sharing and groupware functions also account for the heavier mailbox use, because you can't share data in a PST file.
You can set quotas for each server or for individual mailboxes. You set the default values for mailbox storage limits as properties of the Information Store on the server. As Screen 1 shows, Exchange implements quotas in three steps.
- When the mailbox is at 90 percent of its quota, the Exchange System Attendant sends a warning message to the mailbox's owner that the mailbox is approaching its quota. At that point, the recipient can delete some messages to recover storage space.
- When the mailbox reaches its quota, Exchange stops the user from sending any new messages from that mailbox. Exchange still accepts incoming messages, but the server rejects any attempt to send new messages or replies, or to forward messages.
- When the mailbox exceeds its quota, the Information Store rejects all incoming and outgoing messages for the mailbox. People who send messages to the mailbox receive a nondelivery notification. This restriction is new in Exchange 5.5; in earlier versions of Exchange, a mailbox could continue to receive messages until the volume holding the Private Information Store ran out of disk space.
When you are setting these values, leave a reasonable amount of storage between one step and the next, so users will have time to react to a warning before Exchange prohibits them from sending new mail. The values that Screen 1, page 11, shows are usually enough to let people work without too many system interruptions.
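The three steps amount to a small state check against three thresholds. The sketch below is an illustration, not Exchange's own logic: the names are hypothetical, the threshold values are made up rather than taken from Screen 1, and whether each limit triggers at or just above its value is an assumption.

```python
from enum import Enum


class QuotaState(Enum):
    OK = "ok"
    WARNED = "warning issued"
    SEND_PROHIBITED = "send prohibited"
    SEND_RECEIVE_PROHIBITED = "send and receive prohibited"


def quota_state(size_kb, warn_kb, prohibit_send_kb, prohibit_send_receive_kb):
    """Classify a mailbox by comparing its size with the three storage limits."""
    if not warn_kb <= prohibit_send_kb <= prohibit_send_receive_kb:
        raise ValueError("limits must be in ascending order")
    # Check the most restrictive limit first.
    if size_kb >= prohibit_send_receive_kb:
        return QuotaState.SEND_RECEIVE_PROHIBITED
    if size_kb >= prohibit_send_kb:
        return QuotaState.SEND_PROHIBITED
    if size_kb >= warn_kb:
        return QuotaState.WARNED
    return QuotaState.OK


# Illustrative limits in KB: warn at 45,000, stop sending at 50,000,
# stop sending and receiving at 55,000.
LIMITS = (45_000, 50_000, 55_000)
for size in (30_000, 46_000, 51_000, 56_000):
    print(size, quota_state(size, *LIMITS).value)
```

The gap between consecutive limits is the room a user has to clean up before the next restriction takes effect.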
You can set specific quotas on selected mailboxes by changing the mailbox's properties, as Screen 2 shows. Note that on the same property page, you can also set a maximum message size. You can use mailbox-specific quotas to allocate more space to people who need it. (Don't worry—my quota is 350MB, not 35MB; the screen masks the last digit. You probably don't want too many users like me cluttering up your server.) Because user habits affect how quickly mailboxes fill up, teach users how to use email efficiently. The sidebar "User Habits Affect the Message Store" offers some tips for reducing message size.
Controlling Message Storage
Once you've assigned a quota to a mailbox, Exchange gives you no proactive tools for controlling the messages people store in a mailbox. For example, unlike other high-end messaging systems, Exchange doesn't offer an analysis tool for scanning through people's mailboxes and reporting on the total number of items, average number of items per folder, the largest folder, and other characteristics.
The Exchange Administrator program's Tools, Clean Mailbox option, which Screen 3 displays, lets you select one or more mailboxes and delete messages older than a specified number of days or larger than a certain size. You can remove messages from the system immediately or move them into the Deleted Items folder. The tool works, but it is very labor-intensive, and you can't automate it. Use this option with caution, because in some cases the Clean Mailbox option deletes Calendar entries, Contacts, and other information in Outlook. (For more information about this bug, see Knowledge Base article "XL98: Unable to Create a Transparent Chart Area," http://support.microsoft.com/support/kb/articles/q179/0/48.asp.)
Screen 4 shows the Mailbox Cleanup Agent, which is in the Exchange Server Resource Kit (part of the Microsoft BackOffice Resource Kit—use Mailbox Cleanup Agent 1.9 or later with Exchange 5.5). This agent works in the background to clean up mailboxes. The tool runs as a scheduled NT service and can process a selected range of mailboxes from a complete recipients container (i.e., a group of mailboxes, distribution lists, and other directory entities) in one run. For some reason, however, many systems administrators are reluctant to deploy a utility from a resource kit, and the Mailbox Cleanup Agent still isn't part of the shrink-wrapped product. For further details about the utility, see my sidebar "The Mailbox Cleanup Agent for Exchange," Windows NT Magazine, June 1997.
Analyzing the Message Store
Exchange provides some basic reporting facilities to discover the amount of space mailboxes are occupying in the Private Information Store. You can use Exchange Administrator to view mailbox resources, as Screen 5 shows, but scrolling through hundreds of mailboxes is not an efficient way to gather and analyze information. Fortunately, Exchange 5.5 introduced an undocumented feature, Save Window Contents, which helps enormously.
Save Window Contents is an option on Exchange Administrator's File menu. When you choose the Save Window Contents option, you can save the data from the right side of the Exchange Administrator screen into a Comma Separated Values (CSV) format file. You can import CSV data into spreadsheet (e.g., Microsoft Excel) or database (e.g., Microsoft Access) programs, and generate reports. You can regularly review mailbox sizes against quotas, if only to track growth.
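A short script can also do the review directly from the saved file. The following sketch assumes the CSV has a header row with columns named "Mailbox" and "Total K"; those column names, the function name, and the sample rows are assumptions for illustration, so check them against the file your server actually produces.

```python
import csv
import io


def mailboxes_near_quota(csv_text, quota_kb, threshold=0.9,
                         name_col="Mailbox", size_col="Total K"):
    """Return (mailbox, size in KB) pairs at or above threshold * quota, largest first."""
    flagged = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Tolerate thousands separators and stray spaces in the size column.
        size_kb = int(row[size_col].replace(",", "").strip())
        if size_kb >= threshold * quota_kb:
            flagged.append((row[name_col], size_kb))
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Made-up sample data standing in for a saved Mailbox Resources view.
SAMPLE = """Mailbox,Total K,Total no. Items
Ann Example,48200,3100
Bob Example,12050,640
Cat Example,51300,4020
"""

for name, size_kb in mailboxes_near_quota(SAMPLE, quota_kb=50_000):
    print(f"{name}: {size_kb}KB")
```

To process a real export, read the file's text (for example, with open(path, newline="").read()) and pass it in place of SAMPLE.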
If you don't want to create your own analysis tools, you can use the Exchange Server Resource Kit's Crystal Reports utility, an abbreviated version of the commercially available Crystal Reports product (http://www.seagatesoftware.com/crystalreports), to extract and report on data from Exchange. If you're running a medium-to-large Exchange server, you will find the reports you can generate valuable. The Knowledge Base article "XADM: How to Generate Reports on Mailbox Resources" (http://support.microsoft.com/support/kb/articles/q178/8/44.asp) gives detailed instructions for generating reports with Save Window Contents and the Crystal Reports utility.
Not Too Big, Not Too Small
Users won't complain if the Information Store is too big—they want only to be able to work without impediment—but users won't thank you if you set mailbox quotas too small. Setting and managing quotas isn't a hot topic, but getting quotas right can help make the difference between a good systems administrator and an excellent systems administrator.