Search & Destroy Email Content with Exchange 2010

Use the Search-Mailbox and New-MailboxSearch cmdlets to clean house

Multi-mailbox discovery searches receive a lot of headline attention when discussion turns to the features of Microsoft Exchange 2010 (or Exchange Online, as deployed in Microsoft Office 365). And why shouldn't this be the case? Microsoft invested heavily during the development of Exchange 2010 to create an array of features that could satisfy the compliance requirements of large organizations. Although small organizations also need to comply with legislative or other regulatory directives, large organizations tend to devote the most attention to this aspect of email -- if only because they are often targets for discovery actions launched by external parties.

In any case, although the legal community will luxuriate in its ability to expedite discovery searches and review the results, messaging administrators often have more mundane concerns. For example, how do you remove objectionable items from user mailboxes without teaching every user how to use Outlook or another client to purge items, especially when accessing an item might download a malicious payload? The good news is that Exchange 2010's built-in compliance features can also be used to locate and eradicate problematic items.

Email as a Virus Vector

In the early days of email viruses -- around the time that users still happily opened any message that proclaimed love for the proud recipient (the first I Love You virus appeared in May 2000) -- many antivirus engines that protected email servers were slow and ponderous. These engines depended on the ability to log on to every mailbox on a server to check incoming messages. As the number of mailboxes grew and the volume of messages increased, this technique struggled to cope. Viruses could often sneak past the checks on incoming email to penetrate mailboxes. In these cases, administrators might be forced to log on to user mailboxes to check for and remove problem messages before they could spread infection.

It was only after Sybari (bought by Microsoft in 2005) introduced the "ESE shimmy," enabling its antivirus engine to load its code before the Information Store, that we had reliable and robust antivirus products for Exchange that could catch viruses quickly. Today's antivirus products all use a supported Microsoft API for fast and reliable access to mailbox contents.

Evolving Needs for Search and Destroy

With servers protected by reliable antivirus barriers, administrators aren't likely to be forced to rush to disinfect mailboxes by searching and removing infected messages. However, we live in a litigious environment, so the need for search-and-destroy activities has evolved. It's common to receive requests from an authority (e.g., the HR department, senior management, legal advisors), asking administrators to remove specific messages from user mailboxes. Perhaps someone sent out information that they should not have, or a company is compelled by a legal order to remove all references to an event, project, or product. In such circumstances, an Exchange administrator starts to consider using the Search-Mailbox cmdlet.

Exchange 2010 includes a GUI to create and execute multi-mailbox discovery searches from the Exchange Control Panel (ECP). These searches use the New-MailboxSearch cmdlet to search a set of specified mailboxes and copy the results to a discovery mailbox. The big difference between the two cmdlets is that Search-Mailbox can search and remove content (i.e., seek and destroy) from a specified mailbox, whereas New-MailboxSearch is optimized to scan as many as 25,000 mailboxes and then copy the discovered content. The limit of 25,000 is set to restrict the amount of memory that multi-mailbox searches use. If necessary, you can update the system registry to increase this number by following the steps described in the article "Exchange 2010 Discovery: Modify the maximum number of mailboxes searched at a time." Another feature of New-MailboxSearch (available from Exchange 2010 Service Pack 1, or SP1, onwards) is the ability to deduplicate search results so that separate copies of the same item aren't taken from multiple mailboxes.

You can use Search-Mailbox to process multiple mailboxes. However, you must first form a collection of the desired mailboxes by using a cmdlet such as Get-Mailbox, and then pipe the resulting data for processing by Search-Mailbox. The downside of using Search-Mailbox is that Exchange provides no UI in either Exchange Management Console (EMC) or ECP to construct and execute searches, as it does for multi-mailbox discovery searches. Instead, you must invoke these searches through Exchange Management Shell (EMS). The commands that I describe in this article are valid for both on-premises Exchange and Exchange Online.

Finding Data

The first order of business is to define what you want to find. In general, the more specific the search criteria, the better, faster, and more accurate the search will be. Casting a net to find every item with a subject containing "Test" on a large mailbox server will keep the computer occupied, but the results are unlikely to satisfy anyone.

Both the Search-Mailbox and New-MailboxSearch cmdlets support Advanced Query Syntax (AQS), a powerful method to build searches against the mix of content found in mailbox data: free text that can contain just about anything, plus well-known properties such as author, subject, and date. The trick in successful Exchange searches, both in simple mailbox searches and multi-mailbox discovery searches, is to spend time making the search query as specific as possible before you launch it on a server.

In this case, let's assume that many tasteless messages have recently appeared in mailboxes. You want to perform a public service for users by removing these messages. You know the date range when the messages appeared, as well as some of the not-so-nice terms that the message body contains. Equipped with this knowledge, you can build a query and test its effectiveness.

To begin, you'll search just one mailbox. Ideally, choose one that you know holds some of the target messages, and run the following command:

Search-Mailbox -Identity 'Billing' -SearchQuery "Received: > $('01/01/2012 00:00:00') AND Received: < $('01/31/2012 23:59:59') AND hookup" -LogLevel Full -LogOnly -TargetMailbox 'AdminMailbox' -TargetFolder 'Search Results'

This command:

  • searches the Billing mailbox, as indicated in the -Identity parameter
  • uses an AQS query that means "find any item received between January 1 and January 31, 2012, that contains the word hookup"
  • creates a full log of operations (-LogLevel Full) but, because -LogOnly is specified, only reports what it finds without copying or deleting any items
  • puts the results in the Search Results folder of the AdminMailbox mailbox

Note the use of times in the AQS query. You don't need to pass time details -- a date is usually enough to find data -- but best practice is to be as exact as you can whenever you look for information.
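Before generating a full log, it can help to check how many items a query matches. The following is a minimal sketch, assuming an Exchange Management Shell session and the sample Billing mailbox used above. The $start, $end, and $query variable names are illustrative, and the date format is assumed to match the server's regional settings.

```powershell
# Build the AQS query from variables so that the date range is easy to adjust.
# The variable names are illustrative and not part of any Exchange API.
$start = '01/01/2012 00:00:00'
$end   = '01/31/2012 23:59:59'
$query = "Received: > $start AND Received: < $end AND hookup"

# -EstimateResultOnly reports the number and size of matching items
# without copying, logging, or deleting anything.
Search-Mailbox -Identity 'Billing' -SearchQuery $query -EstimateResultOnly |
    Format-List Identity, ResultItemsCount, ResultItemsSize
```

If the count looks far higher or lower than expected, tighten the query before moving on to a logged search.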

The output of this search is a message that's created in the destination folder. As you can see in Figure 1, the search results indicate that three items have been found. Also note that Exchange has attached a ZIP file to the message. This file contains a comma-separated values (CSV) file with the details of the found items. You can use this information to confirm that the correct items have been located.

Figure 1: Search results indicating targeted items found

Deleting Content

After you're satisfied that you have a solid set of search criteria ready to go, you can modify the previous command to add processing power. Remember that Search-Mailbox operates on just one mailbox at a time. Sometimes this is sufficient, but not when you're trying to eliminate problematic messages from every mailbox on a server.

One method is to read a list of mailboxes from a data file and feed the mailbox names, one by one, to Search-Mailbox. This is a good approach when you need to process a set of mailboxes that are spread across multiple mailbox servers or perhaps the output of an external data feed, such as from an HR system. However, the usual approach is to use the Get-Mailbox cmdlet to build whichever set of mailbox objects need to be processed, and then to pipe those objects to Search-Mailbox. In the following example, I tell EMS to process every mailbox in the organization. This works for a test or a small organization, but it's probably better to break things up if you have more than 1,000 mailboxes to deal with. That way, the processing load is spread over time or over servers. For example, you could use Get-Mailbox to build a list of every mailbox in a database, every mailbox on a server, and so on.

The other major addition to the command is the inclusion of the DeleteContent parameter. This parameter instructs Exchange to permanently delete the located items from the source mailboxes. If you provide values for the TargetMailbox and TargetFolder parameters, Exchange will copy the items before it deletes them from the source mailboxes. Copying items before deleting them can be an invaluable safeguard if a mistake creeps in and data is removed incorrectly. Should this happen, you can recover the situation by copying the items to a PST and then using the New-MailboxImportRequest cmdlet to import the items back into their rightful place in the user mailbox. This two-step approach is necessary if you can't open the user's mailbox to drag and drop the items from one location to another.

If you copy items, be sure that the target mailbox has sufficient quota to hold the copied items, which could amount to quite a lot should you process many mailboxes. You cannot specify a folder in the mailbox that you search to use as the target.

Get-Mailbox | Search-Mailbox -SearchQuery "Received: > $('01/01/2012 00:00:00') AND Received: < $('01/31/2012 23:59:59') AND hookup" -LogLevel Full -TargetMailbox 'AdminMailbox' -TargetFolder 'Search Results' -DeleteContent

Because the DeleteContent parameter is included, EMS prompts for confirmation before executing the command.

After being launched, Search-Mailbox opens each mailbox in its input list, searches for the targeted items, and removes any items that match the search criteria. In this instance, we've provided a target mailbox and folder, so Exchange first copies the located items. Exchange creates the target folder if it doesn't exist in the nominated mailbox.
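The idea of breaking a large job into batches, or feeding mailbox names from a file, can be sketched as follows. This is an illustration under stated assumptions rather than a tested procedure: DB01 and C:\Temp\Mailboxes.txt are hypothetical names, and both variants use -LogOnly so that nothing is removed while you verify the scope.

```powershell
$query = "Received: > 01/01/2012 00:00:00 AND Received: < 01/31/2012 23:59:59 AND hookup"

# Variant 1: process one database at a time ('DB01' is a hypothetical database name).
Get-Mailbox -Database 'DB01' -ResultSize Unlimited |
    Search-Mailbox -SearchQuery $query -LogLevel Full -LogOnly `
        -TargetMailbox 'AdminMailbox' -TargetFolder 'Search Results'

# Variant 2: read mailbox names, one per line, from a text file (hypothetical path),
# for example a list supplied by an HR system.
Get-Content 'C:\Temp\Mailboxes.txt' | ForEach-Object {
    Search-Mailbox -Identity $_ -SearchQuery $query -LogLevel Full -LogOnly `
        -TargetMailbox 'AdminMailbox' -TargetFolder 'Search Results'
}
```

Once the logged results look right, replacing -LogOnly with -DeleteContent turns either variant into a destructive run.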

Just like multi-mailbox discovery searches, the mailboxes that you search are assigned a subfolder under the target folder; the search date and time are used as part of the folder name, to identify the particular search. Under this folder, you'll find an additional subfolder for each folder in which an item was found, as Figure 2 shows. Copies of the found items are stored in the relevant subfolders. Unlike multi-mailbox discovery searches, empty folders are not created if no items are found.

Figure 2: The folder structure for discovered items
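If items are deleted by mistake, the PST-based recovery route mentioned earlier might look roughly like this on an on-premises server. Treat it as a sketch: the UNC path and the Recovered Items folder name are hypothetical, and it assumes that the Search Results folder holds copies from only the Billing mailbox. If the folder holds copies from several mailboxes, narrow the export first (for example, with the IncludeFolders parameter).

```powershell
# Hypothetical UNC path; the Exchange servers need read/write access to the share.
$pst = '\\FileServer\PST\SearchResults.pst'

# Step 1: export the copied items from the target mailbox to a PST.
New-MailboxExportRequest -Mailbox 'AdminMailbox' -SourceRootFolder 'Search Results' -FilePath $pst

# Check progress; wait until the export has completed before importing.
Get-MailboxExportRequest | Get-MailboxExportRequestStatistics

# Step 2: import the PST into the user's mailbox under a clearly named folder.
New-MailboxImportRequest -Mailbox 'Billing' -FilePath $pst -TargetRootFolder 'Recovered Items'
```

Importing into a separate folder rather than the original locations keeps the recovered copies easy to review before they are moved back.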

By default, Exchange searches an archive mailbox if one exists, and creates a separate set of folders for any items found in the archive. You can exclude archives from searches by passing the DoNotIncludeArchive parameter. The contents of the Recoverable Items folder are also searched unless you set the SearchDumpster parameter to $False.
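As a hedged illustration, a search that is restricted to the primary mailbox and skips Recoverable Items could combine both parameters like this. The mailbox names reuse the earlier examples, and -LogOnly keeps the run non-destructive.

```powershell
$query = "Received: > 01/01/2012 00:00:00 AND Received: < 01/31/2012 23:59:59 AND hookup"

# -DoNotIncludeArchive skips the archive mailbox;
# -SearchDumpster:$false skips the Recoverable Items folder.
Search-Mailbox -Identity 'Billing' -SearchQuery $query `
    -DoNotIncludeArchive -SearchDumpster:$false `
    -LogLevel Full -LogOnly `
    -TargetMailbox 'AdminMailbox' -TargetFolder 'Search Results'
```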

Some RBAC Constraints

The DeleteContent parameter is available only to on-premises administrators who have been assigned the Mailbox Import Export management role through Role Based Access Control (RBAC); by default, this role isn't assigned to anyone. I think that Microsoft provides this extra safeguard to ensure that only suitably authorized users who run the Search-Mailbox cmdlet can delete content. By comparison, any Office 365 tenant administrator who is a member of the Organization Management role group can delete content immediately because they automatically hold the Mailbox Import Export role. On-premises Exchange and Exchange Online operate radically different RBAC environments, and this is just one example of where the two differ.

You can use the following command to see the current set of assignments for the Mailbox Import Export role:

Get-ManagementRoleAssignment -Role "Mailbox Import Export" -GetEffectiveUsers | Format-List RoleAssigneeName, EffectiveUserName

Mailbox Import Export is a management role rather than a built-in role group, so you grant it by creating a role assignment with the New-ManagementRoleAssignment cmdlet. For example, this command assigns the role to a user called Joe Smith:

New-ManagementRoleAssignment -Role "Mailbox Import Export" -User "Joe Smith"

Easy to Delete

Both Exchange 2010 and Exchange Online include powerful search-and-destroy facilities. I hope that you never need to clean out funky items in user mailboxes, but it's good to know that doing so is easy!
