Extending the Active Directory Schema

Historically, both Active Directory (AD) administrators and IT managers have been fearful of extending the AD schema. Much of this fear stems from Microsoft documentation in the Windows 2000 era that made schema extensions appear to be dangerous and something best done with extreme caution. However, with a bit of planning and due diligence, extending your AD schema doesn’t have to be something to fear.

The AD schema defines the structure of the data stored in the directory. Out of the box, AD supports many different types of objects (e.g., users) and attributes (e.g., first and last name). When the base schema that comes with AD doesn’t lend itself well to data you need to store in the directory, you can extend the schema with custom objects and attributes.

Typically, the AD schema is extended for a number of reasons. For many organizations, the most common reason is the implementation of an application that requires a schema extension. Microsoft Exchange is a perfect example of this. Third-party software vendors also sometimes require schema extensions to support their application. Also quite common is extending the schema to support an internally developed application, or to provide a location to store proprietary data in AD.

Data Storage Options

The first thing you should evaluate when considering a schema extension, particularly for an in-house application, is whether the data is appropriate for storing in AD. Particularly well suited to AD is data that is relatively static (i.e., it doesn’t change often), that is necessary across your organization (because it will replicate across domains), and that isn’t highly sensitive (e.g., you shouldn't store birth dates, social security numbers, and so on in AD).

If you have data that doesn’t match these criteria but still needs to be in an LDAP directory, a second option might be a good fit for you. AD Lightweight Directory Services (AD LDS, formerly ADAM) is a standalone version of AD that can run as a service on a member server or domain controller (DC) and be queried via LDAP, just like AD. Rather than being constrained by the need to place AD DCs to support enterprise authentication and application requirements, you can tightly control who can read data and where that data is replicated by only placing AD LDS instances in appropriate locations.

Data Storage Primitives

To understand the AD schema, you need to know two key terms: class and attribute. Everything in AD, including the schema itself, is defined in terms of classes and attributes. Classes are types of data you want to store. For example, user is a class in AD, as is computer. Attributes are properties of classes. The user class has a first-name attribute (givenName) and a last-name attribute (sn). The computer class has an OS attribute. The AD schema is defined in terms of two classes: classSchema for classes and attributeSchema for attributes.

If you’re familiar with a typical database, another way to think of this is that classes are analogous to tables in the database, and attributes are analogous to columns within a table. Note, however, that you shouldn’t get the impression that this is how the AD database (the Directory Information Tree—DIT) is structured, because it’s actually quite different.

When thinking about storing a new type of data in AD, you need to think about how the data maps to classes and attributes. For most common extensions, adding an attribute to an existing class (e.g., user or group) is sufficient. When you simply need to store a new piece of data about an existing type of object (such as users), you should first determine whether an existing attribute in AD is appropriate. The schema contains thousands of attributes and most of them are unused. So, for example, if you wanted to store a user’s mail stop, you might consider the physicalDeliveryOfficeName attribute of users.

Repurposing an attribute for something other than its intended use is a bad idea. Consider the scenario in which you repurpose an attribute for something other than its intended use and then you buy an application that uses the attribute for its original purpose. You need to do double the work because you have to reconfigure the existing application using the attribute and then move the data. In general, it’s always safer to add a custom attribute than to take this risk.

But sometimes you need to think in terms of classes; in a couple of scenarios, adding new classes to the schema makes more sense than using attributes. The first scenario is when you need to track a totally new type of data in the directory. If, for example, you wanted to keep track of the company cars in AD, it would probably make sense to define a new car class in the schema. Another scenario is when you need to do a one-to-many mapping.

Microsoft Exchange Server 2010 provides a perfect example of this. Each mobile device a user has synchronized to Exchange using ActiveSync is stored as an instance of a special object class (msExchActiveSyncDevice) in the directory. These mobile device objects are stored as child objects under the user who owns the device. This design permits the mapping of a significant number of attributes (for each device) to a single user.

Schema Extension Inputs

To create a custom schema extension, you need to gather a number of key inputs before you can implement your custom attribute or class in a development environment. Many of these inputs are required to be globally unique, so it’s important to do the necessary prerequisite work before proceeding. Cutting corners on this prep work is how schema extensions become dangerous.

The first thing you need to decide on is the name of your class or attribute. The most important part of the name is the prefix. Because attributes and classes need to have unique names in your schema (and your customers’ schemas, if you’re selling an application), adding a prefix helps ensure that your ID attribute doesn’t conflict with someone else’s ID attribute.

Typically, you use an abbreviated form of your company name for the prefix. For example, I use bdcLLC as the prefix for attributes that my company (Brian Desmond Consulting, LLC) creates. You might use abcCorp if you are ABC Corporation. Just think about the uniqueness of your prefix because no overall registry of prefixes exists. If you work for a company with a very common name, or an abbreviated name, think about how to make it unique.
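As a rough illustration of enforcing a prefix convention before a name ever reaches the schema, the sketch below builds a prefixed name and rejects collisions. The helper name `prefixed_name`, the hyphen separator, and the set of already-allocated names are assumptions made for the example; only the bdcLLC prefix comes from the text.

```python
def prefixed_name(prefix: str, base: str, existing: set[str]) -> str:
    """Return '<prefix>-<base>', refusing names that collide case-insensitively."""
    if not prefix or not base:
        raise ValueError("prefix and base name are both required")
    name = f"{prefix}-{base}"
    # Compare case-insensitively so near-duplicates are caught as well.
    taken = {n.lower() for n in existing}
    if name.lower() in taken:
        raise ValueError(f"{name} is already in use")
    return name


if __name__ == "__main__":
    # Hypothetical names your organization has already allocated.
    allocated = {"bdcLLC-ShoeSize"}
    print(prefixed_name("bdcLLC", "MailStop", allocated))
```

A check like this only guards against collisions within names you already track; it cannot tell you whether another vendor happens to use the same prefix.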

After you decide on a name, think about the Object Identifier (OID) you assign to your attribute or class. OIDs are an additional component that needs to be globally unique. AD (more generically, LDAP) isn’t the only thing that uses OIDs for identifiers, so the Internet Assigned Numbers Authority (IANA) assigns unique OID trees to organizations upon request. Requesting a Private Enterprise Number, which is the portion of the OID tree unique to your organization, takes about 10 minutes and is free. You want to do this before you start creating custom schema extensions. You can request a Private Enterprise Number at www.iana.org/cgi-bin/assignments.pl.

After you have a Private Enterprise Number, you can create a seemingly infinite number of unique OIDs and organize them. Figure 1 shows a diagram of the OID tree for my company’s Private Enterprise Number. Because you create OIDs by appending branches to the tree, many organizations first begin by creating an AD Schema branch (1.3.6.1.4.1.35686.1 in Figure 1), and then under that branch they create a class branch and an attribute branch. Under each of these branches OIDs will be allocated for each new attribute or class. In Figure 1, I allocated an OID (1.3.6.1.4.1.35686.1.2.1) for a custom attribute, myCorp-ImportantAttr. It’s extremely important that you devise an internal tracking solution (such as an Excel spreadsheet or SharePoint list) to ensure that OIDs are always uniquely allocated.
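The tracking solution does not have to be elaborate. As a minimal sketch of the idea, the following Python class hands out the next free OID under a class branch or an attribute branch and records which name owns it. The class name `OidRegistry` and the choice of `.1` for the class branch are assumptions; the `.2` attribute branch follows the example OID in the text.

```python
from dataclasses import dataclass, field


@dataclass
class OidRegistry:
    """Tracks OIDs handed out under one AD Schema branch."""
    schema_branch: str                 # e.g. the AD Schema branch under your PEN
    class_arc: int = 1                 # assumed sub-branch for classes
    attribute_arc: int = 2             # sub-branch for attributes, as in the text
    allocated: dict[str, str] = field(default_factory=dict)  # OID -> name

    def _next(self, arc: int, name: str) -> str:
        if name in self.allocated.values():
            raise ValueError(f"{name} already has an OID")
        base = f"{self.schema_branch}.{arc}"
        used = []
        for oid in self.allocated:
            parent, _, last = oid.rpartition(".")
            if parent == base:
                used.append(int(last))
        # Never reuse a number: always take one past the highest allocated.
        oid = f"{base}.{max(used, default=0) + 1}"
        self.allocated[oid] = name
        return oid

    def allocate_attribute(self, name: str) -> str:
        return self._next(self.attribute_arc, name)

    def allocate_class(self, name: str) -> str:
        return self._next(self.class_arc, name)


if __name__ == "__main__":
    registry = OidRegistry("1.3.6.1.4.1.35686.1")
    # First attribute allocation: 1.3.6.1.4.1.35686.1.2.1
    print(registry.allocate_attribute("myCorp-ImportantAttr"))
```

In practice the `allocated` mapping would be persisted somewhere shared (the spreadsheet or list mentioned above) so that two people cannot allocate the same OID.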

Microsoft provides a script you can run that will generate a random OID, but you have no guarantee it will be unique. Best practice is to request a unique assigned branch from IANA and to use that for your schema extensions. Given how easy this process is, you never have a reason to use Microsoft's OID generation script.

The remaining two inputs you need to decide on are specific to attributes and depend on the type of attribute that you want to implement. The first applies to linked attributes. Linked attributes, which are extremely useful, are used for storing relationships between objects in AD. They are stored as pointers in the AD database so that the relationships are always up-to-date in relation to the whereabouts of the object in the forest. Two common examples of linked attributes are group membership (member and memberOf) and manager/employee relationships (manager/directReports). In discussions of linked attributes, you often see the concept of forward and backward links. The forward link is the editable portion of the linked attribute relationship. For example, with group membership, the member attribute on the group is the forward link; the memberOf attribute on the user is the backward link. When editing a group’s membership, the modification must be made to the member attribute (the forward link) rather than the member object’s memberOf attribute (the backward link).

To define linked attributes in AD, you need to define two attributes (the forward and backward links) and attach a link identifier (linkID) to each of these attributes. Link IDs need to be unique within the forest, and because other applications that might extend your schema need to use link IDs too, you want yours to be globally unique. Microsoft used to have a process for issuing link IDs to organizations, but in Windows Server 2003 the company replaced that process with a special indicator to AD that allows AD to generate unique link IDs when you extend the schema with a linked attribute pair.

AD expects link IDs to be sequential numbers. Specifically, AD expects the forward link attribute to have an even link ID, and the next sequential (odd) number to be assigned to the backward link attribute. For example, with member and memberOf (group membership), the link ID for member is 2, and the link ID for memberOf is 3. If you need to support Windows 2000 forests with your schema extension, you need to continue defining static link IDs in the manner described here. Otherwise, you should use the auto link ID process introduced in Windows Server 2003. If your extension includes linked attributes, follow the steps below when you build it, as discussed later in this article.

Create the forward link attribute first, using a link ID of 1.2.840.113556.1.2.50. Although this value is an OID, Microsoft simply reserved it for the special purpose of requesting an auto link ID.
Reload the schema cache.
Create the backward link attribute, using the name of the forward link attribute as its link ID.
Reload the schema cache.
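
The four steps above can be sketched as a short Python script that renders them as LDIF text in the required order. This is an illustration under stated assumptions, not a complete or validated schema definition: the attribute names, OIDs, and schema DN in the usage example are hypothetical, and a real extension may need further attributes (such as oMObjectClass) and must be tested in a disposable environment first.

```python
AUTO_LINK_ID = "1.2.840.113556.1.2.50"  # reserved auto link ID value from the text

# LDIF operation that asks the DC to reload its schema cache.
RELOAD_SCHEMA_CACHE = "\n".join([
    "dn:",
    "changetype: modify",
    "add: schemaUpdateNow",
    "schemaUpdateNow: 1",
    "-",
])


def link_attribute(name: str, oid: str, link_id: str, schema_dn: str) -> str:
    """Render one DN-valued attributeSchema entry carrying the given linkID."""
    return "\n".join([
        f"dn: CN={name},{schema_dn}",
        "changetype: add",
        "objectClass: attributeSchema",
        f"lDAPDisplayName: {name}",
        f"attributeID: {oid}",
        "attributeSyntax: 2.5.5.1",  # distinguished name syntax
        "oMSyntax: 127",
        "isSingleValued: FALSE",
        f"linkID: {link_id}",
    ])


def linked_pair_ldif(forward: tuple[str, str], backward: tuple[str, str],
                     schema_dn: str) -> str:
    """Build LDIF for a linked pair; each argument is a (name, oid) tuple."""
    fwd_name, fwd_oid = forward
    back_name, back_oid = backward
    steps = [
        link_attribute(fwd_name, fwd_oid, AUTO_LINK_ID, schema_dn),  # step 1
        RELOAD_SCHEMA_CACHE,                                         # step 2
        link_attribute(back_name, back_oid, fwd_name, schema_dn),    # step 3
        RELOAD_SCHEMA_CACHE,                                         # step 4
    ]
    # LDIF records are separated by a blank line.
    return "\n\n".join(steps) + "\n"


if __name__ == "__main__":
    # Hypothetical names, OIDs, and forest DN, for illustration only.
    print(linked_pair_ldif(
        ("bdcLLC-Owner", "1.3.6.1.4.1.35686.1.2.10"),
        ("bdcLLC-OwnerBL", "1.3.6.1.4.1.35686.1.2.11"),
        "CN=Schema,CN=Configuration,DC=example,DC=com",
    ))
```

Generating the file from code keeps the ordering constraint (forward link, reload, backward link, reload) in one place instead of relying on someone to hand-edit the LDIF correctly.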

The second item that’s unique to attributes, and is also optional, is the MAPI ID. MAPI IDs are specific to Exchange Server. If you don’t have Exchange or your attribute doesn't need to be surfaced in the Global Address List (GAL), you can skip this section. MAPI IDs are used to display attributes on one of the property pages in the address book, such as the user General details template that Figure 2 shows. If, for example, you want to display employee classification (e.g., full-time employee or contractor) in the GAL, you need to assign a MAPI ID to the attribute that stores it. After you assign a MAPI ID to an attribute, you can use the Exchange Details Templates Editor to add that attribute’s data to the view provided in the GAL inside Office Outlook.

MAPI IDs must be unique, much like OIDs and link IDs. In the past you had no way to generate unique MAPI IDs, so these IDs were always a sticking point in the realm of schema extensions. Fortunately, Windows Server 2008 introduced a process to automatically generate unique MAPI IDs within the directory to reduce the risk of duplicate MAPI IDs. To use this functionality, assign a value of 1.2.840.113556.1.2.49 to the MAPI ID attribute when you create the attribute. AD will generate a unique MAPI ID for the attribute after the schema cache reloads. Note that although this value is an OID, it's reserved within AD for indicating automatic MAPI ID generation, much like automatic link ID generation discussed earlier.
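
As a small sketch of requesting an automatic MAPI ID, the Python below adds the reserved value to an attribute definition and renders it as LDIF. The helper names, the attribute name, its OID, and the schema DN are hypothetical; the only values taken from the text are the reserved OID and the employee classification example.

```python
AUTO_MAPI_ID = "1.2.840.113556.1.2.49"  # reserved auto MAPI ID value from the text


def with_auto_mapi_id(attribute: dict[str, str]) -> dict[str, str]:
    """Return a copy of the attribute definition that requests an auto MAPI ID."""
    current = attribute.get("mAPIID")
    if current is not None and current != AUTO_MAPI_ID:
        raise ValueError(f"attribute already carries static MAPI ID {current}")
    return {**attribute, "mAPIID": AUTO_MAPI_ID}


def to_ldif(dn: str, attribute: dict[str, str]) -> str:
    """Render the definition as a single LDIF add record."""
    lines = [f"dn: {dn}", "changetype: add"]
    lines += [f"{key}: {value}" for key, value in attribute.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical attribute for the employee classification example.
    definition = {
        "objectClass": "attributeSchema",
        "lDAPDisplayName": "bdcLLC-EmployeeClassification",
        "attributeID": "1.3.6.1.4.1.35686.1.2.20",
        "attributeSyntax": "2.5.5.12",  # Unicode string
        "oMSyntax": "64",
        "isSingleValued": "TRUE",
    }
    dn = ("CN=bdcLLC-EmployeeClassification,"
          "CN=Schema,CN=Configuration,DC=example,DC=com")
    print(to_ldif(dn, with_auto_mapi_id(definition)))
```

Rejecting a definition that already carries a static MAPI ID is a design choice for this sketch: it forces a deliberate decision rather than silently replacing a value someone chose on purpose.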

To summarize, you must consider three crucial inputs when planning a schema extension. The first is the name of the class or attribute; the second is the unique prefix that you assign to all of your classes and attributes; the third is the OID. You need to request a unique branch of OIDs from IANA to generate your OID. If you’re going to create a linked attribute pair, you need a unique pair of link IDs. And if you’re going to surface your attribute in the Exchange GAL, you need to use a unique MAPI ID. In the case of both link IDs and MAPI IDs, using the automatic generation process inside AD is much better than using static values.

Implementation Planning

When implementing a custom schema extension or extending your schema with a vendor’s attributes and classes, you need to take some basic planning steps to protect the integrity of your AD forest. The first step is testing your schema extension.

If you’re creating a custom schema extension, use a disposable development environment to create your schema extension. AD LDS is available as a free download for Windows XP and Windows 7 workstations. You can create an AD LDS instance on your workstation, design your schema extension in an isolated environment, and then export that extension for import in a test AD forest. AD LDS’s schema is compatible with AD so you can use LDIFDE for export. After you develop your schema extension, you can import it in your test AD forest and ensure the import is successful and that no key applications are affected. With regard to AD, you should confirm that the import succeeded and that replication continues to succeed within your test environment.

If you choose to test the schema extension with a test AD forest, it should have a schema that matches the production forest so that your testing is comprehensive. You can use the AD Schema Analyzer tool (included with AD LDS) to identify schema differences between two AD forests. The TechNet article, "Export, Compare, and Synchronize Active Directory Schemas" (http://technet.microsoft.com/en-us/magazine/2009.04.schema.aspx) discusses how to import and export schema extensions, as well as how to use the AD Schema Analyzer tool. Note that there may be some differences depending on service packs and versions of Windows when you compare schemas, such as differences in attribute indexing and tombstone preservation.

In the case of schema extensions you didn’t create (such as those bundled with a commercial application), you need to ensure that there’s nothing suspect in the changes the vendor is planning to make. All the inputs we discussed in the previous section are critical to examine, as well as a few other things. The following list details the key variables you need to check:

Delivered as an LDIF file (or series of LDIF files)
Properly prefixed attributes
Registered OIDs
Registered/automatically generated link IDs
Automatically generated MAPI IDs

LDIF files are an industry standard; all schema extensions you receive should be delivered in this format. It’s permissible for applications to provide a custom import mechanism rather than requiring you to use LDIFDE to import the schema extension. But if the extension isn't delivered in this format, you should question the validity of it, as well as the practices of the vendor that created it. Figure 3 shows a sample LDIF entry that creates an attribute in the AD schema for storing a user’s shoe size. You should note the following in this sample schema extension:

The attribute is prefixed with the name of the vendor’s company (Brian Desmond Consulting, LLC: bdcllc)
The attribute has a unique OID issued under a Private Enterprise Number registered to the vendor
The attribute is indexed (searchFlags: 1) and is available in the Global Catalog (isMemberOfPartialAttributeSet: TRUE)

You also need to ensure that an attribute’s availability in the Global Catalog (Partial Attribute Set—PAS) is appropriate and that the indexes created on an attribute are appropriate if the attribute is going to be used in LDAP search filters. Also, it’s wise to make sure that the data to be stored in the attribute is sensible for AD in the context of the restrictions and recommendations discussed earlier.
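
Parts of this checklist can be automated. The sketch below reads LDIF text and flags entries that deserve a closer look: a missing prefix, an OID outside the vendor's registered branch, a static link ID, a static MAPI ID, indexing, and Global Catalog replication. It is a deliberately naive reader (no line folding, no base64 decoding, Unix line endings, last value wins for repeated keys), and the function names and sample entry are assumptions for illustration rather than the contents of Figure 3.

```python
AUTO_MAPI_ID = "1.2.840.113556.1.2.49"  # reserved auto MAPI ID value from the text


def parse_ldif(text: str) -> list[dict[str, str]]:
    """Split LDIF into records of lower-cased key -> value (naive parser)."""
    entries = []
    for block in text.strip().split("\n\n"):
        entry = {}
        for line in block.splitlines():
            if line.startswith("#") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            entry[key.strip().lower()] = value.lstrip(":").strip()
        if entry:
            entries.append(entry)
    return entries


def audit(text: str, prefix: str, oid_root: str) -> list[str]:
    """Return human-readable findings for schema entries in the LDIF text."""
    findings = []
    for e in parse_ldif(text):
        if e.get("objectclass") not in ("attributeSchema", "classSchema"):
            continue
        name = e.get("ldapdisplayname", e.get("dn", "?"))
        oid = e.get("attributeid") or e.get("governsid") or ""
        if not name.lower().startswith(prefix.lower()):
            findings.append(f"{name}: missing prefix {prefix}")
        if not oid.startswith(oid_root + "."):
            findings.append(f"{name}: OID {oid} is outside {oid_root}")
        link = e.get("linkid")
        if link and link.isdigit():
            findings.append(f"{name}: static link ID {link}, confirm it is registered")
        mapi = e.get("mapiid")
        if mapi and mapi != AUTO_MAPI_ID:
            findings.append(f"{name}: static MAPI ID {mapi}")
        if e.get("searchflags", "0") != "0":
            findings.append(f"{name}: searchFlags {e['searchflags']}, review indexing")
        if e.get("ismemberofpartialattributeset", "").upper() == "TRUE":
            findings.append(f"{name}: in the Global Catalog, review the need")
    return findings


SAMPLE = """\
dn: CN=bdcllc-ShoeSize,CN=Schema,CN=Configuration,DC=example,DC=com
changetype: add
objectClass: attributeSchema
lDAPDisplayName: bdcllcShoeSize
attributeID: 1.3.6.1.4.1.35686.1.2.2
attributeSyntax: 2.5.5.9
oMSyntax: 2
isSingleValued: TRUE
searchFlags: 1
isMemberOfPartialAttributeSet: TRUE
"""

if __name__ == "__main__":
    for finding in audit(SAMPLE, "bdcllc", "1.3.6.1.4.1.35686"):
        print(finding)
```

A finding is a prompt for a conversation with the vendor, not proof of a problem; an indexed attribute in the Global Catalog can be entirely appropriate if the application searches on it forest-wide.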

After you test your schema extension and you're ready to implement it in production, you want to plan for an appropriate time to do this. In general, it’s perfectly feasible to make this change during business hours. You should expect some measurable increased CPU utilization on your schema master, as well as some negligible increased CPU utilization on DCs as they replicate the change. In large environments, you might also see transient suspensions of replication between DCs for four- to six-hour periods if you add attributes to the Partial Attribute Set (PAS). These suspensions will come with errors indicative of lingering object problems, but you can often ignore them and they will go away. If DCs remain quarantined from replication for an extended period of time, you should begin troubleshooting.

Following Through

Extending your AD schema isn't dangerous or something to be feared, if you take some basic steps. When planning new schema extensions, as well as evaluating custom attributes and classes from third-party vendors, look at the identifying information unique to each class or attribute and make sure that it's truly globally unique.

After you evaluate the integrity of the proposed extension, import it to a representative test environment and ensure that the test environment and key applications continue to function. Then you can import the schema extension into your production environment and begin taking advantage of it.
