I spent quite some time trying to pick one of many possible titles for this column. Candidates included:
- SharePoint User Profile Synch: 14 Days of My Life That I Will Never Get Back
- Service from the Dark Side
- Get Rich Quick: Get Paid Hourly for UPS Troubleshooting
- CSI: UPS
Then, ironically (as incorrectly as ironically can be used), it happened… The User Profile Synchronization Service (UPS) took a bite out of the entire community. If you haven’t heard, this is important: DO NOT INSTALL THE OCTOBER 2010 CUMULATIVE UPDATE for SharePoint Server or Project Server.
The update kills the UPS. What is the UPS? Let’s look at that this week. Then, next time, we’ll look at the bigger problem: a service that exposes huge weaknesses not only in SharePoint but in every complex technology that we as IT pros will have to support in an increasingly distributed (read: Cloud) world.
The User Profile Service Application (UPA) is a service application that provides functionality related to user profiles, supporting My Sites, audiences, personalized navigation, and other personalization features. The UPA manages administration and configuration of the User Profile Service (generally not abbreviated as UPS).
The User Profile Service (one of the “Services on Server”) is the service instance running on one or more servers (where it is called a service machine instance) that does the dirty work on behalf of Web applications and other service applications in the farm.
These two are, generally, “good guys” in the story. They represent a new way of deploying services in a distributed architecture (the Service Application Framework), but in essence they are (significantly) improved versions of what we had in the SSP in MOSS 2007.
Lots of organizations have the UPA running with few issues. And one side note: even if you don’t want My Sites right now, I highly recommend deploying the UPA as it provides other very valuable functionality.
Then, in walks the USER PROFILE SYNCHRONIZATION SERVICE <cue ominous music> or UPS. The key word here is “synchronization.”
The job of the UPS is to populate profile attributes from external sources, including Active Directory and third-party data stores such as, say, your PeopleSoft implementation. The UPS is a service instance that is a wrapper around the Forefront Identity Manager (FIM) services: two Windows services that are installed (but initially set to “disabled”) on each server in the farm.
You use the UPS to provision FIM—to start it up—and after that, FIM is supposed to just work. The theory is that FIM will tap into your UPA configuration to determine the sources (e.g. Active Directory) and attributes to synchronize, and then it will do its work.
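If you want to see for yourself whether those two Windows services have actually come up, here is a minimal sketch in Python that reads the STATE line from the output of the standard Windows `sc query` command. The service names `FIMService` and `FIMSynchronizationService` are my assumption about how the two FIM services are registered; confirm them in the Services console on your own server before relying on this.

```python
import re
import subprocess

# Assumed Windows service names for the two FIM services; verify on your server.
FIM_SERVICES = ("FIMService", "FIMSynchronizationService")

# Matches a line such as "        STATE              : 4  RUNNING".
_STATE_RE = re.compile(r"^\s*STATE\s*:\s*\d+\s+(\w+)", re.MULTILINE)


def parse_sc_state(sc_output):
    """Return the state word (e.g. 'RUNNING') from `sc query` output, or None."""
    match = _STATE_RE.search(sc_output)
    return match.group(1) if match else None


def fim_service_states(services=FIM_SERVICES):
    """Run `sc query` for each service (Windows only) and map name -> state."""
    states = {}
    for name in services:
        result = subprocess.run(
            ["sc", "query", name], capture_output=True, text=True
        )
        # None means no STATE line was found, e.g. the service does not exist.
        states[name] = parse_sc_state(result.stdout)
    return states
```

Calling `fim_service_states()` on the server that runs UPS returns a small dictionary of service name to state. Before provisioning you would expect both to report `STOPPED`; if FIM really did "just work," both should eventually report `RUNNING`.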
You know you’re in trouble when you open the FIM client (C:\Program Files\Microsoft Office Servers\14.0\Synchronization Service\UIShell\miisclient.exe) and discover that it is a release candidate (RC1) version of FIM. FIM wasn’t at RTM when SharePoint was ready to seal its code base.
That leads to an interesting “current state”: you cannot apply FIM updates directly to your SharePoint servers, because there is a “fork” in the code… you have to update FIM by updating SharePoint. I understand enough about how Microsoft works as a company to see why they gave us an RC version of FIM, and it is actually forgivable, but it certainly doesn’t instill confidence, and given how flaky the entire thing is, you have to wonder…
By the way (in case you’re testing this in real time), you cannot open the FIM client until after you’ve started FIM. And that’s where the fun really begins.
To start the User Profile Synchronization successfully, you must sacrifice a chicken at the moment of the green flash of sunset on the night of a new moon.
Seriously, anyone who has successfully started the UPS, at least during one of the first attempts, should consider themselves very lucky. The UPS will start—I started it last week—if all the stars are aligned.
Most importantly, the farm service account (e.g. SPFarm) must be a member of the local Administrators group on the server on which you will run UPS. Be certain that you’ve restarted the server to ensure that every service running as SPFarm now uses a security token that includes membership in the local Administrators group.
Just like a user must log off and log on to be part of a new group, a service must restart. You could restart each and every service running as SPFarm, and then do an IISReset, but it’s easier and more certain to reboot that one server. And yes, you can only run UPS on one server in the farm. It is one of the few pieces of SharePoint that cannot be made redundant in SharePoint 2010.
After UPS has been provisioned, you can remove SPFarm from the local Administrators group and restart again; otherwise you will get nasty-grams from the Health Service telling you that a SharePoint account is an administrator and should not be.
There are several other requirements, as well, but on a clean farm with no weird tweaks to Windows, local Administrators membership of SPFarm is really all it takes before you can click Start.
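Since so much hinges on that one group membership, it is worth checking before you click Start. Here is a rough sketch in Python that parses the output of the standard `net localgroup Administrators` command and looks for the farm account. It assumes English-language command output, and it only sees direct members: if SPFarm is an administrator by way of a nested domain group, this check will not find it. The account name you pass in (SPFarm in this column) is whatever your farm account is really called.

```python
import subprocess


def parse_localgroup_members(net_output):
    """Extract member names from `net localgroup <group>` output."""
    members = []
    in_members = False
    for line in net_output.splitlines():
        text = line.strip()
        if text.startswith("---"):
            in_members = True  # the dashed rule precedes the member list
            continue
        if not in_members or not text:
            continue
        # Assumes English output; the trailer differs on localized Windows.
        if text.lower().startswith("the command completed"):
            break
        members.append(text)
    return members


def is_member(account, members):
    """Case-insensitive match on the full name or the part after the backslash."""
    wanted = account.lower()
    for member in members:
        name = member.lower()
        if name == wanted or name.split("\\")[-1] == wanted:
            return True
    return False


def farm_account_is_local_admin(account):
    """Windows only: query the local Administrators group via `net localgroup`."""
    result = subprocess.run(
        ["net", "localgroup", "Administrators"], capture_output=True, text=True
    )
    return is_member(account, parse_localgroup_members(result.stdout))
```

On the server that will run UPS, `farm_account_is_local_admin("SPFarm")` should return True before you start the service. Keep in mind that this only confirms the membership exists; the running services still need the reboot described above to pick it up in their security tokens.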
When you start the service, FIM has to tap into the UPA and some of the associated data, and then configure itself, and here’s where the fun really begins. SharePoint will try to provision FIM 15 times before it gives up. This can take 10-20 minutes or even longer. All kinds of things can and do go wrong during this process. “14 Days of My Life That I Will Never Get Back” refers to my September, during which I spent 14 days, at about 12-18 hours per day, trying to get the UPS to start FIM on a farm that I was using to write the SharePoint 2010 Training Kit for Microsoft. It never started, and next time I’ll tell you why and what it means for you.
Thankfully, your mileage may vary and you might find, as I have on numerous other farms, that UPS/FIM start up nicely. But as luck would have it, you must re-provision FIM each time you apply a cumulative update. When you run the SharePoint Products Configuration Wizard (psconfig) after installing a CU, it does not update the Sync database. Instead, that happens (hopefully) when you re-start the UPS and re-provision FIM.
The June and August CUs were pretty smooth. Then came the October CU. The first sign of trouble I saw was a series of tweets by my colleague Todd Klindt, who early on reported that the October CU had broken UPS.
Sure enough, last Friday, Microsoft pulled the October CU and recommended, unequivocally, that you do not install the October CU. If you are unlucky enough to have already done so, the SharePoint Team Blog has the details.
This is enough background and “critical information” to get you going this week. In Achilles Profile, Part II, in two weeks, I will share my experience and some of the experiences of my clients with UPS Hell.
Until then, be absolutely certain that you have visited and “Added to Favorites” these two invaluable blog posts by MVP and all-around SharePoint Megamind Spence Harbar:
Spence has done an extraordinary job of documenting everything that can be documented about UPS and UPA and related components. He keeps updating and refining these posts. They are the UPA/UPS Bible.
And, until my Part II, let me leave you with this: Don’t ever throw a nasty look at your SharePoint server. The UPS will stop, and you’ll never get it back!