It’s a killer job, but somebody has to do it. Someone has to drive the schedule and monitor accountability from all the contributors to the huge project of delivering the first service pack for the massively complex OS, Windows Server 2003. That somebody is Group Program Manager Clyde Rodriguez, a calm, soft-spoken, engaging guy, who has the clout to make hundreds of notoriously strong-minded developers deliver their code according to plan. I recently sat down with Clyde to discuss Windows 2003 Service Pack 1 (SP1). Here’s his behind-the-scenes description of what went on during development, how decisions were made, what role customer feedback played, and how the developers addressed customer problems.
KF: Clyde, how would you describe your role?
Clyde: I am the project manager for Windows 2003 SP1 and the x64 release—which are tied together. \[Editor's note: The x64 version of Windows 2003 SP1 will not run on 32-bit machines.\] My job evolves depending on the phase we’re in. It’s everything from defining the vision, to determining exactly what we want to deliver for the benefit of customers and for the evolution of the platform, to making sure the objectives are clear, all the way down to herding cats day to day, making sure we’re delivering on schedule. If we get responses or feedback from customers, that’s also incorporated into the original plan. We have metrics that we track daily.
KF: What are those metrics? What were the SP1 “ship criteria”—the conditions that had to be met before you allowed the product to ship?
Clyde: Every release candidate we ship has to meet release criteria that say it’s better than the previous milestone. In general, our internal testing group has to sign off, and our deployment partners sign off. We will not ship unless we meet our metrics and the objectives we set. We don’t move on to the next step until we've reached interim goals. I could categorize the ship criteria in terms of SP1’s three main pillars: improving security, reliability, and performance. For each, we have a way of quantifying how we’re doing.
Take security, for example: First, we have metrics that tell us to what extent we’ve addressed customer issues that are reported. Second, through the Windows XP SP2 process, we discovered some techniques for checking our source code for potential exposures. We applied these techniques extensively to the entire code base to ensure that we fixed things like buffer overruns. So, we’ve improved the internal engineering tools we use. Our compilers are much more adept at uncovering security issues early in the development process. We have tools that ensure that we don’t ship with known vulnerabilities. And third, we have a team dedicated to reviewing security.
For the reliability pillar, we have servers that we don't touch for weeks at a time. We take a build that we believe is good quality to “self-host” on (that’s a term we use internally, meaning we run our day-to-day production work on interim builds of the software). We let the build sit in production without us applying any updates and see how long it can run without any issue. Self-hosting helps us compile statistics on reliability.
In the area of performance, we have benchmark results validating that SP1 makes dramatic improvements. Some examples of how we have improved performance with SP1 include Secure Sockets Layer (SSL) and Microsoft IIS. We've improved the performance of SSL by 50 percent, which means that with SP1, an SSL server has greater capacity for serving secure pages. IIS with SP1 starts up 80 percent faster when there are 50,000 sites to host; from the system point of view, 80 percent less startup time is needed to get all the sites running. This \[performance improvement\] is useful in the lab because it reduces the time it takes to run an experiment. The net effect is that customers can increase their server’s capacity under many workloads simply by upgrading to SP1, without having to add expensive hardware.
KF: You mentioned customer feedback. Where do you get it?
Clyde: We get that data from Customer Support Services (CSS, formerly Product Support Services—PSS), from Watson reports when users report problems to Microsoft, from our technical beta program, and from Microsoft IT, which is actually just one member of the Technology Adoption Program (TAP). Without that relationship, the quality of the product certainly would suffer. TAP participants are willing to take our interim beta releases and release candidates into production deployments. We, in turn, give them a high degree of interaction with our engineering team. I view TAP customers as an extension of our test organization. If they report an issue, we are on it—immediately. Every day we have a representative from the TAP program tell us how the customers are doing. If an issue affects a certain team in the room, their ears instantly perk up, and they follow up immediately. At each milestone, we ask our TAP customers for a vote: Do you believe this product is ready to ship? Does it meet your expectations?
KF: If the customer identifies a problem, it gets the team’s attention?
Clyde: Absolutely. In fact, that’s a great way to end a debate—when somebody says, "But customer so and so is reporting this issue and can’t go into production without it" (or they have a work stoppage, or they’re vulnerable in some way), we move quickly. The development process is very customer-centric from the start. It’s a process of continual refinement with their involvement at every step.
KF: You mentioned that Microsoft IT has to sign off before you can ship a product. How does that work?
Clyde: Our IT department is an early adopter of our products. In fact, in this role, they are now managed through the TAP project and are considered one of our deployment partners. We will not ship an interim release until they tell us it’s OK.
KF: Can you give an example of how customer feedback affects the development process?
Clyde: The weekend before the release of SP1 Release Candidate 1 (RC1) in December 2004, I happened to be in line at a movie theater waiting to buy a ticket—I don’t even remember what the movie was. I got a phone call from the gentleman who runs our MS IT relationship. He said, "They’ve hit a showstopper. We’ve got to jump all over this." I instantly called some of my partners on the project to look at the issue right then and there. It was a Saturday night, but we immediately called into action the experts in that area. The funny thing is I called five people, and two of them were in line at the same movie. One of them was inside the theater. I definitely felt bad for him because he was seeing a cartoon with his kids, but this incident captures the passion that everybody feels and the customer-centric focus that we have. We even had another conference call at midnight that night to make sure we’d ship on time and provide exactly what the customer needed. In the end, we were able to ship right on schedule, and it had a lot to do with that passion that everyone on the project feels about serving customers.
KF: You’ve mentioned the role that customer feedback plays during testing. What specific SP1 features or functionality came about as a result of customer feedback?
Clyde: One example is the Security Configuration Wizard (SCW). Security is a core tenet of the SP1 release, and the SCW is a good example of a feature that evolved based on customer feedback—it's one feature I can point to that had strong customer involvement. The City of Redmond has been a strong partner in testing it.
KF: What would you say to an IT person who is looking at SP1?
Clyde: We’ve done extensive work to help improve your day-to-day lives, and I want you to trust and understand that we live and breathe your world. We know that you face challenges, and we've taken your experiences to heart in developing a product that we believe will definitely improve your experience. I think you're seeing an evolution: there is much greater emphasis on making sure everything we do is tied back to a customer scenario and a customer benefit.