Program Classes Defensively

asp:feature

LANGUAGES: C#

TECHNOLOGIES: ASP.NET | Debug Class | Exception Handling | OO Design

Testing techniques save time and frustration when debugging your code.

By Bill Wagner

When you write software, you make assumptions. The key to writing robust software is to test those assumptions in your code and verify them. When you write the software for a class, you should strive to have your code notify the caller as quickly as possible if anything is wrong; this helps find the bugs as quickly as possible.

You want to test three key assumptions. First, you test class invariants - facts that are always true for an object of a given class. Second, you test function preconditions - facts that are always true when a function starts. And finally, you test function post-conditions - facts that are always true when a function exits. By testing these three sets of assumptions explicitly, you save yourself lots of time in testing and debugging your code.

Test for Class Invariants

When you test for class invariants, you're looking for some fact, or set of facts, that's always true for an object. Imagine a drawing program that stores the bounding rectangle for every shape in the drawing. The shape class expects that the bounding rectangle always exists and has a non-zero area. You write the class invariant as a function like this:

    [Conditional ("DEBUG")]
    private void ClassInvariant ()
    {
        Debug.Assert (area.Width * area.Height > 0);
    }

The Debug class is in the System.Diagnostics namespace. The Assert method stops the program if the expression being tested is false. All the methods of the Debug class are compiled into your executable in debug builds only. These assertions do not exist in release builds, so you can place them anywhere in your code without affecting the performance of your release executables. Furthermore, I tag the ClassInvariant method with the Conditional attribute, which means calls to this function are compiled only into debug builds; in release builds, the calls to ClassInvariant do not even exist. Whenever you write a class that supports instance methods, write a class invariant function and call it from all your public methods and properties. That way, you can determine immediately whether the state of your object has become invalid (see Figure 1).

Figure 1. In this general flowchart for your public functions, notice all the extra blocks for testing your assertions. In each public function, verify that all your assumptions are true. If any of your assumptions prove false, stop the program immediately and determine the invalid states as quickly as possible.
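
A minimal sketch of that pattern might look like the following. The Shape class, its width and height fields, and its Scale method are hypothetical names chosen for illustration; they are not taken from the sample program. The invariant is checked on entry to and exit from each public member.

```csharp
using System;
using System.Diagnostics;

// Hypothetical shape class; all names here are illustrative only.
public class Shape
{
    private double width;
    private double height;

    public Shape (double width, double height)
    {
        this.width = width;
        this.height = height;
        ClassInvariant ();
    }

    public double Area
    {
        get
        {
            ClassInvariant ();
            return width * height;
        }
    }

    // Scales the bounding rectangle by a positive factor.
    public void Scale (double factor)
    {
        ClassInvariant ();   // state is valid on entry
        Debug.Assert (factor > 0, "Scale factor must be positive");
        width *= factor;
        height *= factor;
        ClassInvariant ();   // state is still valid on exit
    }

    // Calls to this method are compiled only when DEBUG is defined.
    [Conditional ("DEBUG")]
    private void ClassInvariant ()
    {
        Debug.Assert (width * height > 0,
            "Bounding rectangle must have a non-zero area");
    }
}
```

Because every public member starts with the same check, a corrupted object is reported by the first call that touches it rather than by some later, unrelated failure.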

I recently wrote a small console program to accomplish one specific task - converting an IIS log into an XML file. The program was laden with assumptions. I'll walk through parts of the program, show you the assumptions made, and show you how to ensure the classes handle any upcoming cases where those assumptions prove false. Using these techniques, you enable other programmers to extend the class for other tasks more easily.

The input file, an IIS log file, is a structured text file:

#Software: Microsoft Internet Information Services 5.1

#Version: 1.0

#Date: 2002-07-02 01:05:50

#Fields: time c-ip cs-method cs-uri-stem sc-status

01:05:50 127.0.0.1 DEBUG /VacationPlanner/index.aspx 200

01:05:58 127.0.0.1 GET /VacationPlanner/index.aspx 401

The first four lines show information about the Web server software, the date of the log, and the fields in the log file. Every following line shows one request. My sample turns that format into an XML file (see Figure 2).

    <WebLog WebServer="Microsoft Internet Information Services 5.1"
        Version="1.0" Date="2002-07-02 01:05:50">
        01:05:50
        127.0.0.1
        DEBUG
        /VacationPlanner/index.aspx
        200

        01:05:58
        127.0.0.1
        GET
        /VacationPlanner/index.aspx
        401
    </WebLog>

Figure 2. XML files are easier to process than IIS log files. This small program turns IIS log files into XML files so you can process them using standard tools.

The program's main method simply parses the arguments, which might contain an input file and an output file. Users can put almost anything imaginable on a command line, so that code doesn't assume anything. It tests whether an input file actually exists and whether the output file can be created. The program exits if the user gives it any invalid arguments.
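
As a rough sketch of that kind of argument checking, the hypothetical helper below validates the command line without trusting it. The class and method names are assumptions, as is the requirement of exactly two arguments (input path, then output path); the sample program's actual argument handling is not shown in this article.

```csharp
using System;
using System.IO;

// Hypothetical helper; not part of the article's sample program.
public static class ArgumentChecker
{
    // Returns true only when both paths look usable.
    // Bad user input produces false, never an exception.
    public static bool TryGetFiles (string[] args,
        out string inputPath, out string outputPath)
    {
        inputPath = null;
        outputPath = null;

        // Assumption: exactly two arguments are required.
        if (args == null || args.Length != 2)
            return false;

        // File.Exists returns false for missing or malformed paths.
        if (!File.Exists (args[0]))
            return false;

        string fullOutput;
        try
        {
            fullOutput = Path.GetFullPath (args[1]);
        }
        catch (Exception)
        {
            return false;   // malformed output path
        }

        // The output file can be created only in an existing directory.
        string directory = Path.GetDirectoryName (fullOutput);
        if (directory == null || !Directory.Exists (directory))
            return false;

        inputPath = args[0];
        outputPath = fullOutput;
        return true;
    }
}
```

A Main method built on this helper would print a usage message and exit when TryGetFiles returns false.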

Test Preconditions

As programmers, we're accustomed to being careful when we deal with user input, so we test every possible error condition where the user is involved. When we write methods called by other programmers, however, we're far more trusting. We think programmers are like us, so they'll get it right. When we're in a hurry, we are even more sure they'll get it right. This brings you to the LogProcessor class, which reads the input file and writes the corresponding output file. This class has one public, static method:

    public static int ConvertLog (TextReader inStream,
        XmlWriter oStream)

This method reads the input file, which is assumed to be an IIS log, and writes the output XML file. Many assumptions have been made about this function's parameters. The preconditions that must be satisfied for this function to work properly are that the input stream, inStream, is not null and is attached to an IIS log file, and that the output stream, oStream, is not null. Thankfully, that's not a lot of assumptions. You can test the first and third assumptions easily:

    Debug.Assert (inStream != null,
        "Input stream must not be null");
    Debug.Assert (oStream != null,
        "Output stream must not be null");

The Assert statement is appropriate because you expect the caller to construct both the input and output streams before calling this method; not doing so is an error. Also, in release builds, the runtime throws an exception (System.NullReferenceException, to be exact) if either of these conditions is not met. The addition of the Assert statement shows other programmers what you expect of them, and it helps catch errors earlier in the development cycle. Figure 3 shows the flowchart from Figure 1, with specific tests for this function.

Figure 3. In this specific flowchart for the ConvertLog function, notice that all the extra blocks are making tests that help you know quickly when someone calls your code with invalid or incorrect parameters.
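
One possible variation, not used in the sample program, pairs each assertion with an explicit ArgumentNullException so that release builds fail with the parameter's name instead of a later NullReferenceException. The class and method names below are hypothetical; this is a sketch of the idea rather than a replacement for the article's code.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml;

// Hypothetical helper that isolates the precondition checks.
public static class PreconditionSketch
{
    public static void CheckStreams (TextReader inStream,
        XmlWriter oStream)
    {
        // Debug builds: stop at once and tell the caller what you expect.
        Debug.Assert (inStream != null, "Input stream must not be null");
        Debug.Assert (oStream != null, "Output stream must not be null");

        // All builds: report the offending parameter by name.
        if (inStream == null)
            throw new ArgumentNullException ("inStream");
        if (oStream == null)
            throw new ArgumentNullException ("oStream");
    }
}
```

The trade-off is a small runtime cost in release builds in exchange for a clearer failure at the point of the call.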

The middle assumption - that inStream is attached to an IIS log file - forces you to think harder, both about how to test it and how to report it. You expect the input file to be an IIS log file, but what if it isn't? And, how can you tell? Here is the original code that read and processed the first line of the file:

    // The first line should say what Web Server is running.
    string WebServerName = inStream.ReadLine ();
    // Everything after "#Software:" should be written:
    string tag = WebServerName.Replace ("#Software: ", "");
    oStream.WriteStartElement ("WebLog");
    oStream.WriteAttributeString ("WebServer", tag);

This code fragment happily assumes a valid log file. You can easily see how to test the assumption here: Ensure the line read from the input stream actually starts with the "#Software:" string. But how should you handle this error condition? Is an Assert statement the correct answer? Probably not. It would be difficult for the calling procedure to make sure a file truly is an IIS log file, at least without reading it. So this is not a programmer error in the same sense that passing a null stream is. Your other two choices for handling the error are an exception or some kind of an error-return code. Try to reserve exceptions for runtime conditions you can't handle gracefully. Because that's not the case here, simply return an error code:

    string WebServerName = inStream.ReadLine ();
    // ReadLine returns null at the end of the file, so test for that too.
    if (null == WebServerName ||
        false == WebServerName.StartsWith ("#Software: "))
        return 1;

Go on to validate the file's other header lines in the same way. This function is static, so few preconditions exist. Instance methods in a class often have many more preconditions. Here's a list to go over when you're thinking about the preconditions for a function:

What does it mean for the object to be valid? What resources - files, member objects, properties - should have been allocated already before this method gets called? Testing your class invariant should be enough to tell.
Does this object rely on any other objects to perform its work? What state do those objects need to have for this method to work?
What must be true of any input parameters? (Previously, you checked the input and output streams and ensured the input stream is an IIS log file.) What about output parameters?
Is there any event that must not have happened before this method gets called? (This precondition is common with classes that support the IDisposable interface.)

If you look at all these cases, you should have covered any preconditions completely.
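
The last item in that list, an event that must not have happened yet, might be sketched as follows. The LogSource class and its members are hypothetical names used only for illustration; the precondition is that Dispose has not been called before NextLine.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical wrapper around a reader; names are illustrative only.
public class LogSource : IDisposable
{
    private TextReader reader;
    private bool disposed;

    public LogSource (TextReader reader)
    {
        Debug.Assert (reader != null, "Reader must not be null");
        this.reader = reader;
    }

    public string NextLine ()
    {
        // Precondition: Dispose must not have been called yet.
        if (disposed)
            throw new ObjectDisposedException ("LogSource");
        return reader.ReadLine ();
    }

    public void Dispose ()
    {
        if (!disposed)
        {
            reader.Dispose ();
            disposed = true;
        }
    }
}
```

Calling NextLine after Dispose fails immediately with ObjectDisposedException, which points the caller at the real mistake instead of at whatever the closed reader would have done.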

Test Post-Conditions

Post-conditions describe the changes a function makes to the system. Post-conditions involve changes to out or in/out parameters, changes to the object whose methods are being called, or changes to global objects. Basically, testing post-conditions is testing that a function did the work it promised to do.

This function has two post-conditions: The input file should have been read completely, and the output file should have been written. Testing these two conditions is simple:

    Debug.Assert (null == inStream.ReadLine (),
        "Input file not completely read!");
    Debug.Assert (oStream.WriteState == WriteState.Closed,
        "XmlWriter still open");

When I show these techniques in seminars, I get predictable comments. Aren't these tests simply showing the obvious? In some sense, they are, which is why they're important. If something that should obviously be true turns out to be false, programs fail. That's also why these tests are in the form of Assert statements. They should always be true, so you don't want to spend time testing them in a production environment. If some other member of the team violates one of your assumptions, however, you need to find out quickly. What you do varies from case to case. Maybe the assumption needs to be rethought, or maybe the caller needs to rework the code so it doesn't break your assumptions.

I've mentioned returning errors using error codes or using exceptions. Exceptions are for more serious problems. Suppose you're walking down the street and you see someone drop their keys. You might say, "Excuse me, you dropped your keys." A bit farther up the street, you see someone not paying attention and walking into the path of a speeding bus. You might grab them by the arm and forcefully pull them out of the way of the oncoming traffic. That illustrates the difference between return codes and exceptions. The first pedestrian can choose to ignore your little aside about the keys, but the second can't ignore you grabbing them and pulling them away from the bus. Exceptions can't be ignored, and if you try to ignore exceptions, the runtime terminates your program.

You choose exceptions for the same reason you chose the different response in my analogy: The consequences of dropping your keys are considerably less severe than the consequences of stepping in front of a bus. One reason to choose exceptions is to report errors that can't be ignored - catastrophic errors that completely stop your application from continuing in a normal way.

The second property you should associate with throwing an exception is that an exception should not be used for error conditions that are reasonably easy to anticipate. That's why I chose a return code in this article's sample application when the input file is not an IIS log file - a user might easily pass some other file to the application. I can't fix that problem, but I felt it was far too likely an occurrence to use exceptions.
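
To illustrate how a caller might act on such return codes, here is a small sketch. The ConvertResult enum and the ResultReporter class are hypothetical; the values 1 through 4 mirror the codes returned for the four header lines, and treating 0 as success is an assumption.

```csharp
using System;

// Hypothetical names for the integer return codes.
// Assumption: 0 means the conversion succeeded.
public enum ConvertResult
{
    Success = 0,
    NoSoftwareLine = 1,
    NoVersionLine = 2,
    NoDateLine = 3,
    NoFieldsLine = 4
}

public static class ResultReporter
{
    // Translates a return code into a message for the user.
    public static string Describe (int code)
    {
        switch ((ConvertResult) code)
        {
            case ConvertResult.Success:
                return "Log converted.";
            case ConvertResult.NoSoftwareLine:
                return "Not an IIS log: the #Software line is missing.";
            case ConvertResult.NoVersionLine:
                return "Not an IIS log: the #Version line is missing.";
            case ConvertResult.NoDateLine:
                return "Not an IIS log: the #Date line is missing.";
            case ConvertResult.NoFieldsLine:
                return "Not an IIS log: the #Fields line is missing.";
            default:
                return "Unknown return code " + code + ".";
        }
    }
}
```

The caller stays in control: it can print the message and continue with the next file, which is exactly the kind of recovery a return code permits.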

Exceptions Happen

Like it or not, you now need to write exception-safe code. At one time, exceptions were regarded as so difficult to master that many coding shops made a coding standard that expressly forbade their use. This was an easier practice in the days when many C++ compilers did not support exceptions well. But you can't live like that anymore. The .NET Framework throws many different exceptions in many different methods. The runtime itself also generates exceptions, such as the System.NullReferenceException. Like it or not, you need to structure your code to handle exceptions properly. The C++ community, with its many years working with exceptions, has developed some useful guidelines in this area. (For more information, read Exceptional C++, by Herb Sutter.) Your goal in a function is simple: Either the function should do all its work and exit normally, or the function should do none of its work and throw an exception. This strategy gives your callers the most options when they catch whatever exceptions you throw. Look again at the first few lines of the function that processes the log file (see Figure 4).

    // Read the first four lines. ReadLine returns null at the end
    // of the file, so each test also rejects a missing line.

    // The first line should say what Web Server is running.
    string WebServerName = inStream.ReadLine (); // (1)
    if (null == WebServerName ||
        false == WebServerName.StartsWith ("#Software: "))
        return 1;

    // The second line is the version.
    string ver = inStream.ReadLine ();
    if (null == ver || false == ver.StartsWith ("#Version: "))
        return 2;

    // The third line gives the date:
    string date = inStream.ReadLine ();
    if (null == date || false == date.StartsWith ("#Date: "))
        return 3;

    // The fourth line gives all the field names.
    string fields = inStream.ReadLine ();
    if (null == fields || false == fields.StartsWith ("#Fields: "))
        return 4;

    // Everything after "#Software:" should be written:
    oStream.WriteStartDocument (true); // (2)
    string tag = WebServerName.Replace ("#Software: ", "");
    oStream.WriteStartElement ("WebLog");
    oStream.WriteAttributeString ("WebServer", tag);

Figure 4. This code checks the first few lines of the IIS log for errors. Note the structure of the == comparison. By putting the constant value - false - on the left side of the comparison, the compiler catches the common mistake of using only one equals sign.

Only two lines here can throw exceptions, and they're noted with (1) and (2) in the comments of Figure 4. Both of those lines execute before any of the real work gets done in the function. By putting the code in this order, either all or none of the work gets done. Before any of the output file gets written, you have checked the input file as best you can to determine if it is really an IIS log file, and the input and output streams have been checked against null. You almost always can accomplish the same goal in your programs by ordering the logic in a function so you perform any tasks that might cause exceptions first, then continue with the rest of the function's code.

This ordering is extremely important for clients of your class. If you throw exceptions anywhere in your code at any time, all a programmer using your class knows is something went wrong. The current program state is indeterminate when your code throws an exception. This greatly limits the client's ability to clean up the mess. In this simple program, you can see that if an exception gets thrown, the output stream did not get written to at all. Effectively, the function has no output.

This principle is even more important in a more complicated function. Suppose you have a method that writes several records to multiple database tables. What if that function exits with an exception? Is the database connection open? Did any of the records get written? Unless you know what did or did not happen when an exception gets generated, you can't possibly clean up after it. You might well have left the system in an invalid state, and because you wrote some records to the database, the persistent storage is now invalid. This is why you should always try to structure code so anything that might throw exceptions does so before you modify the system state. If necessary, you can use temporary objects to do your scratch work.
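
The idea of doing scratch work in temporary objects can be sketched as follows. The RequestLog class and its AddStatusCodes method are hypothetical and stand in for any object whose state must stay valid: all parsing, which may throw, happens on a temporary list, and the object is modified only afterward.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class; names are illustrative only.
public class RequestLog
{
    private List<int> statusCodes = new List<int> ();

    public int Count
    {
        get { return statusCodes.Count; }
    }

    // Either every value is added or none is.
    public void AddStatusCodes (string[] rawValues)
    {
        // Scratch work: int.Parse throws FormatException on bad input,
        // but only the temporary list is affected.
        List<int> parsed = new List<int> (rawValues.Length);
        foreach (string raw in rawValues)
            parsed.Add (int.Parse (raw));

        // Commit step: reached only if every value parsed.
        statusCodes.AddRange (parsed);
    }
}
```

If the second of two values is malformed, the caller catches FormatException and finds the object exactly as it was before the call, so there is nothing to clean up.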

The techniques I've described are all built around three simple principles: First, discover problems as early as you can, both in the development cycle, and at run time. Second, if your code discovers any errors, report as much information as possible about the error to the caller (or to the tester in a test environment). Third, if your functions exit by throwing exceptions, try to ensure you have not modified the system state.

The sample code in this article is available for download.

Bill Wagner is SRT Solutions' Windows technology expert (http://www.SRTSolutions.com). He is the author of C# Core Language Little Black Book (The Coriolis Group), an advanced reference for C# developers. Bill has taught developers and led the design for many projects in his 16 years of software development. E-mail him at mailto:[email protected].

Tell us what you think! Please send any comments about this article to [email protected]. Please include the article title and author.

Rules to Remember

Remember these rules when you write a class (see Figure A).

Condition	Program Behavior
Invalid object state	Failed assertion
Invalid input parameter	Failed assertion; throw exception or return error code
Invalid preconditions	Failed assertion; throw exception
Invalid post-conditions	Failed assertions; throw exception
Invalid object state (post process)	Failed assertions

Figure A. You should check states according to this table, which describes what your program should do when it encounters these states. Some options are on the right-hand side, but if you look for these five possibilities in your functions, you'll catch many programming errors early, quickly, and easily.

Class invariants are statements that are always true from the time an object has been created until it is destroyed. You should check the class invariants at the beginning of every public instance function.

Precondition and post-condition checks help find faulty assumptions (yours and those of a client class). You should check these assumptions using Assert statements at the beginning and end of every public function.

Exceptions express serious errors that cannot be handled easily locally. You should throw your own exceptions when you can't possibly handle an error locally and when you can't anticipate the consequences of an error.

Finally, writing exception-safe code means ensuring your functions preserve program state in the face of exceptions. You should strive to make sure that no functions change any program state when they exit with an exception.
