I got back from a meeting on the Microsoft campus only to discover that I could not get my laptop’s browser to see Google. I spun around to check another system and it too acted like Google was down. “Right… that’s going to happen.” I started to troubleshoot the problem.
After making sure the cats had not hidden one of their critters in the system rack (again), I reset the DSL modem but that didn’t fix anything. Its status lights were all on, so it thought it was seeing the web. I bypassed my router and still had no access to the web. Hummm. I looked in the back yard for that mob of Verizon techs that stand outside my house waiting for something to go wrong but, for once, no one was there. I tried to call Verizon business support but got a fast busy signal. Strange. I then tried another support number. Same deal. I had no land-line phone service so I turned on CNN. Perhaps someone had launched a Chevy Camaro into orbit and it had taken out a couple of communications satellites.
I then tried using my AT&T cell phone to call Verizon—ah, this worked. A recording said that there were “issues” in Redmond, Washington and if I really wanted to talk to someone to hold on the line—I did. I got a sweet and rather rattled little girl who sounded as if it was her first day on the job. She must have apologized a dozen times in ten minutes about how the Verizon systems were really slow. Eventually she found out that some knucklehead had sliced through a fiber cable that served 20,000 customers in Redmond. Not good. Prognosis: 36 hours to fix it. Sigh.
The lucky part of this adventure is that I was in the office about 45 minutes after the systems went down. If I had been out of town or asleep, I would not have known it. Since I could not work mail, make my reservations or download the latest Dilbert, I decided to see if I could write a program to periodically check the LAN and WAN for connectivity and sound an alarm when it failed and another when it came back up.
One thing led to another and I ended up with a pretty involved 150-line application that leverages a number of .NET classes and a Windows API class as well. When you’re served lemons, make lemonade. The application plays a recording when the web goes down and keeps playing it until someone notices. While this was somewhat of a learning process for me since I didn’t have access to the web, it was hard on Visual Studio as well. It really seems to bog down when it can’t see the web, even when you try to use local help.
This article walks through the network/WAN monitoring application that I wrote and illustrates several useful .NET Framework classes—some of which can be useful when writing an application that depends on remote access to the LAN or WAN. It also illustrates several “alerting” techniques to draw attention to the fact that something is wrong. Would I leave this running on my desktop to tell me when some confused backhoe driver has cut a cable? Probably not, but I would run it on a server that had a sound card and was not in some sealed room on the fourth floor. I guess the application could send an email message if the LAN failed but somehow I don’t think that would work as well. I also considered using a BSR/X10 signal to flash the porch light but I didn’t go that far.
This problem does raise a serious issue. How dependent is your organization on an external ISP? In my case I could use neither dialup nor the “copper” DSL connected to the office. My Exchange server was useless and my customers and co-workers were unable to communicate with me. I can see that I need to set up a backup mail forwarding server in the UK that would (might) not be subject to the same attack. In this case, I would be able to access mail from some other site not affected by the incident.
Testing for LAN Connectivity
Let’s walk through the application. Sure, it does not make much sense to try to ping a WAN server if the local area network is not available. That’s why I added code to test for LAN availability before I attempted to ping outside the domain. Incidentally, I add this code to SQL Server front-end applications that need to access LAN-based remote servers. Testing the LAN is really easy to implement: just use the IsAvailable property of the Network class (My.Computer.Network in Visual Basic). If this property changes state, or the NetworkAvailabilityChanged event fires, you know the state of the local area network has changed. See line 29 in the example code provided.
Sure, it makes sense to code the NetworkAvailabilityChanged event handler so you can notify the user immediately: any SQL Server connections are probably broken, as are off-system files, and WAN connectivity is likely unavailable too.
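As a minimal sketch of the LAN check described above, the console program below tests availability once and then listens for changes. It is an illustration under assumptions, not the article's listing: it uses the framework-level NetworkInterface.GetIsNetworkAvailable method and NetworkChange.NetworkAvailabilityChanged event from System.Net.NetworkInformation rather than My.Computer.Network, and the module name, handler name and messages are hypothetical.

```vbnet
Imports System
Imports System.Net.NetworkInformation

Module LanCheck
    ' Hypothetical handler: runs whenever the framework reports that
    ' overall network availability has flipped.
    Private Sub OnAvailabilityChanged(sender As Object, e As NetworkAvailabilityEventArgs)
        If e.IsAvailable Then
            Console.WriteLine("LAN is back up.")
        Else
            Console.WriteLine("LAN is down; skipping WAN checks.")
        End If
    End Sub

    Sub Main()
        ' Subscribe first so a change during startup is not missed.
        AddHandler NetworkChange.NetworkAvailabilityChanged, AddressOf OnAvailabilityChanged

        ' One-off check before attempting any ping outside the LAN.
        If NetworkInterface.GetIsNetworkAvailable() Then
            Console.WriteLine("LAN available; it is worth trying the WAN ping.")
        Else
            Console.WriteLine("LAN unavailable; no point pinging outside.")
        End If

        Console.WriteLine("Press Enter to quit.")
        Console.ReadLine()
    End Sub
End Module
```

In a Windows Forms application the same handler would typically update the UI, so it may need to marshal back to the UI thread before touching any controls.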
Some systems (especially laptops) automatically hunt around for replacement LAN connectivity so your system might burn considerable CPU cycles trying to restore a connection long before you’ll notice. Consider that it’s part of the Ethernet protocols to permit short-term disconnects so it might be some time before the system realizes that the LAN is down.
Testing for WEB Connectivity
While it’s possible to test the WAN, there are any number of failure modes as the WAN experiences short-term problems, heavy use or outright loss of service. No, there are no easy ways to determine if the LAN is bogged down by that guy down the hall downloading movies off of YouTube but there is a lot of information you can get back from the WAN.
The technique I used to test the WAN is (again) pretty simple: I used the Send method of the Ping class. The Network class exposes a Ping method but it only returns a Boolean, which might be enough information, but I wanted a bit more detail on what is wrong. The Ping class Send method returns a PingReply object that can provide this detail and show how badly (or well) the WAN is working. I was lucky; I was able to trap the rare exception that was thrown when I tried to ping during the cable outage. I chose to ping Google as it’s a site that responds quickly and is heavily backed up, which makes it a good bellwether for WAN availability.
Later, once the Verizon folks got the cable fixed, Ping began to return entirely different results when I tried to simulate the failure by disconnecting the DSL cable. For some reason, during the cable failure, Ping threw a System.Net.NetworkInformation.PingException whose inner exception reported that the URL could not be found.
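The two failure paths described above (a PingReply with a non-success status, and a PingException when the request cannot be sent at all) can be sketched as follows. This is a minimal sketch rather than the article's own routine: the TryPing helper and the module name are hypothetical, and the 5000ms timeout and www.google.com target are taken from the text.

```vbnet
Imports System
Imports System.Net.NetworkInformation

Module WanCheck
    ' Hypothetical helper: returns the reply, or Nothing when the ping
    ' could not be sent (for example, the host name cannot be resolved).
    Function TryPing(host As String, timeoutMs As Integer) As PingReply
        Using pinger As New Ping()
            Try
                Return pinger.Send(host, timeoutMs)
            Catch ex As PingException
                ' The useful detail is usually on the inner exception.
                Dim detail As String = If(ex.InnerException IsNot Nothing,
                                          ex.InnerException.Message,
                                          ex.Message)
                Console.WriteLine("Ping could not be sent: " & detail)
                Return Nothing
            End Try
        End Using
    End Function

    Sub Main()
        Dim reply As PingReply = TryPing("www.google.com", 5000)

        If reply Is Nothing Then
            Console.WriteLine("WAN check failed before any reply arrived.")
        ElseIf reply.Status = IPStatus.Success Then
            Console.WriteLine("WAN is up; round trip " & reply.RoundtripTime.ToString() & " ms.")
        Else
            Console.WriteLine("WAN problem: " & reply.Status.ToString())
        End If
    End Sub
End Module
```

Treating the exception and the status as separate cases matters because, as noted above, the same outage can surface either way depending on where the break is.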
The code shown in Figure 1 illustrates use of the Send method and capturing the PingReply object that was returned. This object exposes a Status property that returns one of a fairly comprehensive set of IPStatus values. These include (among others):
- DestinationNetworkUnreachable as well as Host, Protocol and Port being unreachable. These are typical of the errors you’ll get when the infrastructure is not working. For example, when DNS is not up and Windows can’t resolve the name (www.google.com) to an IP address.
- DestinationProhibited. This means contact with the destination is administratively prohibited, which you should not see when you ping www.google.com or some other public site.
- NoResources. This would be bad. It suggests there are “insufficient network resources” available.
- HardwareError. Again this is self-explanatory. Your NIC or network infrastructure is busted.
- TimedOut. This is the status I got when the DSL modem was simply powered down. You set a timeout (in milliseconds) when calling Send. My pings to Google took about 100ms, so if a reply does not return in 5000ms then there is something busted.
- BadRoute. This suggests that there is a more serious problem with the Internet or ISP (assuming you’re accessing www.google.com).
- BadDestination. No, this does not mean you were trying to access one of the naughty sites, but that the URL you provided was not correct. I got this when using HTTP://www.google.com instead of just www.google.com as the address.
- DestinationUnreachable. Again, I would suspect the Web before suspecting the target.
- Unknown. The Ping could not figure out what went wrong.
The RoundtripTime property also tells you how long (in milliseconds) the round trip to the chosen site took. In my case, it took about 100ms to reach Google.com. Your mileage might vary.
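One way to turn the status values listed above into something a user can act on is a simple Select Case over the IPStatus enumeration. The sketch below is illustrative only: the Describe function, the module name and the message wording are hypothetical, and the messages paraphrase the interpretations given in the list rather than any official description.

```vbnet
Imports System
Imports System.Net.NetworkInformation

Module PingStatusText
    ' Hypothetical helper: maps an IPStatus value to a short hint.
    Function Describe(status As IPStatus) As String
        Select Case status
            Case IPStatus.Success
                Return "WAN is reachable."
            Case IPStatus.TimedOut
                Return "No reply within the timeout; the modem, ISP or target may be down."
            Case IPStatus.DestinationNetworkUnreachable, IPStatus.DestinationHostUnreachable, IPStatus.DestinationPortUnreachable, IPStatus.DestinationUnreachable
                Return "Destination unreachable; suspect the local infrastructure or ISP first."
            Case IPStatus.HardwareError
                Return "Hardware error; check the NIC and local network gear."
            Case IPStatus.NoResources
                Return "Insufficient network resources."
            Case IPStatus.BadRoute
                Return "No valid route; suspect the ISP or the Internet itself."
            Case IPStatus.BadDestination
                Return "Bad destination; check the host name or address passed to Send."
            Case Else
                ' Covers Unknown and every status not called out above.
                Return "Ping failed with status " & status.ToString() & "."
        End Select
    End Function

    Sub Main()
        Console.WriteLine(Describe(IPStatus.TimedOut))
        Console.WriteLine(Describe(IPStatus.BadRoute))
        Console.WriteLine(Describe(IPStatus.Unknown))
    End Sub
End Module
```

Keeping the mapping in one function means the same text can feed the on-screen message, the log entry and the choice of which recording to play.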
The Ping routine (as shown in Figure 1) is called when an application timer fires—every so often. Yes, you’ll probably want to set this value to check for availability every few minutes to reduce the overhead.
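A timer-driven check along these lines might look like the sketch below. It is an assumption-laden illustration: the article's application is a Windows Forms program and most likely uses a form timer, whereas this console sketch uses System.Timers.Timer, and the two-minute interval and the CheckConnectivity placeholder are hypothetical.

```vbnet
Imports System
Imports System.Timers

Module PollingLoop
    ' Hypothetical placeholder for the LAN test and WAN ping.
    Private Sub CheckConnectivity(sender As Object, e As ElapsedEventArgs)
        Console.WriteLine("Checking connectivity at " & e.SignalTime.ToString("T"))
    End Sub

    Sub Main()
        ' Assumed interval: every two minutes keeps the overhead low.
        Dim interval As TimeSpan = TimeSpan.FromMinutes(2)

        Using pollTimer As New System.Timers.Timer(interval.TotalMilliseconds)
            AddHandler pollTimer.Elapsed, AddressOf CheckConnectivity
            pollTimer.AutoReset = True   ' fire repeatedly, not just once
            pollTimer.Start()

            Console.WriteLine("Monitoring. Press Enter to stop.")
            Console.ReadLine()
        End Using
    End Sub
End Module
```

System.Timers.Timer raises Elapsed on a thread-pool thread, so a check that takes longer than the interval can overlap with the next one; a longer interval or a guard flag avoids that.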
The ReportFailure routine (as shown in Figure 2) simply reports on the reason for the failure and starts notifying the user—turning on the bells and whistles. This code was more of a challenge to write. While I had no trouble recording voice WAV files to report the problem, I had a little trouble figuring out how to play these in .NET.
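The article does not say which playback API it settled on. One option that exists in the .NET Framework is the System.Media.SoundPlayer class, sketched below on Windows. The file name wan_down.wav and the module name are hypothetical, and the file must exist for playback to start.

```vbnet
Imports System
Imports System.Media

Module AlarmSound
    Sub Main()
        ' "wan_down.wav" is a hypothetical recording; substitute your own.
        Using player As New SoundPlayer("wan_down.wav")
            ' PlayLooping returns immediately and repeats the file
            ' on another thread until Stop is called.
            player.PlayLooping()

            Console.WriteLine("Alarm playing. Press Enter to acknowledge.")
            Console.ReadLine()

            player.Stop()
        End Using
    End Sub
End Module
```

Visual Basic also offers My.Computer.Audio.Play with AudioPlayMode.BackgroundLoop, which behaves similarly and may read more naturally in a Visual Basic project.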
Be sure to take a look at the routine I used to blink the program icon when the form is minimized (in the attached code). Initially, I was puzzled that the .NET Framework did not have any way to do this for me. Thanks to a fellow MVP’s (Rob Teixeira) suggestion, I was able to build an old-fashioned API call to the FlashWindowEx Windows function. I built this into a class that’s easy to use from Visual Basic .NET (or even C# with some recoding).
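For readers without the attached code, a wrapper around FlashWindowEx could be sketched roughly as follows. This is not the article's class: the WindowFlasher name and its two methods are hypothetical, while the FLASHWINFO layout and the FLASHW_ flag values follow the Windows API definitions.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Public NotInheritable Class WindowFlasher
    ' Mirrors the Win32 FLASHWINFO structure.
    <StructLayout(LayoutKind.Sequential)>
    Private Structure FLASHWINFO
        Public cbSize As UInteger
        Public hwnd As IntPtr
        Public dwFlags As UInteger
        Public uCount As UInteger
        Public dwTimeout As UInteger
    End Structure

    Private Const FLASHW_STOP As UInteger = 0UI    ' stop flashing
    Private Const FLASHW_ALL As UInteger = 3UI     ' caption and taskbar button
    Private Const FLASHW_TIMER As UInteger = 4UI   ' flash until FLASHW_STOP

    <DllImport("user32.dll")>
    Private Shared Function FlashWindowEx(ByRef pwfi As FLASHWINFO) As Boolean
    End Function

    Private Shared Function Build(handle As IntPtr, flags As UInteger) As FLASHWINFO
        Dim info As New FLASHWINFO()
        info.cbSize = CUInt(Marshal.SizeOf(GetType(FLASHWINFO)))
        info.hwnd = handle
        info.dwFlags = flags
        info.uCount = UInteger.MaxValue   ' effectively "keep going"
        info.dwTimeout = 0UI              ' 0 = default cursor blink rate
        Return info
    End Function

    ' Start flashing the window that owns the given handle.
    Public Shared Sub StartFlashing(handle As IntPtr)
        Dim info As FLASHWINFO = Build(handle, FLASHW_ALL Or FLASHW_TIMER)
        FlashWindowEx(info)
    End Sub

    ' Stop flashing and restore the normal caption and taskbar state.
    Public Shared Sub StopFlashing(handle As IntPtr)
        Dim info As FLASHWINFO = Build(handle, FLASHW_STOP)
        FlashWindowEx(info)
    End Sub
End Class
```

From a form, the calls would be WindowFlasher.StartFlashing(Me.Handle) when the failure is reported and WindowFlasher.StopFlashing(Me.Handle) once the user acknowledges it.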
The application also recorded the activity to a unique log file. Again, I found an interesting bug in Visual Studio and, while it has been reported, the folks at Microsoft have not (yet) been able to fix it. It seems that the IntelliSense feedback does not show the contents of a GUID variable and gets thoroughly confused. See Connect for details.
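The text does not show how the log file name is made unique. Since it mentions a GUID variable, one plausible approach is to build the name from Guid.NewGuid, as in the sketch below. The NetMonitor_ prefix, the helper names and the use of the temp folder are all hypothetical.

```vbnet
Imports System
Imports System.IO

Module ActivityLog
    ' Hypothetical helper: builds a log path that will not collide
    ' with earlier runs by embedding a fresh GUID in the file name.
    Function NewLogPath(folder As String) As String
        Dim id As Guid = Guid.NewGuid()
        Return Path.Combine(folder, "NetMonitor_" & id.ToString("N") & ".log")
    End Function

    ' Appends one timestamped line, creating the file on first use.
    Sub Append(logPath As String, message As String)
        File.AppendAllText(logPath, DateTime.Now.ToString("u") & "  " & message & Environment.NewLine)
    End Sub

    Sub Main()
        Dim logPath As String = NewLogPath(Path.GetTempPath())
        Append(logPath, "WAN down: TimedOut")
        Append(logPath, "WAN restored")
        Console.WriteLine("Logged to " & logPath)
    End Sub
End Module
```

A timestamp in the file name would also work and sorts more naturally; the GUID simply guarantees that two instances started in the same second never share a file.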
Clearing Up the Mystery
I hope this routine can help clear up a few of the mysteries of the Network and Ping classes. Since I do most of my work with SQL Server front-ends of various ilks, I find these classes especially valuable when trying to figure out why my ADO.NET connection won’t connect. And no, I don’t recommend that you connect to SQL Server over the WWW—not if you want to keep your data secure. However, if some two-day-on-the-job backhoe operator gets clumsy again, I’m ready. I hope you will be too.