The Emulex SCSI Port Driver Failover Test

Getting the Emulex SCSI Port 1.24 driver to work was a challenge. On my first attempt, the Emulex Configuration utility (elxcfg.exe), which you use to configure the port driver, hung whenever I tried to configure a LUN for an Availability Manager that didn't already have an assigned LUN. The only way to recover was to reboot. However, with a little experimentation, I found a solution to this problem.

This process takes several system reboots, so when you configure the port driver, be patient. First, check your zone mapping in McDATA's ED-5000 Enterprise Fibre Channel Director. You need two zones, and each zone needs to have one host bus adapter (HBA) and one Availability Manager as the zone's only members. This zone mapping configuration is standard for redundant server HBAs that Amdahl's A+LVS Redundant Disk Array Controller (RDAC) uses. When you install the port driver, select the Fabric—no automatic SCSI mapping option from the list that the installation routine displays. After you reboot the system to make the driver operational, start up the Emulex Configuration utility from the Start menu. You'll see a line near the top of the first screen that the Emulex Configuration utility displays for each installed Emulex controller on the system. The window's right-hand pane lists the options that you must individually set and apply for each Emulex controller. The top setting—Automap SCSI Devices—will be unchecked because you chose to install the driver to map LUNs, not SCSI devices. Select the Map LUNs and Automap LUNs options, then select Apply to save the setting. You select the same settings for the other Emulex controllers.
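The zoning rule above (two zones, each holding exactly one HBA and one Availability Manager) is easy to check on paper before you touch the director. The following is a minimal Python sketch of that check, not a McDATA tool: the `Member` type, the kind labels, and the zone and device names are all hypothetical, and the rule that a device may not appear in both zones is my assumption about what redundant paths imply rather than something the configuration utilities state.

```python
from dataclasses import dataclass

# Hypothetical device kinds; these labels are illustrative, not McDATA terminology.
HBA = "hba"
AVAILABILITY_MANAGER = "availability_manager"


@dataclass(frozen=True)
class Member:
    name: str  # illustrative label such as "hba-1"
    kind: str  # HBA or AVAILABILITY_MANAGER


def check_zones(zones):
    """Return a list of problems; an empty list means the plan matches the rule."""
    problems = []
    if len(zones) != 2:
        problems.append(f"expected 2 zones, found {len(zones)}")
    seen = {}
    for zone_name, members in zones.items():
        kinds = sorted(m.kind for m in members)
        # Each zone's only members must be one HBA and one Availability Manager.
        if kinds != [AVAILABILITY_MANAGER, HBA]:
            problems.append(
                f"{zone_name}: needs exactly one HBA and one Availability Manager"
            )
        for m in members:
            # Assumption: the two paths should not share a device.
            if m.name in seen:
                problems.append(
                    f"{m.name} appears in both {seen[m.name]} and {zone_name}"
                )
            seen[m.name] = zone_name
    return problems
```

For example, a plan with `zone_a` containing `hba-1` and `am-a` and `zone_b` containing `hba-2` and `am-b` returns an empty list, whereas a single zone that holds two HBAs returns two problems.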

After you reboot the system, the server will see all the LUNs that you created on the Amdahl LVS 4600 Storage System. Next, use the Configuration utility to configure each controller to expose only the LUNs that this server needs to access. Select a controller, then select the SCSI target that appears in the Configuration utility window's lower pane to access the LUN Map button. Selecting the LUN Map button displays a window that lists as many entries as the Maximum Number of LUNs parameter in the previous window specifies. Click the Disable LUN Mapping button, then click Done and Apply to remove all automatically mapped LUNs from the controller's list. However, the Configuration utility will still contain the LUN list. Select the LUN Map button again, then select Add to display a list of all the LUNs you created on the LVS 4600. Select the LUNs that you want this server to access (you must always select LUN 0), then click Done and Apply. Select the next controller and complete the same series of steps. Your last step is to uncheck the Automatic LUN Mapping option, click Apply for both controllers, and reboot. Only the LUNs that you selected will be available to the server.
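Because each mistake in the LUN map costs another reboot, it can help to write down the intended selection for each controller and check it first. The Python sketch below is only a planning aid under stated assumptions: it doesn't talk to elxcfg.exe, the controller labels are hypothetical, and the rule that both controllers should expose the same LUN set is my reading of "complete the same series of steps" rather than a documented Emulex or Amdahl requirement.

```python
def check_lun_maps(lun_maps):
    """Check a planned LUN selection.

    lun_maps maps a controller label (hypothetical, e.g. "controller_a")
    to the set of LUN numbers you intend to select for that controller.
    Returns a list of problems; an empty list means the plan looks sound.
    """
    problems = []
    for controller, luns in lun_maps.items():
        # The article's rule: LUN 0 must always be selected.
        if 0 not in luns:
            problems.append(f"{controller}: LUN 0 is not selected")
    # Assumption: every controller should expose the same set of LUNs.
    distinct = {frozenset(luns) for luns in lun_maps.values()}
    if len(distinct) > 1:
        problems.append("controllers expose different LUN sets")
    return problems
```

A plan such as `{"controller_a": {0, 2, 3}, "controller_b": {0, 2, 3}}` passes, whereas leaving LUN 0 off one controller produces a problem entry for that controller plus a mismatch warning.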

When I performed my failover test with the full port driver, which was similar to the test I used for the Emulex Mini Port 4.31 driver, failover occurred faster than it did with the miniport driver. Data started flowing again about 15 seconds after I pulled out a cable between an HBA and the ED-5000 Director. I used the A+LVS Recovery utility to put the failed Availability Manager back online, but this part of the process didn't work as I expected. The A+LVS Recovery utility caused the Availability Manager to reset and the LED to indicate a normal state, but the A+LVS Recovery utility never reported that the controller was back online. The file-copy operation never resumed along the original path, and the server locked up. Although the server responded to pings, I couldn't start new programs. The server didn't even respond to a remote shutdown command. However, the ED-5000 Director LED activity indicated that the data transfer I initiated continued to occur. I tried another approach and pulled the cable for the Emulex controller that was communicating with the active Availability Manager. The controller failed back over to Availability Manager A, and Availability Manager B went out of service. Data was still flowing, and the server's GUI came back to life, but I still had only one data path open. This time, the A+LVS Recovery utility wasn't able to communicate with Controller A, which was the active and working controller, and couldn't repair the damage.
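I timed the failover pause by watching the copy and the director's LEDs, but a small script can put a rough number on it. The following Python sketch is one way to do that, not the method I used in the lab: it copies a file in chunks, timestamps each completed write, and reports the longest gap between writes. The function names and the chunk size are my own choices, and because the OS caches writes, treat the result as an approximation.

```python
import time


def longest_stall(timestamps):
    """Return the largest gap, in seconds, between consecutive timestamps."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return max(gaps, default=0.0)


def copy_with_timestamps(src_path, dst_path, chunk_size=1024 * 1024,
                         clock=time.monotonic):
    """Copy a file chunk by chunk and record when each chunk finishes writing."""
    stamps = [clock()]
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            stamps.append(clock())
    return stamps
```

To use it, start `copy_with_timestamps` on a large file that lives on one of the mapped LUNs, pull the cable during the copy, and pass the returned list to `longest_stall` when the copy finishes.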

I waited until the data transfer completed, then rebooted the server. However, I was still unable to run the Recovery utility and couldn't access the three LUNs that I had configured earlier. I checked the Emulex Configuration utility and discovered that all LUN mapping for Controller A was gone but that the LUNs remained configured for Controller B, the offline controller. I remapped the LUNs for Controller A, then rebooted the server. Windows Explorer could see and access the LUNs, but the Emulex Configuration utility couldn't see Controller A to repair the damage. I needed to run the Recovery utility at another server (one that was using the miniport driver) to place Controller B back online. During a quick file-copy test, the ED-5000 Director's LEDs showed that both controller paths were in use. However, the A+LVS Configuration and Recovery utilities still were unable to communicate with the LVS 4600. I contacted an Amdahl Technical Support representative, who explained that when A+LVS can't see all the defined LUNs, it has a problem. The representative told me that Amdahl is working to correct the software's problem. I was able to use the Emulex Configuration utility to reconfigure the SCSI port driver so that both adapters gave access to all the LUNs I had created. I rebooted and verified that the A+LVS utilities were working properly.

One shortcoming of Amdahl's limited support for the SCSI port driver is that you must configure LUN 0 for all cards, even when the server doesn't need to access that disk volume. The A+LVS software requires LUN 0 for operation. To verify this requirement, I used the Emulex Configuration utility to give the server access only to LUNs 2 and 3. After I rebooted the server, it had the expected access; however, failover didn't function at all. While the OS was copying a large file from LUN 3 to LUN 2, I simulated a link failure. The system detected the failure, but failover didn't occur, and both Availability Managers remained online. I plugged the cable back in to restore the link, and normal functions resumed. Also, when you don't define LUN 0, A+LVS won't start, and you receive the following message: Could not find any RAID Modules.

The bottom line is that when you use the SCSI port driver, the server sees only the LUNs it should be using and will fail over to the alternate path when necessary. However, you must run the Recovery utility from a system that is using the SCSI miniport driver, and, because A+LVS requires LUN 0, a SCSI port driver configuration needs a small, unused LUN 0 that is visible to all servers.
