IP Multicast and Your Network

This data delivery process might be your network's salvation

IP multicast is a process that transmits information from a source to multiple destinations with one data stream, rather than with multiple streams. Multicast can greatly reduce the network traffic that bandwidth-hungry applications such as videoconferencing, software distribution, and Webcasting create. Many Windows NT network administrators realized the power of IP multicast and the benefit of IP multicast applications in 1996, when Microsoft released NetShow 1.0, an IP multicast-enabled multimedia application. Microsoft also developed Active Channel Multicaster in Site Server 3.0 to multicast Web content to browsers, and the company implements Internet Group Management Protocol (IGMP) version 2, an enhanced IP multicast function, in its Windows OSs. Other pioneering multicast-application companies have delivered commercial IP multicast applications; Cisco Systems' IP/TV and StarBurst's Multicast File Transfer Protocol (MFTP) are two examples.

Network vendors widely support IP multicast in their routers. Internet Service Providers (ISPs) have started to offer multicast service in their backbones (e.g., UUNet's UUCast service). IP multicast and its applications have come out of test labs and Multicast Backbone (MBone), an Internet test bed, to find homes in corporate and ISP production networks. For example, Toys R Us uses MFTP to distribute software updates to its more than 800 stores, paring the time required to make a 1MB file transfer to those stores from 1 day to 4 minutes. Many other companies are using or starting to use IP multicast. The sidebar "A Case in Point: Microsoft's Multicast Network," page 110, reveals how Microsoft uses real-world IP multicast in its company network.

Before you can implement an efficient multicast network, you need to understand IP multicast technologies. I discussed IP multicast basics, network checkpoints, and NetShow Live in the May 1997 article "Be Prepared for IP Multicasting Applications." In this article, I further explore IP multicast technologies and their enhancements, and I describe how you will use them in your multicast network.

What Is an IP Multicast Network?
The purpose of IP multicast is to deliver a data stream from a source to a group of receivers. Receivers interested in a particular multicast application join the application's multicast group. A multicast group is dynamic and transient: receivers can join and leave the group at will, and the group disappears when no receivers remain. Group members listen to and receive data that the source delivers to the group's IP multicast address. Each multicast group is identified by a Class D IP address in the range from 224.0.0.0 to 239.255.255.255. The multicast's source doesn't need to join its group, nor does the source know who and where its receivers are. The source simply transmits multicast streams to the IP multicast address of its multicast group, then lets the network handle multicast data delivery. An IP multicast-enabled network can efficiently forward and route multicast data to receivers. Three key techniques manage multicast network delivery: scoping, group management, and distribution trees.
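
As a small illustration of the Class D address range, the following Python sketch checks whether an IPv4 address can identify a multicast group. The function name is_multicast_group is an assumption made for this example; the range itself is the one given above, written as the prefix 224.0.0.0/4.

```python
import ipaddress

# The Class D range: 224.0.0.0 through 239.255.255.255, i.e. 224.0.0.0/4.
CLASS_D = ipaddress.ip_network("224.0.0.0/4")


def is_multicast_group(address: str) -> bool:
    """Return True if the IPv4 address falls inside the Class D range.

    Hypothetical helper for illustration only.
    """
    return ipaddress.ip_address(address) in CLASS_D
```

A source would send to an address for which this check succeeds; an ordinary host address such as 192.168.1.10 fails the check.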

Multicast scoping determines how far a multicast stream can travel from its source. Limiting the range of a multicast can prevent business data from reaching outside a network, thereby providing security. In multicast group management, multicast-enabled routers keep track of multicast group membership on the subnets to which they directly attach. The routers forward multicast data only to the subnets that have group members, thereby saving network bandwidth. A multicast distribution tree defines the delivery path from a source to a multicast group: a multicast routing protocol builds an optimized tree that contains the set of routers and links through which group members receive data from the source. Let's take a close look at each of these network delivery technologies.

Multicast Scoping
Traditionally, IP multicast uses a Time to Live (TTL) parameter, set in the IP multicast application and enforced by multicast routers, to control multicast distribution. When you define the TTL value in an IP multicast application, the application's data doesn't travel more than that number of router hops. For example, if you set Site Server's Active Channel Multicaster TTL value to 16, you ensure that Site Server's Web content doesn't multicast beyond 16 router hops. Each multicast packet carries a TTL value in its IP header. Just as in unicast, every time a multicast router forwards a multicast packet, the router decreases the packet's TTL by 1. By default, a router won't forward packets with a value of TTL=1. You can modify the default TTL threshold to another value on each interface in a multicast router. For example, if you set the TTL threshold to 10 on a router interface, only packets with a TTL value that is greater than 10 can pass that router interface.
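
The interplay of TTL and interface thresholds can be sketched in a few lines of Python. This is a simplified model of the rule described above, not router code: the function name hops_survived and the idea of listing one threshold per router along a path are assumptions made for the example, and the default threshold is taken to be 1.

```python
def hops_survived(initial_ttl: int, thresholds: list[int]) -> int:
    """Count how many router interfaces a packet crosses before it is dropped.

    thresholds[i] is the TTL threshold on the outgoing interface of the i-th
    router along the path (hypothetical input format). An interface passes a
    packet only if the packet's TTL is greater than the threshold; the
    forwarding router then decreases the TTL by 1.
    """
    ttl = initial_ttl
    crossed = 0
    for threshold in thresholds:
        if ttl <= threshold:
            break  # this interface refuses the packet
        ttl -= 1
        crossed += 1
    return crossed
```

Under this model, a packet sent with a TTL of 8 crosses two default interfaces and then stops at an interface whose threshold is 10, whereas a packet sent with a TTL of 16 passes the same interface.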

TTL-based multicast scoping has a couple of shortcomings. First, defining a proper TTL value in a multicast application can be difficult. If the TTL value is too large, your multicast data might leave your network. If the TTL value is too small, your multicast data might not reach interested receivers that sit beyond the multicast scope. Second, if someone sets custom TTL thresholds on certain router interfaces, your multicast range can be unpredictable. For example, you wouldn't be able to divide your network into multicast regions and confine multicast applications to those regions, because any router whose interface TTL threshold is lower than a packet's TTL value will forward the packet.

To overcome these limitations, the Internet Engineering Task Force (IETF) published Administratively Scoped IP Multicast as Request for Comments (RFC) 2365 in July 1998. Administrative scoping lets you scope a multicast to a certain network boundary (e.g., within your organization) by using an administratively scoped address. IETF has designated IP multicast addresses between 239.0.0.0 and 239.255.255.255 as administratively scoped addresses for local use in intranets. You can configure routers that support administratively scoped addressing on the border of your network to confine your private multicast region. You can also define multiple isolated multicast regions in your network so that sensitive multicast data will travel only within a designated area. Figure 1 shows a network with three multicast regions: region 239.253.1.0 in Data Center 1, region 239.253.2.0 in Data Center 2, and region 239.253.3.0 in Data Center 3. When multicast data traverses Data Center 1, Router 1, on which the boundary 239.253.1.0 is defined, blocks any multicast packets with administratively scoped addresses of 239.253.1.x from transmitting to other data centers.
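
The boundary check that a border router performs can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and writing the region 239.253.1.0 as the prefix 239.253.1.0/24 is an assumption based on the 239.253.1.x addresses in the example.

```python
import ipaddress

# Administratively scoped block: 239.0.0.0 through 239.255.255.255.
ADMIN_SCOPED = ipaddress.ip_network("239.0.0.0/8")


def is_admin_scoped(group: str) -> bool:
    """Return True if the group address lies in the administratively scoped block."""
    return ipaddress.ip_address(group) in ADMIN_SCOPED


def boundary_allows(group: str, boundary: str) -> bool:
    """Return True if a border router may forward traffic for `group`.

    `boundary` is the scoped range configured on the router, written as a
    CIDR prefix (an assumed notation). Traffic whose group address falls
    inside the boundary stays within the region.
    """
    return ipaddress.ip_address(group) not in ipaddress.ip_network(boundary)
```

With the boundary 239.253.1.0/24 on Router 1, a stream addressed to 239.253.1.7 stays in Data Center 1, whereas a stream addressed to 239.253.2.7 is not blocked by that boundary.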

Multicast Group Management
A multicast network forwards multicast data only to network subnets that have receivers in the corresponding multicast group in a scoped multicast region. (Selective forwarding is the biggest difference between multicast and broadcast. Broadcast floods data to all subnets.) To forward multicast data to receivers in a scoped multicast region, routers need information about group membership on their local subnets. One router on each subnet periodically multicasts membership query messages to all computers on the local subnet. Computers on the local subnet that are group members respond to the router's query message with a membership report about the group they belong to. The router keeps this membership information in its group database. Each member waits a random period of time before answering a query and multicasts its membership report to the group address itself, so other members of the same group also hear the report. When other group members hear a report for their group, they cancel their own pending reports. This suppression reduces membership report traffic and router processing time. As long as a router knows of one group member on the local subnet, the router forwards multicast data to that subnet, and other group members will receive the multicast data. When a new member joins a group, the member doesn't need to wait for the next membership query from the router. Instead, the new member immediately sends a membership report as if in response to a membership query. If the new member is the first member of the group on its subnet, the router begins forwarding the group's multicast data to that subnet as soon as it receives this report.
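
The way a random wait keeps membership-report traffic low can be sketched in a short simulation. This is a simplified model rather than an IGMP implementation: the function name simulate_query, the 10-second maximum delay, and the assumption that every member hears the first report instantly are all choices made for the example.

```python
import random
from typing import Optional


def simulate_query(
    members: list[str], max_delay: float = 10.0, seed: Optional[int] = None
) -> tuple[Optional[str], list[str]]:
    """Simulate one membership query for a single group on one subnet.

    Each member picks a random delay before reporting. The member whose
    timer expires first sends the report; in this simplified model every
    other member hears that report at once and cancels its own.
    Returns (reporter, suppressed_members).
    """
    if not members:
        return None, []  # no report arrives, so the router has no member to record
    rng = random.Random(seed)
    delays = {host: rng.uniform(0, max_delay) for host in members}
    reporter = min(delays, key=delays.get)
    suppressed = [host for host in members if host != reporter]
    return reporter, suppressed
```

However many members a subnet holds, the router sees one report per group per query in this model, which is all it needs to keep forwarding data to that subnet.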

IGMP. Routers and computers use IGMP to exchange membership information. IGMP is an integral part of IP. The two standard versions of IGMP are IGMPv1 (RFC 1112) and IGMPv2 (RFC 2236). IETF released IGMPv2 as an enhanced version of IGMPv1 in November 1997. Today, many routers and OSs support IGMPv2. Microsoft implements IGMPv2 in Windows 2000 (Win2K), in NT with Service Pack 4 (SP4), in Windows 98, and in Win95 with Winsock 2.

IGMPv2's biggest enhancement is a leave-group notification feature. In IGMPv1, a receiver that leaves a multicast group doesn't notify the router. Rather, the router assumes no group member is on the local subnet if the router doesn't receive a membership report after several queries and waiting intervals. Several minutes or more can then pass before the router stops forwarding data to that subnet. In IGMPv2, receivers leaving a group directly inform the router. The router then queries the subnet to see whether any other group members remain. If the router doesn't receive a response, it assumes that no other group members exist on that subnet and stops multicast forwarding to that subnet.
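
A little arithmetic shows how much the leave notification shortens the wait. The sketch below uses the default timer values from RFC 2236; applying those same defaults to the timeout-only case is an assumption made for comparison, because real routers and IGMPv1 implementations may use other timer settings.

```python
# Default timer values from RFC 2236 (IGMPv2), in seconds.
ROBUSTNESS = 2
QUERY_INTERVAL = 125.0
QUERY_RESPONSE_INTERVAL = 10.0
LAST_MEMBER_QUERY_INTERVAL = 1.0


def timeout_without_leave() -> float:
    """Time a router waits with no reports before it drops a group.

    This models the case in which no leave message is sent, so the router
    must let the group membership interval expire.
    """
    return ROBUSTNESS * QUERY_INTERVAL + QUERY_RESPONSE_INTERVAL


def timeout_with_leave() -> float:
    """Time a router waits after a leave message and unanswered group queries."""
    return ROBUSTNESS * LAST_MEMBER_QUERY_INTERVAL
```

With these defaults, the timeout-only path takes 260 seconds, whereas the leave path takes about 2 seconds.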

IETF is working on IGMPv3 to further improve IGMPv2. IGMPv3 will include several new features. One such feature will let a computer specify which sources in a specific group the computer will receive data from.

Multicast Distribution Trees
Using routers' IGMP group membership information, a multicast routing protocol builds an optimized distribution tree to deliver multicast data from a source to the multicast group. Two kinds of distribution tree exist—source-based and shared.

A multicast routing protocol constructs a source-based tree with data that a source sends. When the local router connecting to the source receives the source's first multicast packet, the router broadcasts the packet to the edge of the network to search for receivers. If a router on the edge of the network doesn't find any receivers on its local subnets after checking its IGMP group database, the edge router sends a prune message to its nearest parent router on the path (or branch) from the source. The prune message removes the edge router from the branch. This pruning process proceeds, router by router, backward along the branch toward the source until coming to a router that leads to group members (i.e., an active branch). Through this broadcast-and-prune procedure, the multicast routing protocol forms a distribution tree that contains only active branches. The routing protocol repeats the broadcast-and-prune procedure periodically during multicast to update the distribution tree and reflect group membership changes.

Figure 2 illustrates the broadcast-and-prune procedure. In Figure 2, when the first multicast data reaches router R5, R5 sends a prune message to router R3 and removes itself from the tree because it doesn't have local group members. R3 is on an active branch, because its child router, R4, has local group members. The pruning process on R5's branch stops at R3. Routers R6 and R7 also prune themselves from the tree. The final distribution tree contains only two branches: one branch from the source to router R2, and the other branch from the source to R4.
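
The pruning walk can be sketched as a small recursive function. Figure 2 isn't reproduced here, so the topology in the usage note is a hypothetical one chosen to match the description (R2 and R4 have local members; R5, R6, and R7 don't), and the function name prune_tree is an assumption made for the example.

```python
def prune_tree(
    tree: dict[str, list[str]], members: set[str], root: str
) -> dict[str, list[str]]:
    """Return the distribution tree that remains after pruning.

    `tree` maps each router to its child routers in the initial broadcast
    tree. A router stays on the tree if it has local group members or if
    at least one of its children stays; otherwise it prunes itself.
    """
    pruned: dict[str, list[str]] = {}

    def active(router: str) -> bool:
        kept = [child for child in tree.get(router, []) if active(child)]
        if kept or router in members or router == root:
            pruned[router] = kept
            return True
        return False  # no members below this router: it sends a prune upstream

    active(root)
    return pruned
```

For the hypothetical topology {"source": ["R2", "R3", "R6"], "R3": ["R4", "R5"], "R6": ["R7"]} with members R2 and R4, the function keeps only the branches source to R2 and source to R3 to R4.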

A multicast routing protocol that uses the broadcast-and-prune procedure to set up a source-based tree is a broadcast-and-prune protocol. Because broadcast-and-prune protocols make periodic broadcasts, these protocols are suitable for use only in LAN environments with substantial bandwidth and densely populated receivers. The term for such an environment is dense mode. Another disadvantage of using broadcast-and-prune protocols is that if two different sources send data to the same multicast group, the protocol must create a source-based distribution tree for each source.

In contrast to the source-based tree, a shared tree, which a shared-tree multicast protocol builds, lets multiple sources multicast data to the same multicast group over the same distribution tree. A shared tree suits sparse mode, in which receivers are sparsely distributed across a low-bandwidth WAN.

A shared-tree protocol either automatically chooses or lets a network administrator manually define the root of a shared tree in a network. The root, which is a router, is known as a core or rendezvous point (RP). Roots are often the center of multicast groups. When a source's local router receives multicast data, the router forwards the data to the core of the multicast group. The core further multicasts the data to all receivers in the group. The shared-tree setup doesn't use broadcast-and-prune to find group members but requires all members to join the tree. When a router finds a local group member by checking its received IGMP group membership report, the router sends a join request to the core. The core or an intermediate router that is already in the tree responds to the request with a join acknowledgment, sending the acknowledgment to the requesting router. Figure 3 illustrates the join-and-acknowledgment process. When router R3 has a new local group member, R3 sends a join request to and receives an acknowledgment from the core of the group through the intermediate router R2. When router R5 sends a join request to the core via router R4, R4 returns the join acknowledgment to R5 because R4 was already in the tree for its existing local group members.
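
The join-and-acknowledgment walk can be sketched as follows. Figure 3 isn't reproduced here, so the topology in the usage note is a hypothetical one that matches the description, and the names join, next_hop_to_core, and on_tree are assumptions made for the example.

```python
def join(router: str, next_hop_to_core: dict[str, str], on_tree: set[str]) -> str:
    """Send a join request toward the core and return the acknowledging router.

    `next_hop_to_core` maps each router to its next hop on the path to the
    core. The request travels hop by hop until it reaches a router that is
    already on the shared tree; that router acknowledges, and every router
    along the path is added to `on_tree` (which this function mutates).
    """
    path = [router]
    current = router
    while current not in on_tree:
        current = next_hop_to_core[current]
        path.append(current)
    on_tree.update(path)
    return current
```

With next hops {"R3": "R2", "R2": "core", "R5": "R4", "R4": "core"} and an initial tree of {"core", "R4"}, a join from R3 is acknowledged by the core, and a join from R5 is acknowledged by R4.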

Because of the join function, shared-tree protocols require all routers in a multicast region to know the region's multicast group and core mapping information. A bootstrap router in the network collects the mapping information that cores advertise and distributes the information to other routers.
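
The mapping that each router consults can be pictured as a table from group address ranges to cores. The sketch below is only an illustration: the table format, the most-specific-prefix rule, and the name core_for are assumptions made for the example, not the behavior of any particular protocol or router.

```python
import ipaddress
from typing import Optional


def core_for(group: str, mapping: dict[str, str]) -> Optional[str]:
    """Return the core for a group, or None if no range covers the group.

    `mapping` maps group address ranges (CIDR prefixes) to core routers.
    When several ranges match, this sketch picks the most specific one.
    """
    address = ipaddress.ip_address(group)
    best_core: Optional[str] = None
    best_length = -1
    for prefix, core in mapping.items():
        network = ipaddress.ip_network(prefix)
        if address in network and network.prefixlen > best_length:
            best_core, best_length = core, network.prefixlen
    return best_core
```

A router that finds a new local member of a group would look up the group's core this way before sending its join request.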

Multicast Routing Protocols
Five multicast routing protocols currently exist, and you can classify each into either the source-based tree (dense mode) or shared-tree (sparse mode) protocols. The three dense-mode protocols are Distance Vector Multicast Routing Protocol (DVMRP), Multicast Open Shortest Path First (MOSPF), and Protocol Independent Multicast-Dense Mode (PIM-DM). The two sparse-mode protocols are Protocol Independent Multicast-Sparse Mode (PIM-SM) and Core Based Trees (CBT).

DVMRP (RFC 1075) is the first multicast routing protocol researchers developed to implement MBone in the Internet, and DVMRP is still prevalent in MBone. DVMRP builds a source-based distribution tree based on broadcast-and-prune. DVMRP includes its own Routing Information Protocol (RIP)-like unicast routing protocol and depends on this protocol to determine the shortest path from the source to the multicast group when setting up the distribution tree. UNIX machines were among the first to implement DVMRP. Almost all vendors' routers support DVMRP.

MOSPF (RFC 1584) is simply an extension of Open Shortest Path First (OSPF), a well-known unicast routing protocol in IP networks. OSPF divides a network into one or more OSPF areas and uses link-state information (i.e., information about router interfaces and network links) to set up and maintain a unicast routing table. (To learn more about OSPF, see "Steelhead's OSPF Routing," August 1997.) MOSPF uses OSPF as the native protocol to advertise each router's IGMP group membership as part of the link-state information within an OSPF area. MOSPF can easily construct a source-based distribution tree by using the link-state database in a router instead of the usual broadcast-and-prune procedure. MOSPF supports multicast between multiple OSPF areas and uses border routers that link OSPF areas to forward IGMP group membership information and multicast data between OSPF areas. MOSPF is a natural choice if your network uses OSPF as its unicast routing protocol. 3Com and Nortel support MOSPF in their routers.

PIM-DM, an Internet draft, uses broadcast-and-prune to form a source-based tree, similarly to DVMRP. However, PIM-DM uses the existing unicast routing protocol in your network, such as RIP or OSPF, to determine the shortest path from the source to the multicast group. The use of existing protocols is the reason behind PIM's name (Protocol Independent Multicast).

PIM-SM (RFC 2362) is another PIM protocol, but PIM-SM is suitable for use in sparse mode. PIM-SM uses shared trees to deliver data and refers to the root of a shared tree as an RP. However, a shared tree might not reflect the shortest path from source to multicast group. Thus, PIM-SM lets routers optionally switch to a source-based tree to receive a source's data after initial delivery over the shared tree, based on some triggered conditions (e.g., if the source's data rate exceeds a configured threshold). Cisco is PIM's primary advocate and supports both PIM protocols in its routers.

CBT (RFC 2189) is similar to PIM-SM. However, CBT uses only shared trees for data delivery and can't switch from shared trees to source-based trees, as PIM-SM can. CBT calls the root of the shared tree a core. Vendors haven't widely implemented CBT; I haven't found support for CBT in 3Com, Cisco, or Nortel routers.

Multicast on the Internet
MBone is a set of multicast networks within the Internet. MBone serves as a test bed researchers use to develop IP multicast and its applications on the Internet. MBone also hosts audio and video multicasts for IETF and some government organizations. The most popular multicast applications in MBone include the visual audio tool (vat), videoconferencing tool (vic), and whiteboard tool (wb). These applications run on UNIX as well as on NT and Win95. MBone uses the session directory tool (sdr) to announce public multicast sessions (i.e., applications). MBone users can use sdr to find particular sessions. Sdr is itself a multicast application.

MBone has connected thousands of networks to its backbone since 1992. Each network in MBone is a separate multicast region. Although MBone runs DVMRP on its backbone, individual multicast regions can use any multicast routing protocols internally. A network connects to MBone by a DVMRP tunnel, as Figure 4 shows, which lets multicast traffic pass through nonmulticast-enabled routers in the Internet by encapsulating a multicast packet in a unicast packet. You connect your network to MBone through major ISPs, and you can use DVMRP to tunnel your network to MBone. Most router vendors today support PIM-to-DVMRP and MOSPF-to-DVMRP interoperability, so you can easily use a multicast routing protocol other than DVMRP inside your network.
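
Tunneling amounts to wrapping one packet inside another. The sketch below models the idea with plain data objects rather than real IP headers; the Packet class, the function names, and the addresses in the usage note are all hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Packet:
    """Minimal stand-in for an IP packet (illustrative only)."""
    src: str
    dst: str
    payload: Union[bytes, "Packet"]


def encapsulate(inner: Packet, tunnel_src: str, tunnel_dst: str) -> Packet:
    """Wrap a multicast packet in a unicast packet addressed to the tunnel's far end."""
    return Packet(src=tunnel_src, dst=tunnel_dst, payload=inner)


def decapsulate(outer: Packet) -> Packet:
    """Recover the original multicast packet at the far end of the tunnel."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not a tunneled packet")
    return outer.payload
```

Routers between the tunnel endpoints see only the outer unicast destination (for example, 198.51.100.1), so they can forward the packet without any multicast support; the far endpoint unwraps it and resumes multicast delivery to the inner group address.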

MBone, however, isn't for commercial use. And the Internet doesn't support native multicast, because many routers on the Internet don't speak multicast routing protocols. Most important, none of the IP multicast routing protocols I've described scales well enough to work on the Internet, which contains many more routers than MBone does. Although PIM-SM and CBT are suitable for use over a WAN, both of these protocols require that all routers know all multicast groups and their cores or RPs. This requirement makes implementation impossible on the Internet. 3Com, Lucent, and Sun Microsystems are working on a new protocol, Simple Multicast Protocol, to implement IP multicast on the Internet. Simple Multicast Protocol will simplify the method by which routers keep track of the source and receivers of a multicast stream. The protocol uses an eight-byte identifier that consists of the IP multicast address and the IP address of the multicast source or core. When a receiver joins the multicast group, the receiver sends this identifier to the local router, which can immediately identify the multicast source and group.
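
Two IPv4 addresses occupy eight bytes when packed side by side, which the following sketch demonstrates. The field order (core or source first, then group) and the function names are assumptions made for the example, not details taken from the protocol's specification.

```python
import ipaddress


def make_identifier(core: str, group: str) -> bytes:
    """Pack a core (or source) address and a group address into eight bytes.

    The field order is an assumption made for this illustration.
    """
    return ipaddress.IPv4Address(core).packed + ipaddress.IPv4Address(group).packed


def parse_identifier(identifier: bytes) -> tuple[str, str]:
    """Split an eight-byte identifier back into (core, group) addresses."""
    if len(identifier) != 8:
        raise ValueError("identifier must be exactly eight bytes")
    core = ipaddress.IPv4Address(identifier[:4])
    group = ipaddress.IPv4Address(identifier[4:])
    return str(core), str(group)
```

A router that receives such an identifier can read both the core and the group directly, without consulting a separate group-to-core mapping.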

Researchers are working on other prospective multicast routing protocols for the Internet, including Multicast Border Gateway Protocol (M-BGP) and Border Gateway Multicast Protocol (BGMP). M-BGP is integrated with Border Gateway Protocol (BGP), an exterior routing protocol that the Internet uses widely to link routing domains between ISPs and organizations. M-BGP can support multitiered multicast and routing policies among ISPs and organizations on the Internet. BGMP is based on PIM-SM and CBT. For each multicast group on the Internet, BGMP identifies a root domain, rather than a root router as PIM-SM and CBT do. From the root domain, BGMP builds a tree of domains.

IETF is doing the important work of defining a standard architecture of global multicast address allocation for multicast applications on the Internet. Currently, two multicast applications can't use the same multicast address on the Internet at the same time, or both applications will fail. This limitation is similar to the limitation whereby two computers can't have the same IP unicast address in an IP network. The IETF's draft of the global Internet multicast address allocation architecture uses multicast extensions to Dynamic Host Configuration Protocol (DHCP), called MDHCP, to dynamically assign multicast addresses from a multicast address allocation server (MAAS) to applications in an allocation domain, such as an ISP network and intranets. However, no authority—such as the Internet Assigned Numbers Authority (IANA), which centrally manages and controls IP unicast addresses—currently exists to assign a block of IP multicast addresses to ISPs and organizations.

Ready for Your Intranet
Although IP multicast technology isn't quite ready to be widely deployed over the commercial Internet, the existing IP multicast routing protocols such as DVMRP, MOSPF, PIM-DM, and PIM-SM work well in intranet environments. When you take advantage of Microsoft's and other vendors' delivery of multicast applications, you can build a multicast NT network and deploy such multicast applications as videoconferencing, training, Webcasting, file replication, and software distribution on your intranet to save your network bandwidth and help your users. The sidebar "IP Multicast Resources" lists some references that can help you plan your IP multicast network.
