Chapter 12. IP Multicasting

Introduction

Recall from Chapter 8 that class D IP addresses (224.0.0.0 to 239.255.255.255) do not identify individual interfaces in an internet but instead identify groups of interfaces. For this reason, class D addresses are called multicast groups. A datagram with a class D destination address is delivered to every interface in an internet that has joined the corresponding multicast group.
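The class D range can be recognized from the four high-order bits of the address, which are always 1110. The following is a minimal sketch of that test, not the Net/3 macro itself; the names is_class_d and ipv4 are hypothetical helpers, and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a host-byte-order IPv4 address from its
 * four dotted-decimal components. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    assert(a < 256 && b < 256 && c < 256 && d < 256);
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Returns 1 if addr (host byte order) is a class D address, that is,
 * its four high-order bits are 1110 (224.0.0.0 to 239.255.255.255). */
int is_class_d(uint32_t addr)
{
    return (addr & 0xf0000000u) == 0xe0000000u;
}
```

For example, is_class_d(ipv4(224, 0, 0, 1)) is true, while the same test on 223.255.255.255 or 240.0.0.0 is false.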

Experimental applications on the Internet that take advantage of multicasting include audio and video conferencing applications, resource discovery tools, and shared whiteboards.

Group membership is determined dynamically as interfaces join and leave groups based on requests from processes running on each system. Since group membership is relative to an interface, it is possible for a multihomed host to have different group membership lists for each interface. We’ll refer to group membership on a particular interface as an {interface, group} pair.

Group membership on a single network is communicated between systems by the IGMP protocol (Chapter 13). Multicast routers propagate group membership information using multicast routing protocols (Chapter 14), such as DVMRP (Distance Vector Multicast Routing Protocol). A standard IP router may support multicast routing, or multicast routing may be handled by a router dedicated to that purpose.

Networks such as Ethernet, token ring, and FDDI directly support hardware multicasting. In Net/3, if an interface supports multicasting, the IFF_MULTICAST bit is on in if_flags in the interface’s ifnet structure (Figure 3.7). We’ll use Ethernet to illustrate hardware-supported IP multicasting, since Ethernet is in widespread use and Net/3 includes sample Ethernet drivers. Multicast services are trivially implemented on point-to-point networks such as SLIP and the loopback interface.

IP multicasting services may not be available on a particular interface if the local network does not support hardware-level multicast. RFC 1122 does not prevent the interface layer from providing a software-level multicast service as long as it is transparent to IP.

RFC 1112 [Deering 1989] describes the host requirements for IP multicasting. There are three levels of conformance:

Level 0

The host cannot send or receive IP multicasts.

Such a host should silently discard any packets it receives with a class D destination address.

Level 1

The host can send but cannot receive IP multicasts.

A host is not required to join an IP multicast group before sending a datagram to the group. A multicast datagram is sent in the same way as a unicast datagram except the destination address is the IP multicast group. The network drivers must recognize this and multicast the datagram on the local network.

Level 2

The host can send and receive IP multicasts.

To receive IP multicasts, the host must be able to join and leave multicast groups and must support IGMP for exchanging group membership information on at least one interface. A multihomed host may support multicasting on a subset of its interfaces.

Net/3 meets the level 2 host requirements and can additionally act as a multicast router. As with unicast IP routing, we assume that the system we are describing is a multicast router and we include the Net/3 multicast routing code in our presentation.

Well-Known IP Multicast Groups

As with UDP and TCP port numbers, the Internet Assigned Numbers Authority (IANA) maintains a list of registered IP multicast groups. The current list can be found in RFC 1700, which also provides more information about the IANA. Figure 12.1 shows only some of the well-known groups.

Figure 12.1. Some registered IP multicast groups.

Group         Description                  Net/3 constant
------------  ---------------------------  ----------------------
224.0.0.0     reserved                     INADDR_UNSPEC_GROUP
224.0.0.1     all systems on this subnet   INADDR_ALLHOSTS_GROUP
224.0.0.2     all routers on this subnet
224.0.0.3     unassigned
224.0.0.4     DVMRP routers
224.0.0.255   unassigned                   INADDR_MAX_LOCAL_GROUP
224.0.1.1     NTP Network Time Protocol
224.0.1.2     SGI-Dogfight

The first 256 groups (224.0.0.0 to 224.0.0.255) are reserved for protocols that implement IP unicast and multicast routing mechanisms. Datagrams sent to any of these groups are not forwarded beyond the local network by multicast routers, regardless of the TTL value in the IP header.

RFC 1075 places this requirement only on the 224.0.0.0 and 224.0.0.1 groups but mrouted, the most common multicast routing implementation, restricts the remaining groups as described here. Group 224.0.0.0 (INADDR_UNSPEC_GROUP) is reserved and group 224.0.0.255 (INADDR_MAX_LOCAL_GROUP) marks the last local multicast group.
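The forwarding restriction on the first 256 groups can be sketched as a simple range check. This is an illustration under stated assumptions, not the mrouted or Net/3 code: the function names and the two constants are hypothetical stand-ins for INADDR_UNSPEC_GROUP and INADDR_MAX_LOCAL_GROUP, addresses are in host byte order, and the TTL test for nonlocal groups is the ordinary rule that a datagram needs a TTL greater than 1 to leave the local network.

```c
#include <assert.h>
#include <stdint.h>

#define UNSPEC_GROUP    0xe0000000u   /* 224.0.0.0   */
#define MAX_LOCAL_GROUP 0xe00000ffu   /* 224.0.0.255 */

/* Returns 1 if group (host byte order) is one of the first 256
 * multicast groups, which never leave the local network. */
int is_local_group(uint32_t group)
{
    return group >= UNSPEC_GROUP && group <= MAX_LOCAL_GROUP;
}

/* Returns 1 if a multicast router may forward a datagram addressed to
 * group.  Local groups are refused regardless of the TTL. */
int may_forward(uint32_t group, unsigned ttl)
{
    assert(ttl <= 255);
    if (is_local_group(group))
        return 0;
    return ttl > 1;
}
```

With these definitions, a datagram sent to 224.0.0.4 with a TTL of 255 is still not forwarded, while one sent to 224.0.1.1 with a TTL of 32 is.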

Every level-2 conforming system is required to join the 224.0.0.1 (INADDR_ALLHOSTS_GROUP) group on all multicast interfaces at system initialization time (Figure 6.17) and remain a member of the group until the system is shut down. There is no multicast group that corresponds to every interface on an internet.

Imagine if your voice-mail system had the option of sending a message to every voice mailbox in your company. Maybe you have such an option. Do you find it useful? Does it scale to larger companies? Can anyone send to the “all-mailbox” group, or is it restricted?

Unicast and multicast routers may join group 224.0.0.2 to communicate with each other. The ICMP router solicitation message and router advertisement messages may be sent to 224.0.0.2 (the all-routers group) and 224.0.0.1 (the all-hosts group), respectively, instead of to the limited broadcast address (255.255.255.255).

The 224.0.0.4 group supports communication between multicast routers that implement DVMRP. Other groups within the local multicast group range are similarly assigned for other routing protocols. Beyond the first 256 groups, the remaining groups (224.0.1.0-239.255.255.255) are assigned to various multicast application protocols or remain unassigned. Figure 12.1 lists two examples, the Network Time Protocol (224.0.1.1), and SGI-Dogfight (224.0.1.2).

Throughout this chapter, we note that multicast packets are sent and received by the transport layer on a host. While the multicasting code is not aware of the specific transport protocol that sends and receives multicast datagrams, the only Internet transport protocol that supports multicasting is UDP.

Code Introduction

The basic multicasting code discussed in this chapter is contained within the same files as the standard IP code. Figure 12.2 lists the files that we examine.

Figure 12.2. Files discussed in this chapter.

File                 Description
-------------------  -----------------------------------------------------
netinet/if_ether.h   Ethernet multicasting structure and macro definitions
netinet/in.h         more Internet multicast structures
netinet/in_var.h     Internet multicast structure and macro definitions
netinet/ip_var.h     IP multicast structures
net/if_ethersubr.c   Ethernet multicast functions
netinet/in.c         group membership functions
netinet/ip_input.c   input multicast processing
netinet/ip_output.c  output multicast processing

Global Variables

Three new global variables are introduced in this chapter:

Figure 12.3. Global variables introduced in this chapter.

Variable               Datatype         Description
---------------------  ---------------  -----------------------------------------------------
ether_ipmulticast_min  u_char []        minimum Ethernet multicast address reserved for IP
ether_ipmulticast_max  u_char []        maximum Ethernet multicast address reserved for IP
ip_mrouter             struct socket *  pointer to socket created by multicast routing daemon

Statistics

The code in this chapter updates a few of the counters maintained in the global ipstat structure.

Figure 12.4. Multicast processing statistics.

ipstat member    Description
---------------  ------------------------------------------------------------------
ips_forward      #packets forwarded by this system
ips_cantforward  #packets that cannot be forwarded (system is not a router)
ips_noroute      #packets that cannot be forwarded because a route is not available

Link-level multicast statistics are collected in the ifnet structure (Figure 4.5) and may include multicasting of protocols other than IP.

Ethernet Multicast Addresses

An efficient implementation of IP multicasting requires IP to take advantage of hardware-level multicasting, without which each IP datagram would have to be broadcast to the network and every host would have to examine each datagram and discard those not intended for the host. The hardware filters unwanted datagrams before they reach the IP layer.

For the hardware filter to work, the network interface must convert the IP multicast group destination to a link-layer multicast address recognized by the network hardware. On point-to-point networks, such as SLIP and the loopback interface, the mapping is implicit since there is only one possible destination. On other networks, such as Ethernet, an explicit mapping function is required. The standard mapping for Ethernet applies to any network that employs 802.3 addressing.

Figure 4.12 illustrated the difference between an Ethernet unicast and multicast address: if the low-order bit of the high-order byte of the Ethernet address is a 1, it is a multicast address; otherwise it is a unicast address. Unicast Ethernet addresses are assigned by the interface’s manufacturer, but multicast addresses are assigned dynamically by network protocols.
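The unicast/multicast distinction reduces to a one-bit test on the first byte of the address. The sketch below is illustrative only; ether_is_multicast is a hypothetical name, and the address is assumed to be stored as 6 bytes in transmission order.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the 6-byte Ethernet address has the low-order bit of its
 * high-order (first) byte set, which marks it as a multicast address.
 * The broadcast address ff:ff:ff:ff:ff:ff also has this bit set. */
int ether_is_multicast(const unsigned char addr[6])
{
    assert(addr != NULL);
    return (addr[0] & 0x01) != 0;
}
```

An address beginning with 01:00:5e passes this test, while one beginning with 00:00:5e does not.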

IP to Ethernet Multicast Address Mapping

Because Ethernet supports multiple protocols, a method to allocate the multicast addresses and prevent conflicts is needed. Ethernet address allocation is administered by the IEEE. A block of Ethernet multicast addresses is assigned to the IANA by the IEEE to support IP multicasting. The addresses in the block all start with 01:00:5e.

The block of Ethernet unicast addresses starting with 00:00:5e is also assigned to the IANA but remains reserved for future use.

Figure 12.5 illustrates the construction of an Ethernet multicast address from a class D IP address.

Figure 12.5. Mapping between IP and Ethernet addresses.

The mapping illustrated by Figure 12.5 is a many-to-one mapping. The high-order 9 bits of the class D IP address are not used when constructing the Ethernet address. Because the first 4 of those bits are always 1110 in a class D address, only 5 of the discarded bits can vary, so 32 IP multicast groups map to a single Ethernet multicast address (Exercise 12.3). In Section 12.14 we’ll see how this affects input processing. Figure 12.6 shows the macro that implements this mapping in Net/3.

Figure 12.6. ETHER_MAP_IP_MULTICAST macro.

---------------------------------------------------------------------- if_ether.h
 61 #define ETHER_MAP_IP_MULTICAST(ipaddr, enaddr) \
 62     /* struct in_addr *ipaddr; */ \
 63     /* u_char enaddr[6];       */ \
 64 { \
 65     (enaddr)[0] = 0x01; \
 66     (enaddr)[1] = 0x00; \
 67     (enaddr)[2] = 0x5e; \
 68     (enaddr)[3] = ((u_char *)ipaddr)[1] & 0x7f; \
 69     (enaddr)[4] = ((u_char *)ipaddr)[2]; \
 70     (enaddr)[5] = ((u_char *)ipaddr)[3]; \
 71 }
---------------------------------------------------------------------- if_ether.h

IP to Ethernet multicast mapping

61-71

ETHER_MAP_IP_MULTICAST implements the mapping shown in Figure 12.5. ipaddr points to the class D multicast address, and the matching Ethernet address is constructed in enaddr, an array of 6 bytes. The first 3 bytes of the Ethernet multicast address are 0x01, 0x00, and 0x5e, followed by a 0 bit and then the low-order 23 bits of the class D IP address.
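The many-to-one nature of the mapping is easy to see by running it outside the kernel. The following user-space sketch restates the macro as a function; map_ip_multicast is a hypothetical name, and the IP address is assumed to be supplied as 4 bytes in network byte order.

```c
#include <assert.h>
#include <string.h>

/* Build the Ethernet multicast address for a class D IP address.
 * ip[] holds the address in network byte order (ip[0] is 224..239).
 * The prefix is 01:00:5e, then a 0 bit, then the low-order 23 bits of
 * the IP address; the high-order 9 bits of the IP address are dropped. */
void map_ip_multicast(const unsigned char ip[4], unsigned char en[6])
{
    assert((ip[0] & 0xf0) == 0xe0);   /* must be class D */
    en[0] = 0x01;
    en[1] = 0x00;
    en[2] = 0x5e;
    en[3] = ip[1] & 0x7f;
    en[4] = ip[2];
    en[5] = ip[3];
}
```

Mapping 224.0.0.1, 225.0.0.1, and 224.128.0.1 through this function yields the same Ethernet address, 01:00:5e:00:00:01, which is why the IP layer must still check the destination group of datagrams that pass the hardware filter.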

ether_multi Structure

For each Ethernet interface, Net/3 maintains a list of Ethernet multicast address ranges to be received by the hardware. This list defines the multicast filtering to be implemented by the device. Because most Ethernet devices are limited in the number of addresses they can selectively receive, the IP layer must be prepared to discard datagrams that pass through the hardware filter. Each address range is stored in an ether_multi structure:

Figure 12.7. ether_multi structure.

------------------------------------------------------------------------- if_ether.h
147 struct ether_multi {
148     u_char  enm_addrlo[6];      /* low  or only address of range */
149     u_char  enm_addrhi[6];      /* high or only address of range */
150     struct arpcom *enm_ac;      /* back pointer to arpcom */
151     u_int   enm_refcount;       /* no. claims to this addr/range */
152     struct ether_multi *enm_next;   /* ptr to next ether_multi */
153 };
------------------------------------------------------------------------- if_ether.h

Ethernet multicast addresses

147-153

enm_addrlo and enm_addrhi specify a range of Ethernet multicast addresses that should be received. A single Ethernet address is specified when enm_addrlo and enm_addrhi are the same. The entire list of ether_multi structures is attached to the arpcom structure of each Ethernet interface (Figure 3.26). Ethernet multicasting is independent of ARP; using the arpcom structure is a matter of convenience, since the structure is already included in every Ethernet interface structure.

We’ll see that the start and end of the ranges are always the same since there is no way in Net/3 for a process to specify an address range.

enm_ac points back to the arpcom structure of the associated interface and enm_refcount tracks the usage of the ether_multi structure. When the reference count drops to 0, the structure is released. enm_next joins the ether_multi structures for a single interface into a linked list. Figure 12.8 shows a list of three ether_multi structures attached to le_softc[0], the ifnet structure for our sample Ethernet interface.

Figure 12.8. The LANCE interface with three ether_multi structures.

In Figure 12.8 we see that:

  • The interface has joined three groups. Most likely they are: 224.0.0.1 (all-hosts), 224.0.0.2 (all-routers), and 224.0.1.2 (SGI-Dogfight). Because the Ethernet-to-IP mapping is a one-to-many mapping, we cannot determine the exact IP multicast groups by examining the resulting Ethernet multicast addresses. The interface may have joined 225.0.0.1, 225.0.0.2, and 226.0.1.2, for example.

  • The most recently joined group appears at the front of the list.

  • The enm_ac back-pointer makes it easy to find the beginning of the list and to release an ether_multi structure, without having to implement a doubly linked list.

  • The ether_multi structures apply to Ethernet devices only. Other multicast devices may have a different multicast implementation.

The ETHER_LOOKUP_MULTI macro, shown in Figure 12.9, searches an ether_multi list for a range of addresses.

Figure 12.9. ETHER_LOOKUP_MULTI macro.

---------------------------------------------------------------------- if_ether.h
166 #define ETHER_LOOKUP_MULTI(addrlo, addrhi, ac, enm) \
167     /* u_char addrlo[6]; */ \
168     /* u_char addrhi[6]; */ \
169     /* struct arpcom *ac; */ \
170     /* struct ether_multi *enm; */ \
171 { \
172     for ((enm) = (ac)->ac_multiaddrs; \
173         (enm) != NULL && \
174         (bcmp((enm)->enm_addrlo, (addrlo), 6) != 0 || \
175          bcmp((enm)->enm_addrhi, (addrhi), 6) != 0); \
176         (enm) = (enm)->enm_next); \
177 }
---------------------------------------------------------------------- if_ether.h

Ethernet multicast lookups

166-177

addrlo and addrhi specify the search range and ac points to the arpcom structure containing the list to search. The for loop performs a linear search, stopping at the end of the list or when enm_addrlo and enm_addrhi both match the supplied addrlo and addrhi addresses. When the loop terminates, enm is null or points to a matching ether_multi structure.
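The same linear search can be written as an ordinary function, which may be easier to follow than the macro form. This is a user-space sketch under stated assumptions: struct range_entry is a hypothetical, cut-down stand-in for ether_multi, lookup_range is a hypothetical name, and memcmp replaces the kernel's bcmp.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ether_multi: one address range per entry. */
struct range_entry {
    unsigned char lo[6];          /* low or only address of range  */
    unsigned char hi[6];          /* high or only address of range */
    struct range_entry *next;     /* next entry in the list        */
};

/* Walk the list until both ends of the range match, or the list ends.
 * Returns the matching entry, or NULL if there is none. */
struct range_entry *lookup_range(struct range_entry *head,
                                 const unsigned char lo[6],
                                 const unsigned char hi[6])
{
    struct range_entry *e;

    assert(lo != NULL && hi != NULL);
    for (e = head; e != NULL; e = e->next)
        if (memcmp(e->lo, lo, 6) == 0 && memcmp(e->hi, hi, 6) == 0)
            break;
    return e;
}
```

As with the macro, the result is either a pointer to the matching entry or a null pointer, so the caller can use the return value directly to decide whether a new entry is needed.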

Ethernet Multicast Reception

After this section, this chapter discusses only IP multicasting, but it is possible in Net/3 to configure the system to receive any Ethernet multicast packet. Although not useful with the IP protocols, other protocol families within the kernel might be prepared to receive these multicasts. Explicit multicast configuration is done by issuing the ioctl commands shown in Figure 12.10.

Figure 12.10. Multicast ioctl commands.

Command       Argument        Function  Description
------------  --------------  --------  --------------------------------------------
SIOCADDMULTI  struct ifreq *  ifioctl   add multicast address to reception list
SIOCDELMULTI  struct ifreq *  ifioctl   delete multicast address from reception list

These two commands are passed by ifioctl (Figure 12.11) directly to the device driver for the interface specified in the ifreq structure (Figure 6.12).

Figure 12.11. ifioctl function: multicast commands.

---------------------------------------------------------------------- if.c
440     case SIOCADDMULTI:
441     case SIOCDELMULTI:
442         if (error = suser(p->p_ucred, &p->p_acflag))
443             return (error);
444         if (ifp->if_ioctl == NULL)
445             return (EOPNOTSUPP);
446         return ((*ifp->if_ioctl) (ifp, cmd, data));
---------------------------------------------------------------------- if.c

440-446

If the process does not have superuser privileges, or if the interface does not have an if_ioctl function, ifioctl returns an error; otherwise the request is passed directly to the device driver.

in_multi Structure

The Ethernet multicast data structures described in Section 12.4 are not specific to IP; they must support multicast activity by any of the protocol families supported by the kernel. At the network level, IP maintains a list of IP multicast groups associated with each interface.

As a matter of implementation convenience, the IP multicast list is attached to the in_ifaddr structure associated with the interface. Recall from Section 6.5 that this structure contains the unicast address for the interface. There is no relationship between the unicast address and the attached multicast group list other than that they both are associated with the same interface.

This is an artifact of the Net/3 implementation. It is possible for an implementation to support IP multicast groups on an interface that does not accept IP unicast packets.

Each IP multicast {interface, group} pair is described by an in_multi structure shown in Figure 12.12.

Figure 12.12. in_multi structure.

------------------------------------------------------------------------ in_var.h
111 struct in_multi {
112     struct in_addr inm_addr;    /* IP multicast address */
113     struct ifnet *inm_ifp;      /* back pointer to ifnet */
114     struct in_ifaddr *inm_ia;   /* back pointer to in_ifaddr */
115     u_int   inm_refcount;       /* no. membership claims by sockets */
116     u_int   inm_timer;          /* IGMP membership report timer */
117     struct in_multi *inm_next;  /* ptr to next multicast address */
118 };
------------------------------------------------------------------------ in_var.h

IP multicast addresses

111-118

inm_addr is a class D multicast address (e.g., 224.0.0.1, the all-hosts group). inm_ifp points back to the ifnet structure of the associated interface and inm_ia points back to the interface’s in_ifaddr structure.

An in_multi structure exists only if at least one process on the system has notified the kernel that it wants to receive multicast datagrams for a particular {interface, group} pair. Since multiple processes may elect to receive datagrams sent to a particular pair, inm_refcount keeps track of the number of references to the pair. When no more processes are interested in the pair, inm_refcount drops to 0 and the structure is released. This action may cause an associated ether_multi structure to be released if its reference count also drops to 0.

inm_timer is part of the IGMP protocol implementation described in Chapter 13. Finally, inm_next points to the next in_multi structure in the list.

Figure 12.13 illustrates the relationship between an interface, its IP unicast address, and its IP multicast group list using the le_softc[0] sample interface.

Figure 12.13. An IP multicast group list for the le interface.

We’ve omitted the corresponding ether_multi structures for clarity (but see Figure 12.34). If the system had two Ethernet cards, the second card would be managed through le_softc[1] and would have its own multicast group list attached to its arpcom structure. The macro IN_LOOKUP_MULTI (Figure 12.14) searches the IP multicast list for a particular multicast group.

Figure 12.14. IN_LOOKUP_MULTI macro.

------------------------------------------------------------------------ in_var.h
131 #define IN_LOOKUP_MULTI(addr, ifp, inm) \
132     /* struct in_addr addr; */ \
133     /* struct ifnet *ifp; */ \
134     /* struct in_multi *inm; */ \
135 { \
136      struct in_ifaddr *ia; \
137 \
138     IFP_TO_IA((ifp), ia); \
139     if (ia == NULL) \
140         (inm) = NULL; \
141     else \
142         for ((inm) = ia->ia_multiaddrs; \
143             (inm) != NULL && (inm)->inm_addr.s_addr != (addr).s_addr; \
144              (inm) = inm->inm_next) \
145              continue; \
146 }
------------------------------------------------------------------------ in_var.h

IP multicast lookups

131-146

IN_LOOKUP_MULTI looks for the multicast group addr in the multicast group list associated with interface ifp. IFP_TO_IA searches the Internet address list, in_ifaddr, for the in_ifaddr structure associated with the interface identified by ifp. If IFP_TO_IA finds an interface, the for loop searches its IP multicast list. After the loop, inm is null or points to the matching in_multi structure.
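The lookup combines naturally with the reference counting described for inm_refcount: a join either finds an existing entry and adds a claim, or creates a new one, and a leave releases the entry when the last claim goes away. The sketch below illustrates that pattern in user space. It is not the Net/3 join and leave code; struct group_entry and the three function names are hypothetical, malloc and free stand in for the kernel allocator, and inserting new entries at the head of the list is an assumption that mirrors the ordering noted for the ether_multi list.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for in_multi. */
struct group_entry {
    uint32_t addr;                /* multicast group address       */
    unsigned refcount;            /* number of membership claims   */
    struct group_entry *next;     /* next group on this interface  */
};

/* Linear search, as in IN_LOOKUP_MULTI. */
struct group_entry *group_lookup(struct group_entry *head, uint32_t addr)
{
    while (head != NULL && head->addr != addr)
        head = head->next;
    return head;
}

/* Add a claim on addr, creating the entry at the head of the list if
 * this is the first claim.  Returns NULL if memory is exhausted. */
struct group_entry *group_join(struct group_entry **head, uint32_t addr)
{
    struct group_entry *e;

    assert(head != NULL);
    e = group_lookup(*head, addr);
    if (e != NULL) {
        e->refcount++;
        return e;
    }
    e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->addr = addr;
    e->refcount = 1;
    e->next = *head;
    *head = e;
    return e;
}

/* Drop one claim on addr; unlink and free the entry when the count
 * reaches 0.  Does nothing if addr is not on the list. */
void group_leave(struct group_entry **head, uint32_t addr)
{
    struct group_entry **pp;

    assert(head != NULL);
    for (pp = head; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->addr == addr) {
            struct group_entry *e = *pp;
            if (--e->refcount == 0) {
                *pp = e->next;
                free(e);
            }
            return;
        }
    }
}
```

Two joins of the same group followed by one leave keep the entry on the list with a count of 1; only the second leave removes it.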

ip_moptions Structure

The ip_moptions structure contains the multicast options through which the transport layer controls multicast output processing. For example, the UDP call to ip_output is:

error = ip_output(m, inp->inp_options, &inp->inp_route,
                  inp->inp_socket->so_options & (SO_DONTROUTE|SO_BROADCAST),
                  inp->inp_moptions);

In Chapter 22 we’ll see that inp points to an Internet protocol control block (PCB) and that UDP associates a PCB with each socket created by a process. Within the PCB, inp_moptions is a pointer to an ip_moptions structure. From this we see that a different ip_moptions structure may be passed to ip_output for each outgoing datagram. Figure 12.15 shows the definition of the ip_moptions structure.

Figure 12.15. ip_moptions structure.

------------------------------------------------------------------------ ip_var.h
100 struct ip_moptions {
101     struct  ifnet *imo_multicast_ifp; /* ifp for outgoing multicasts */
102     u_char  imo_multicast_ttl;        /* TTL for outgoing multicasts */
103     u_char  imo_multicast_loop;       /* 1 => hear sends if a member */
104     u_short imo_num_memberships;      /* no. memberships this socket */
105     struct  in_multi *imo_membership[IP_MAX_MEMBERSHIPS];
106 };
------------------------------------------------------------------------ ip_var.h

Multicast options

100-106

ip_output routes outgoing multicast datagrams through the interface pointed to by imo_multicast_ifp or, if imo_multicast_ifp is null, through the default interface for the destination multicast group (Chapter 14).

imo_multicast_ttl specifies the initial IP TTL value for outgoing multicasts. The default is 1, which causes multicast datagrams to remain on the local network.

If imo_multicast_loop is 0, the multicast datagram is not looped back and delivered to the transmitting interface even if the interface is a member of the multicast group. If imo_multicast_loop is 1, the multicast datagram is looped back to the transmitting interface if the interface is a member of the multicast group.

Finally, the integer imo_num_memberships and the array imo_membership maintain the list of {interface, group} pairs associated with the structure. Changes to the list are communicated to IP, which announces membership changes on the locally attached network. Each entry in the imo_membership array is a pointer to an in_multi structure attached to the in_ifaddr structure of the appropriate interface.

Multicast Socket Options

Several IP-level socket options, shown in Figure 12.16, provide process-level access to ip_moptions structures.

Figure 12.16. Multicast socket options.

Command             Argument        Function      Description
------------------  --------------  ------------  -------------------------------------------------
IP_MULTICAST_IF     struct in_addr  ip_ctloutput  select default interface for outgoing multicasts
IP_MULTICAST_TTL    u_char          ip_ctloutput  select default TTL for outgoing multicasts
IP_MULTICAST_LOOP   u_char          ip_ctloutput  enable or disable loopback of outgoing multicasts
IP_ADD_MEMBERSHIP   struct ip_mreq  ip_ctloutput  join a multicast group
IP_DROP_MEMBERSHIP  struct ip_mreq  ip_ctloutput  leave a multicast group

In Figure 8.31 we looked at the overall structure of the ip_ctloutput function. Figure 12.17 shows the cases relevant to changing and retrieving multicast options.

Figure 12.17. ip_ctloutput function: multicast options.

---------------------------------------------------------------------- ip_output.c
448         case PRCO_SETOPT:
449             switch (optname) {
                                                                                  
                                 /* other set cases */
                                                                                  
486             case IP_MULTICAST_IF:
487             case IP_MULTICAST_TTL:
488             case IP_MULTICAST_LOOP:
489             case IP_ADD_MEMBERSHIP:
490             case IP_DROP_MEMBERSHIP:
491                 error = ip_setmoptions(optname, &inp->inp_moptions, m);
492                 break;
493               freeit:
494             default:
495                 error = EINVAL;
496                 break;
497             }
498             if (m)
499                 (void) m_free(m);
500             break;
501         case PRCO_GETOPT:
502             switch (optname) {
                                                                                  
                                 /* other get cases */
                                                                                  
539             case IP_MULTICAST_IF:
540             case IP_MULTICAST_TTL:
541             case IP_MULTICAST_LOOP:
542             case IP_ADD_MEMBERSHIP:
543             case IP_DROP_MEMBERSHIP:
544                 error = ip_getmoptions(optname, inp->inp_moptions, mp);
545                 break;
546             default:
547                 error = ENOPROTOOPT;
548                 break;
549             }
---------------------------------------------------------------------- ip_output.c

486-491 539-549

All the multicast options are handled through the ip_setmoptions and ip_getmoptions functions. The ip_moptions structure passed by reference to ip_getmoptions or to ip_setmoptions is the one associated with the socket on which the socket option request was issued.

The error code returned when an option is not recognized is different for the get and set cases. ENOPROTOOPT is the more reasonable choice.
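From the process side, the options in Figure 12.16 are set with setsockopt at the IPPROTO_IP level. The following sketch shows one plausible way a process might prepare an ip_mreq argument and then set the TTL, loopback, and membership options on a UDP socket. The function names make_mreq and join_group are hypothetical, error handling is minimal, and using INADDR_ANY to let the kernel choose the interface is an assumption about the caller's intent.

```c
#define _DEFAULT_SOURCE   /* glibc: expose struct ip_mreq in strict -std modes */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Fill in *mreq for the dotted-decimal group address.  If ifaddr is
 * NULL, INADDR_ANY asks the kernel to choose the interface.
 * Returns 0 on success, -1 if an address is malformed or the group
 * is not a class D address. */
int make_mreq(const char *group, const char *ifaddr, struct ip_mreq *mreq)
{
    assert(group != NULL && mreq != NULL);
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr)))
        return -1;
    if (ifaddr == NULL)
        mreq->imr_interface.s_addr = htonl(INADDR_ANY);
    else if (inet_pton(AF_INET, ifaddr, &mreq->imr_interface) != 1)
        return -1;
    return 0;
}

/* Set the outgoing TTL and loopback flag, then join the group on the
 * UDP socket fd.  Returns 0 on success, -1 on any failure. */
int join_group(int fd, const char *group, const char *ifaddr,
               unsigned char ttl, unsigned char loop)
{
    struct ip_mreq mreq;

    if (make_mreq(group, ifaddr, &mreq) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0)
        return -1;
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)) < 0)
        return -1;
    return setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
}
```

A caller would typically create the socket with socket(AF_INET, SOCK_DGRAM, 0) and then call, for example, join_group(fd, "224.0.1.1", NULL, 1, 1). Whether the join succeeds depends on the host having a multicast-capable interface.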

Multicast TTL Values

Multicast TTL values are difficult to understand because they have two purposes. The primary purpose of the TTL value, as with all IP packets, is to limit the lifetime of the packet within an internet and prevent it from circulating indefinitely. The second purpose is to contain packets within a region of the internet specified by administrative boundaries. This administrative region is specified in subjective terms such as “this site,” “this company,” or “this state,” and is relative to the starting point of the packet. The region associated with a multicast packet is called its scope.

The standard implementation of RFC 1112 multicasting merges the two concepts of lifetime and scope into the single TTL value in the IP header. In addition to discarding packets when the IP TTL drops to 0, multicast routers associate with each interface a TTL threshold that limits multicast transmission on that interface. A packet must have a TTL greater than or equal to the interface’s threshold value for it to be transmitted on the interface. Because of this, a multicast packet may be dropped even before its TTL value reaches 0.

Threshold values are assigned by an administrator when configuring a multicast router. These values define the scope of multicast packets. The significance of an initial TTL value for multicast datagrams is defined by the threshold policy used by the administrator and the distance between the source of the datagram and the multicast interfaces.

Figure 12.18 shows the recommended TTL values for various applications as well as recommended threshold values.

Figure 12.18. TTL values for IP multicast datagrams.

ip_ttl  Application                    Scope
------  -----------------------------  ---------------------
0                                      same interface
1                                      same subnet
31      local event video
32                                     same site
63      local event audio
64                                     same region
95      IETF channel 2 video
127     IETF channel 1 video
128                                    same continent
159     IETF channel 2 audio
191     IETF channel 1 audio
223     IETF channel 2 low-rate audio
255     IETF channel 1 low-rate audio  unrestricted in scope

The first column lists the starting value of ip_ttl in the IP header. The second column illustrates an application specific use of threshold values ([Casner 1993]). The third column lists the recommended scopes to associate with the TTL values.

For example, an interface that communicates to a network outside the local site would be configured with a multicast threshold of 32. The TTL field of any datagram that starts with a TTL of 32 (or less) is less than 32 when it reaches this interface (there is at least one hop between the source and the router) and is discarded before the router forwards it to the external network even if the TTL is still greater than 0.

A multicast datagram that starts with a TTL of 128 would pass through site interfaces with a threshold of 32 (as long as it reached the interface within 128 - 32 = 96 hops) but would be discarded by intercontinental interfaces with a threshold of 128.
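The two examples above can be checked with a small model of the threshold test. This is a simplified sketch, not the multicast forwarding code: passes_threshold is a hypothetical name, and the model assumes the TTL has been decremented once per hop by the time it is compared with the interface threshold.

```c
#include <assert.h>

/* Returns 1 if a datagram that started with initial_ttl and has
 * traveled hops hops is transmitted on an interface with the given
 * threshold.  The remaining TTL must be nonzero and must be greater
 * than or equal to the threshold. */
int passes_threshold(int initial_ttl, int hops, int threshold)
{
    int ttl;

    assert(initial_ttl >= 0 && initial_ttl <= 255);
    assert(hops >= 0 && threshold >= 0);
    ttl = initial_ttl - hops;
    return ttl > 0 && ttl >= threshold;
}
```

Under this model, a datagram that starts with a TTL of 32 fails a threshold of 32 after a single hop, and one that starts at 128 passes a threshold of 32 for up to 96 hops but fails a threshold of 128 after the first hop.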

The MBONE

A subset of routers on the Internet supports IP multicast routing. This multicast backbone is called the MBONE, which is described in [Casner 1993]. It exists to support experimentation with IP multicasting, in particular with audio and video data streams. In the MBONE, threshold values limit how far various data streams propagate. In Figure 12.18, we see that local event video packets always start with a TTL of 31. An interface with a threshold of 32 always blocks local event video. At the other end of the scale, IETF channel 1 low-rate audio is restricted only by the inherent IP TTL maximum of 255 hops. It propagates through the entire MBONE. An administrator of a multicast router within the MBONE can select a threshold value to accept or discard MBONE data streams selectively.

Expanding-Ring Search

Another use of the multicast TTL is to probe the internet for a resource by varying the initial TTL value of the probe datagram. This technique is called an expanding-ring search ([Boggs 1982]). A datagram with an initial TTL of 0 reaches only a resource on the local system associated with the outgoing interface. A TTL of 1 reaches the resource if it exists on the local subnet. A TTL of 2 reaches resources within two hops of the source. An application increases the TTL exponentially to probe a large internet quickly.
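The sequence of probe TTLs for such a search might be generated as follows. This is only a sketch of one possible schedule: ring_ttls is a hypothetical name, and doubling the TTL at each step (0, 1, 2, 4, 8, ...) with a final probe at the maximum is an assumption, since the text says only that the increase is exponential. Sending each probe and waiting for a reply is left to the application.

```c
#include <assert.h>
#include <stddef.h>

/* Fill ttls[] with the TTL to use for each successive probe:
 * 0, 1, 2, 4, 8, ... capped at max_ttl, which is always the last
 * value if the array is large enough.  At most cap entries are
 * written.  Returns the number of entries written. */
size_t ring_ttls(unsigned max_ttl, unsigned char *ttls, size_t cap)
{
    size_t n = 0;
    unsigned ttl = 0;

    assert(max_ttl <= 255 && ttls != NULL);
    while (n < cap) {
        ttls[n++] = (unsigned char)ttl;
        if (ttl >= max_ttl)
            break;                        /* widest ring reached */
        ttl = (ttl == 0) ? 1 : ttl * 2;   /* widen the ring */
        if (ttl > max_ttl)
            ttl = max_ttl;
    }
    return n;
}
```

With max_ttl set to 255 the schedule has ten probes, ending with TTLs of 128 and 255, so even the widest search needs only a handful of datagrams.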

RFC 1546 [Partridge, Mendez, and Milliken 1993] describes a related service called anycasting. As proposed, anycasting relies on a distinguished set of IP addresses to represent groups of hosts much like multicasting. Unlike multicast addresses, the network is expected to propagate an anycast packet until it is received by at least one host. This simplifies the implementation of an application, which no longer needs to perform expanding-ring searches.

ip_setmoptions Function

The bulk of the ip_setmoptions function consists of a switch statement to handle each option. Figure 12.19 shows the beginning and end of ip_setmoptions. The body of the switch is discussed in the following sections.

Figure 12.19. ip_setmoptions function.

---------------------------------------------------------------------- ip_output.c
650 int
651 ip_setmoptions(optname, imop, m)
652 int     optname;
653 struct ip_moptions **imop;
654 struct mbuf *m;
655 {
656     int     error = 0;
657     u_char  loop;
658     int     i;
659     struct in_addr addr;
660     struct ip_mreq *mreq;
661     struct ifnet *ifp;
662     struct ip_moptions *imo = *imop;
663     struct route ro;
664     struct sockaddr_in *dst;
665     if (imo == NULL) {
666         /*
667          * No multicast option buffer attached to the pcb;
668          * allocate one and initialize to default values.
669          */
670         imo = (struct ip_moptions *) malloc(sizeof(*imo), M_IPMOPTS,
671                                             M_WAITOK);
672         if (imo == NULL)
673             return (ENOBUFS);
674         *imop = imo;
675         imo->imo_multicast_ifp = NULL;
676         imo->imo_multicast_ttl = IP_DEFAULT_MULTICAST_TTL;
677         imo->imo_multicast_loop = IP_DEFAULT_MULTICAST_LOOP;
678         imo->imo_num_memberships = 0;
679     }
680     switch (optname) {
                                                                               
    /* switch cases */
                                                                               
857     default:
858         error = EOPNOTSUPP;
859         break;
860     }
861     /*
862      * If all options have default values, no need to keep the structure.
863      */
864     if (imo->imo_multicast_ifp == NULL &&
865         imo->imo_multicast_ttl == IP_DEFAULT_MULTICAST_TTL &&
866         imo->imo_multicast_loop == IP_DEFAULT_MULTICAST_LOOP &&
867         imo->imo_num_memberships == 0) {
868         free(*imop, M_IPMOPTS);
869         *imop = NULL;
870     }
871     return (error);
872 }
---------------------------------------------------------------------- ip_output.c

650-664

The first argument, optname, indicates which multicast option is being changed. The second argument, imop, references a pointer to an ip_moptions structure. If *imop is nonnull, ip_setmoptions modifies the structure it points to. Otherwise, ip_setmoptions allocates a new ip_moptions structure and saves its address in *imop. If no memory is available, ip_setmoptions returns ENOBUFS immediately. Any subsequent errors that occur are posted in error, which is returned to the caller at the end of the function. The third argument, m, points to an mbuf that contains the data for the option to be changed (second column of Figure 12.16).

Construct the defaults

665-679

When a new ip_moptions structure is allocated, ip_setmoptions initializes the default multicast interface pointer to null, initializes the default TTL to 1 (IP_DEFAULT_MULTICAST_TTL), enables the loopback of multicast datagrams, and clears the group membership list. With these defaults, ip_output selects an outgoing interface by consulting the routing tables, multicasts are kept on the local network, and the system receives its own multicast transmissions if the outgoing interface is a member of the destination group.

Process options

680-860

The body of ip_setmoptions consists of a switch statement with a case for each option. The default case (for unknown options) sets error to EOPNOTSUPP.

Discard structure if defaults are OK

861-872

After the switch statement, ip_setmoptions examines the ip_moptions structure. If all the multicast options match their respective default values, the structure is unnecessary and is released. ip_setmoptions returns 0 or the posted error code.
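
The allocate-on-demand and discard-if-default pattern described above can be modeled outside the kernel. The sketch below is a simplified user-space analogy, not the Net/3 code: struct mopts, set_mopt, and the OPT_ constants are hypothetical names, the structure carries only three of the fields, and the standard malloc and free replace the kernel allocator.

```c
#include <errno.h>
#include <stdlib.h>

#define DEFAULT_TTL  1      /* stands in for IP_DEFAULT_MULTICAST_TTL */
#define DEFAULT_LOOP 1      /* stands in for IP_DEFAULT_MULTICAST_LOOP */

enum { OPT_TTL = 1, OPT_LOOP = 2 };     /* hypothetical option names */

struct mopts {                          /* reduced model of ip_moptions */
    unsigned char ttl;
    unsigned char loop;
    int     nmemberships;
};

int
set_mopt(struct mopts **mop, int optname, unsigned char value)
{
    struct mopts *mo = *mop;
    int     error = 0;

    if (mo == NULL) {
        /* No structure yet: allocate one holding the defaults. */
        mo = malloc(sizeof(*mo));
        if (mo == NULL)
            return (ENOBUFS);
        mo->ttl = DEFAULT_TTL;
        mo->loop = DEFAULT_LOOP;
        mo->nmemberships = 0;
        *mop = mo;
    }
    switch (optname) {
    case OPT_TTL:
        mo->ttl = value;
        break;
    case OPT_LOOP:
        if (value > 1)
            error = EINVAL;
        else
            mo->loop = value;
        break;
    default:
        error = EOPNOTSUPP;
        break;
    }
    /* If every field is back at its default, the structure is not needed. */
    if (mo->ttl == DEFAULT_TTL && mo->loop == DEFAULT_LOOP &&
        mo->nmemberships == 0) {
        free(mo);
        *mop = NULL;
    }
    return (error);
}
```

Note that an unknown option on a socket with no options set allocates a structure and releases it again before returning, which is also what the listing in Figure 12.19 does.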

Selecting an Explicit Multicast Interface: IP_MULTICAST_IF

When optname is IP_MULTICAST_IF, the mbuf passed to ip_setmoptions contains the unicast address of a multicast interface, which specifies the particular interface for multicasts sent on this socket. Figure 12.20 shows the code for this option.

Figure 12.20. ip_setmoptions function: selecting a multicast output interface.

---------------------------------------------------------------------- ip_output.c
681     case IP_MULTICAST_IF:
682         /*
683          * Select the interface for outgoing multicast packets.
684          */
685         if (m == NULL || m->m_len != sizeof(struct in_addr)) {
686             error = EINVAL;
687             break;
688         }
689         addr = *(mtod(m, struct in_addr *));
690         /*
691          * INADDR_ANY is used to remove a previous selection.
692          * When no interface is selected, a default one is
693          * chosen every time a multicast packet is sent.
694          */
695         if (addr.s_addr == INADDR_ANY) {
696             imo->imo_multicast_ifp = NULL;
697             break;
698         }
699         /*
700          * The selected interface is identified by its local
701          * IP address.  Find the interface and confirm that
702          * it supports multicasting.
703          */
704         INADDR_TO_IFP(addr, ifp);
705         if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) {
706             error = EADDRNOTAVAIL;
707             break;
708         }
709         imo->imo_multicast_ifp = ifp;
710         break;
---------------------------------------------------------------------- ip_output.c

Validation

681-698

If no mbuf has been provided or the data within the mbuf is not the size of an in_addr structure, ip_setmoptions posts an EINVAL error; otherwise the data is copied into addr. If the interface address is INADDR_ANY, any previously selected interface is discarded. Subsequent multicasts with this ip_moptions structure are routed according to their destination group instead of through an explicitly named interface (Figure 12.40).

Select the default interface

699-710

If addr contains an address, INADDR_TO_IFP locates the matching interface. If a match can’t be found or the interface does not support multicasting, EADDRNOTAVAIL is posted. Otherwise, ifp, the matching interface, becomes the multicast interface for output requests associated with this ip_moptions structure.

Selecting an Explicit Multicast TTL: IP_MULTICAST_TTL

When optname is IP_MULTICAST_TTL, the mbuf is expected to contain a single byte specifying the IP TTL for outgoing multicasts. This TTL is inserted by ip_output into every multicast datagram sent on the associated socket. Figure 12.21 shows the code for this option.

Figure 12.21. ip_setmoptions function: selecting an explicit multicast TTL.

---------------------------------------------------------------------- ip_output.c
711     case IP_MULTICAST_TTL:
712         /*
713          * Set the IP time-to-live for outgoing multicast packets.
714          */
715         if (m == NULL || m->m_len != 1) {
716             error = EINVAL;
717             break;
718         }
719         imo->imo_multicast_ttl = *(mtod(m, u_char *));
720         break;
---------------------------------------------------------------------- ip_output.c

Validate and select the default TTL

711-720

If the mbuf contains a single byte of data, it is copied into imo_multicast_ttl. Otherwise, EINVAL is posted.

Selecting Multicast Loopbacks: IP_MULTICAST_LOOP

In general, multicast applications come in two forms:

  • An application with one sender per system and multiple remote receivers. In this configuration only one local process is sending datagrams to the group, so there is no need to loop back outgoing multicasts. Examples include a multicast routing daemon and conferencing systems.

  • An application with multiple senders and receivers on a system. Datagrams must be looped back so that each process receives the transmissions of the other senders on the system.

The IP_MULTICAST_LOOP option (Figure 12.22) selects the loopback policy associated with an ip_moptions structure.

Figure 12.22. ip_setmoptions function: selecting multicast loopbacks.

---------------------------------------------------------------------- ip_output.c
721     case IP_MULTICAST_LOOP:
722         /*
723          * Set the loopback flag for outgoing multicast packets.
724          * Must be zero or one.
725          */
726         if (m == NULL || m->m_len != 1 ||
727             (loop = *(mtod(m, u_char *))) > 1) {
728             error = EINVAL;
729             break;
730         }
731         imo->imo_multicast_loop = loop;
732         break;
---------------------------------------------------------------------- ip_output.c

Validate and select the loopback policy

721-732

If m is null, does not contain 1 byte of data, or the byte is not 0 or 1, EINVAL is posted. Otherwise, the byte is copied into imo_multicast_loop. A 0 indicates that datagrams should not be looped back, and a 1 enables the loopback mechanism.

Figure 12.23 shows the relationship between the maximum scope of a multicast datagram, imo_multicast_ttl, and imo_multicast_loop.

Figure 12.23. Loopback and TTL effects on multicast scope.

imo_multicast_loop  imo_multicast_ttl  Recipients
                                       Outgoing    Local     Remote     Other
                                       Interface?  Network?  Networks?  Interfaces?
1                   0                  yes
1                   1                  yes         yes
1                   >1                 yes         yes       yes        see text

Figure 12.23 shows that the set of interfaces that may receive a multicast packet depends on what the loopback policy is for the transmission and what TTL value is specified in the packet. A packet may be received on an interface if the hardware receives its own transmissions, regardless of the loopback policy. A datagram may be routed through the network and arrive on another interface attached to the system (Exercise 12.6). If the sending system is itself a multicast router, outgoing packets may be forwarded to the other interfaces, but they will only be accepted for input processing on one interface (Chapter 14).
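
From a process, the three sender-side options described in the preceding sections are set with setsockopt at the IPPROTO_IP level. The sketch below assumes a BSD-style sockets API and a UDP socket; the wrapper name mcast_sender_setup is hypothetical. The TTL and loopback values are passed as single bytes, matching the size checks in Figures 12.21 and 12.22.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Configure the sending side of socket s for multicasting.
 * ifaddr is the unicast address of the outgoing interface, or
 * INADDR_ANY to leave the choice to the kernel.
 * Returns 0 on success, or -1 with errno set by setsockopt.
 */
int
mcast_sender_setup(int s, struct in_addr ifaddr,
                   unsigned char ttl, unsigned char loop)
{
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_IF,
                   &ifaddr, sizeof(ifaddr)) < 0)
        return (-1);
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_TTL,
                   &ttl, sizeof(ttl)) < 0)
        return (-1);
    if (setsockopt(s, IPPROTO_IP, IP_MULTICAST_LOOP,
                   &loop, sizeof(loop)) < 0)
        return (-1);
    return (0);
}
```

For example, a sender that should reach only the local network and has no other listeners on the same host might call mcast_sender_setup with INADDR_ANY, a TTL of 1, and loop set to 0.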

Joining an IP Multicast Group

Other than the IP all-hosts group, which the kernel automatically joins (Figure 6.17), membership in a group is driven by explicit requests from processes on the system. The process of joining (or leaving) a multicast group is more involved than the other multicast options. The in_multi list for an interface must be modified as well as any link-layer multicast structures such as the ether_multi list we described for Ethernet.

The data passed in the mbuf when optname is IP_ADD_MEMBERSHIP is an ip_mreq structure shown in Figure 12.24.

Figure 12.24. ip_mreq structure.

---------------------------------------------------------------------------- in.h
148 struct ip_mreq {
149     struct in_addr imr_multiaddr;   /* IP multicast address of group */
150     struct in_addr imr_interface;   /* local IP address of interface */
151 };
---------------------------------------------------------------------------- in.h

148-151

imr_multiaddr specifies the multicast group and imr_interface identifies the interface by its associated unicast IP address. The ip_mreq structure specifies the {interface, group} pair for membership changes.

Figure 12.25 illustrates the functions involved with joining and leaving a multicast group associated with our example Ethernet interface.

Figure 12.25. Joining and leaving a multicast group.

We start by describing the changes to the ip_moptions structure in the IP_ADD_MEMBERSHIP case in ip_setmoptions (Figure 12.26). Then we follow the request down through the IP layer, the Ethernet driver, and to the physical device (in our case, the LANCE Ethernet card).

Figure 12.26. ip_setmoptions function: joining a multicast group.

---------------------------------------------------------------------- ip_output.c
733     case IP_ADD_MEMBERSHIP:
734         /*
735          * Add a multicast group membership.
736          * Group must be a valid IP multicast address.
737          */
738         if (m == NULL || m->m_len != sizeof(struct ip_mreq)) {
739             error = EINVAL;
740             break;
741         }
742         mreq = mtod(m, struct ip_mreq *);
743         if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr))) {
744             error = EINVAL;
745             break;
746         }
747         /*
748          * If no interface address was provided, use the interface of
749          * the route to the given multicast address.
750          */
751         if (mreq->imr_interface.s_addr == INADDR_ANY) {
752             ro.ro_rt = NULL;
753             dst = (struct sockaddr_in *) &ro.ro_dst;
754             dst->sin_len = sizeof(*dst);
755             dst->sin_family = AF_INET;
756             dst->sin_addr = mreq->imr_multiaddr;
757             rtalloc(&ro);
758             if (ro.ro_rt == NULL) {
759                 error = EADDRNOTAVAIL;
760                 break;
761             }
762             ifp = ro.ro_rt->rt_ifp;
763             rtfree(ro.ro_rt);
764         } else {
765             INADDR_TO_IFP(mreq->imr_interface, ifp);
766         }
767         /*
768          * See if we found an interface, and confirm that it
769          * supports multicast.
770          */
771         if (ifp == NULL || (ifp->if_flags & IFF_MULTICAST) == 0) {
772             error = EADDRNOTAVAIL;
773             break;
774         }
775         /*
776          * See if the membership already exists or if all the
777          * membership slots are full.
778          */
779         for (i = 0; i < imo->imo_num_memberships; ++i) {
780             if (imo->imo_membership[i]->inm_ifp == ifp &&
781                 imo->imo_membership[i]->inm_addr.s_addr
782                 == mreq->imr_multiaddr.s_addr)
783                 break;
784         }
785         if (i < imo->imo_num_memberships) {
786             error = EADDRINUSE;
787             break;
788         }
789         if (i == IP_MAX_MEMBERSHIPS) {
790             error = ETOOMANYREFS;
791             break;
792         }
793         /*
794          * Everything looks good; add a new record to the multicast
795          * address list for the given interface.
796          */
797         if ((imo->imo_membership[i] =
798              in_addmulti(&mreq->imr_multiaddr, ifp)) == NULL) {
799             error = ENOBUFS;
800             break;
801         }
802         ++imo->imo_num_memberships;
803         break;
---------------------------------------------------------------------- ip_output.c

Validation

733-746

ip_setmoptions starts by validating the request. If no mbuf was passed, if it is not the correct size, or if the address (imr_multiaddr) within the structure is not a multicast group, then ip_setmoptions posts EINVAL. mreq points to the valid ip_mreq structure.

Locate the interface

747-774

If the unicast address of the interface (imr_interface) is INADDR_ANY, ip_setmoptions must locate the default interface for the specified group. A route structure is constructed with the group as the desired destination and passed to rtalloc, which locates a route for the group. If no route is available, the add request fails with the error EADDRNOTAVAIL. If a route is located, a pointer to the outgoing interface for the route is saved in ifp and the route entry, which is no longer needed, is released.

If imr_interface is not INADDR_ANY, an explicit interface has been requested. The macro INADDR_TO_IFP searches for the interface with the requested unicast address. If an interface isn’t found or if it does not support multicasting, the request fails with the error EADDRNOTAVAIL.

We described the route structure in Section 8.5. The function rtalloc is described in Section 19.2, and the use of the routing tables for selecting multicast interfaces is described in Chapter 14.

Already a member?

775-792

The last check performed on the request is to examine the imo_membership array to see if the selected interface is already a member of the requested group. If the for loop finds a match, or if the membership array is full, EADDRINUSE or ETOOMANYREFS is posted and processing of this option stops.

Join the group

793-803

At this point the request looks reasonable. in_addmulti arranges for IP to begin receiving multicast datagrams for the group. The pointer returned by in_addmulti points to a new or existing in_multi structure (Figure 12.12) in the interface’s multicast group list. It is saved in the membership array and the size of the array is incremented.
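
The order of the two membership checks matters: a duplicate request is reported as EADDRINUSE even when the array is full. The following user-space sketch models that logic under simplifying assumptions. struct mlist and mlist_add are hypothetical names, an integer interface index stands in for the ifnet pointer, and MAX_MEMBERSHIPS is an illustrative limit rather than the kernel's IP_MAX_MEMBERSHIPS value.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_MEMBERSHIPS 4       /* illustrative limit only */

struct membership {
    int     ifindex;            /* stands in for inm_ifp */
    uint32_t group;             /* stands in for inm_addr */
};

struct mlist {
    int     n;                  /* number of slots in use */
    struct membership m[MAX_MEMBERSHIPS];
};

int
mlist_add(struct mlist *ml, int ifindex, uint32_t group)
{
    int     i;

    /* Look for an existing {interface, group} pair. */
    for (i = 0; i < ml->n; ++i) {
        if (ml->m[i].ifindex == ifindex && ml->m[i].group == group)
            break;
    }
    if (i < ml->n)
        return (EADDRINUSE);    /* already a member */
    if (i == MAX_MEMBERSHIPS)
        return (ETOOMANYREFS);  /* no free slot */

    /* i now indexes the first unused slot. */
    ml->m[i].ifindex = ifindex;
    ml->m[i].group = group;
    ++ml->n;
    return (0);
}
```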

in_addmulti Function

in_addmulti and its companion in_delmulti (Figures 12.27 and 12.36) maintain the list of multicast groups that an interface has joined. Join requests either add a new in_multi structure to the interface list or increase the reference count of an existing structure.

Figure 12.27. in_addmulti function: first half.

---------------------------------------------------------------------- in.c
469 struct in_multi *
470 in_addmulti(ap, ifp)
471 struct in_addr *ap;
472 struct ifnet *ifp;
473 {
474     struct in_multi *inm;
475     struct ifreq ifr;
476     struct in_ifaddr *ia;
477     int     s = splnet();

478     /*
479      * See if address already in list.
480      */
481     IN_LOOKUP_MULTI(*ap, ifp, inm);
482     if (inm != NULL) {
483         /*
484          * Found it; just increment the reference count.
485          */
486         ++inm->inm_refcount;
487     } else {
---------------------------------------------------------------------- in.c

Already a member

469-487

ip_setmoptions has already verified that ap points to a class D multicast address and that ifp points to a multicast-capable interface. IN_LOOKUP_MULTI (Figure 12.14) determines if the interface is already a member of the group. If it is a member, in_addmulti updates the reference count and returns.

If the interface is not yet a member of the group, the code in Figure 12.28 is executed.

Figure 12.28. in_addmulti function: second half.

---------------------------------------------------------------------- in.c
487     } else {
488         /*
489          * New address; allocate a new multicast record
490          * and link it into the interface's multicast list.
491          */
492         inm = (struct in_multi *) malloc(sizeof(*inm),
493                                          M_IPMADDR, M_NOWAIT);
494         if (inm == NULL) {
495             splx(s);
496             return (NULL);
497         }
498         inm->inm_addr = *ap;
499         inm->inm_ifp = ifp;
500         inm->inm_refcount = 1;
501         IFP_TO_IA(ifp, ia);
502         if (ia == NULL) {
503             free(inm, M_IPMADDR);
504             splx(s);
505             return (NULL);
506         }
507         inm->inm_ia = ia;
508         inm->inm_next = ia->ia_multiaddrs;
509         ia->ia_multiaddrs = inm;
510         /*
511          * Ask the network driver to update its multicast reception
512          * filter appropriately for the new address.
513          */
514         ((struct sockaddr_in *) &ifr.ifr_addr)->sin_family = AF_INET;
515         ((struct sockaddr_in *) &ifr.ifr_addr)->sin_addr = *ap;
516         if ((ifp->if_ioctl == NULL) ||
517             (*ifp->if_ioctl) (ifp, SIOCADDMULTI, (caddr_t) & ifr) != 0) {
518             ia->ia_multiaddrs = inm->inm_next;
519             free(inm, M_IPMADDR);
520             splx(s);
521             return (NULL);
522         }
523         /*
524          * Let IGMP know that we have joined a new IP multicast group.
525          */
526         igmp_joingroup(inm);
527     }
528     splx(s);
529     return (inm);
530 }
---------------------------------------------------------------------- in.c

Update the in_multi list

487-509

If the interface isn’t a member yet, in_addmulti allocates, initializes, and inserts the new in_multi structure at the front of the ia_multiaddrs list in the interface’s in_ifaddr structure (Figure 12.13).

Update the interface and announce the change

510-530

If the interface driver has defined an if_ioctl function, in_addmulti constructs an ifreq structure (Figure 4.23) containing the group address and passes the SIOCADDMULTI request to the interface. If the interface rejects the request, the in_multi structure is unlinked from the interface and released. Finally, in_addmulti calls igmp_joingroup to propagate the membership change to other hosts and routers.

in_addmulti returns a pointer to the in_multi structure or null if an error occurred.
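
The reference-counting behavior of in_addmulti can be modeled with an ordinary linked list. The sketch below is a user-space analogy with hypothetical names (struct group, group_add). It omits interrupt protection, the driver ioctl, and IGMP; instead it reports through is_new whether a record was created, which is the point at which the kernel code notifies the driver and IGMP.

```c
#include <stdint.h>
#include <stdlib.h>

struct group {                  /* reduced model of in_multi */
    uint32_t addr;              /* group address */
    unsigned refcount;          /* number of claims on this record */
    struct group *next;
};

/*
 * Add a claim on addr to the list at *head.  An existing record has
 * its reference count incremented; otherwise a new record is inserted
 * at the front of the list.  *is_new is set to 1 only when a record
 * was created.  Returns the record, or NULL if memory is exhausted.
 */
struct group *
group_add(struct group **head, uint32_t addr, int *is_new)
{
    struct group *g;

    for (g = *head; g != NULL; g = g->next) {
        if (g->addr == addr) {
            ++g->refcount;
            *is_new = 0;
            return (g);
        }
    }
    g = malloc(sizeof(*g));
    if (g == NULL)
        return (NULL);
    g->addr = addr;
    g->refcount = 1;
    g->next = *head;
    *head = g;
    *is_new = 1;
    return (g);
}
```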

slioctl and loioctl Functions: SIOCADDMULTI and SIOCDELMULTI

Multicast group processing for the SLIP and loopback interfaces is trivial: there is nothing to do other than error checking. Figure 12.29 shows the SLIP processing.

Figure 12.29. slioctl function: multicast processing.

---------------------------------------------------------------------- if_sl.c
673     case SIOCADDMULTI:
674     case SIOCDELMULTI:
675         ifr = (struct ifreq *) data;
676         if (ifr == 0) {
677             error = EAFNOSUPPORT;   /* XXX */
678             break;
679         }
680         switch (ifr->ifr_addr.sa_family) {

681         case AF_INET:
682             break;

683         default:
684             error = EAFNOSUPPORT;
685             break;
686         }
687         break;
---------------------------------------------------------------------- if_sl.c

673-687

EAFNOSUPPORT is returned if the request is empty or is not for the AF_INET protocol family.

Figure 12.30 shows the loopback processing.

Figure 12.30. loioctl function: multicast processing.

---------------------------------------------------------------------- if_loop.c
152     case SIOCADDMULTI:
153     case SIOCDELMULTI:
154         ifr = (struct ifreq *) data;
155         if (ifr == 0) {
156             error = EAFNOSUPPORT;   /* XXX */
157             break;
158         }
159         switch (ifr->ifr_addr.sa_family) {

160         case AF_INET:
161             break;

162         default:
163             error = EAFNOSUPPORT;
164             break;
165         }
166         break;
---------------------------------------------------------------------- if_loop.c

152-166

The processing for the loopback interface is identical to the SLIP code in Figure 12.29. EAFNOSUPPORT is returned if the request is empty or is not for the AF_INET protocol family.

leioctl Function: SIOCADDMULTI and SIOCDELMULTI

Recall from Figure 4.2 that leioctl is the if_ioctl function for the LANCE Ethernet driver. Figure 12.31 shows the code for the SIOCADDMULTI and SIOCDELMULTI options.

Figure 12.31. leioctl function: multicast processing.

---------------------------------------------------------------------- if_le.c
657     case SIOCADDMULTI:
658     case SIOCDELMULTI:
659         /* Update our multicast list  */
660         error = (cmd == SIOCADDMULTI) ?
661             ether_addmulti((struct ifreq *) data, &le->sc_ac) :
662             ether_delmulti((struct ifreq *) data, &le->sc_ac);

663         if (error == ENETRESET) {
664             /*
665              * Multicast list has changed; set the hardware
666              * filter accordingly.
667              */
668             lereset(ifp->if_unit);
669             error = 0;
670         }
671         break;
---------------------------------------------------------------------- if_le.c

657-671

leioctl passes add and delete requests directly to the ether_addmulti or ether_delmulti functions. Both functions return ENETRESET if the request changes the set of IP multicast addresses that must be received by the physical hardware. If this occurs, leioctl calls lereset to reinitialize the hardware with the new multicast reception list.

We don’t show lereset, as it is specific to the LANCE Ethernet hardware. For multicasting, lereset arranges for the hardware to receive frames addressed to any of the Ethernet multicast addresses contained in the ether_multi list associated with the interface. The LANCE driver uses a hashing mechanism if each entry on the multicast list is a single address. The hash code allows the hardware to receive multicast packets selectively. If the driver finds an entry that describes a range of addresses, it abandons the hash strategy and configures the hardware to receive all multicast packets. If the driver must fall back to receiving all Ethernet multicast addresses, the IFF_ALLMULTI flag is on when lereset returns.

ether_addmulti Function

Every Ethernet driver calls ether_addmulti to process the SIOCADDMULTI request. This function maps the IP class D address to the appropriate Ethernet multicast address (Figure 12.5) and updates the ether_multi list. Figure 12.32 shows the first half of the ether_addmulti function.

Figure 12.32. ether_addmulti function: first half.

---------------------------------------------------------------------- if_ethersubr.c
366 int
367 ether_addmulti(ifr, ac)
368 struct ifreq *ifr;
369 struct arpcom *ac;
370 {
371     struct ether_multi *enm;
372     struct sockaddr_in *sin;
373     u_char  addrlo[6];
374     u_char  addrhi[6];
375     int     s = splimp();

376     switch (ifr->ifr_addr.sa_family) {

377     case AF_UNSPEC:
378         bcopy(ifr->ifr_addr.sa_data, addrlo, 6);
379         bcopy(addrlo, addrhi, 6);
380         break;

381     case AF_INET:
382         sin = (struct sockaddr_in *) &(ifr->ifr_addr);
383         if (sin->sin_addr.s_addr == INADDR_ANY) {
384             /*
385              * An IP address of INADDR_ANY means listen to all
386              * of the Ethernet multicast addresses used for IP.
387              * (This is for the sake of IP multicast routers.)
388              */
389             bcopy(ether_ipmulticast_min, addrlo, 6);
390             bcopy(ether_ipmulticast_max, addrhi, 6);
391         } else {
392             ETHER_MAP_IP_MULTICAST(&sin->sin_addr, addrlo);
393             bcopy(addrlo, addrhi, 6);
394         }
395         break;

396     default:
397         splx(s);
398         return (EAFNOSUPPORT);
399     }
---------------------------------------------------------------------- if_ethersubr.c

Initialize address range

366-399

First, ether_addmulti initializes a range of multicast addresses in addrlo and addrhi (both are arrays of six unsigned characters). If the requested address is from the AF_UNSPEC family, ether_addmulti assumes the address is an explicit Ethernet multicast address and copies it into addrlo and addrhi. If the address is in the AF_INET family and is INADDR_ANY (0.0.0.0), ether_addmulti initializes addrlo to ether_ipmulticast_min and addrhi to ether_ipmulticast_max. These two constant Ethernet addresses are defined as:

u_char ether_ipmulticast_min[6] = { 0x01, 0x00, 0x5e, 0x00, 0x00, 0x00 };
u_char ether_ipmulticast_max[6] = { 0x01, 0x00, 0x5e, 0x7f, 0xff, 0xff };

As with etherbroadcastaddr (Section 4.3), this is a convenient way to define a 48-bit constant.

IP multicast routers must listen for all IP multicasts. Specifying the group as INADDR_ANY is considered a request to join every IP multicast group. The Ethernet address range selected in this case spans the entire block of IP multicast addresses allocated to the IANA.

The mrouted(8) daemon issues a SIOCADDMULTI request with INADDR_ANY when it begins routing packets for a multicast interface.

ETHER_MAP_IP_MULTICAST maps any other specific IP multicast group to the appropriate Ethernet multicast address. Requests for other address families are rejected with an EAFNOSUPPORT error.

While the Ethernet multicast list supports address ranges, there is no way for a process or the kernel to request a specific range, other than to enumerate the addresses, since addrlo and addrhi are always set to the same address.
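
The mapping applied to a specific group copies the low-order 23 bits of the class D address into the fixed prefix 01:00:5e. The function below is a sketch of that mapping, not the ETHER_MAP_IP_MULTICAST macro itself: the names map_ip_multicast and is_ether_multicast are hypothetical, and the group is taken in host byte order for readability.

```c
#include <stdint.h>

/*
 * Map a class D address (host byte order) to its Ethernet multicast
 * address.  Only the low-order 23 bits of the group are carried over,
 * so 32 different groups share each Ethernet address.
 */
void
map_ip_multicast(uint32_t group, unsigned char enaddr[6])
{
    enaddr[0] = 0x01;
    enaddr[1] = 0x00;
    enaddr[2] = 0x5e;
    enaddr[3] = (group >> 16) & 0x7f;   /* high-order bit forced to 0 */
    enaddr[4] = (group >> 8) & 0xff;
    enaddr[5] = group & 0xff;
}

/* Nonzero if the multicast bit of the first byte is set. */
int
is_ether_multicast(const unsigned char enaddr[6])
{
    return ((enaddr[0] & 0x01) != 0);
}
```

Under this mapping 224.0.0.1 becomes 01:00:5e:00:00:01, and so does 224.128.0.1, since the two groups differ only in a bit that is discarded. The highest group, 239.255.255.255, maps to ether_ipmulticast_max.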

The second half of ether_addmulti, shown in Figure 12.33, verifies the address range and adds it to the list if it is new.

Figure 12.33. ether_addmulti function: second half.

---------------------------------------------------------------------- if_ethersubr.c
400     /*
401      * Verify that we have valid Ethernet multicast addresses.
402      */
403     if ((addrlo[0] & 0x01) != 1 || (addrhi[0] & 0x01) != 1) {
404         splx(s);
405         return (EINVAL);
406     }
407     /*
408      * See if the address range is already in the list.
409      */
410     ETHER_LOOKUP_MULTI(addrlo, addrhi, ac, enm);
411     if (enm != NULL) {
412         /*
413          * Found it; just increment the reference count.
414          */
415         ++enm->enm_refcount;
416         splx(s);
417         return (0);
418     }
419     /*
420      * New address or range; malloc a new multicast record
421      * and link it into the interface's multicast list.
422      */
423     enm = (struct ether_multi *) malloc(sizeof(*enm), M_IFMADDR, M_NOWAIT);
424     if (enm == NULL) {
425         splx(s);
426         return (ENOBUFS);
427     }
428     bcopy(addrlo, enm->enm_addrlo, 6);
429     bcopy(addrhi, enm->enm_addrhi, 6);
430     enm->enm_ac = ac;
431     enm->enm_refcount = 1;
432     enm->enm_next = ac->ac_multiaddrs;
433     ac->ac_multiaddrs = enm;
434     ac->ac_multicnt++;
435     splx(s);
436     /*
437      * Return ENETRESET to inform the driver that the list has changed
438      * and its reception filter should be adjusted accordingly.
439      */
440     return (ENETRESET);
441 }
---------------------------------------------------------------------- if_ethersubr.c

Already receiving

400-418

ether_addmulti checks the multicast bit (Figure 4.12) of the high and low addresses to ensure that they are indeed Ethernet multicast addresses. ETHER_LOOKUP_MULTI (Figure 12.9) determines if the hardware is already listening for the specified multicast addresses. If so, the reference count (enm_refcount) in the matching ether_multi structure is incremented and ether_addmulti returns 0.

Update ether_multi list

419-441

If this is a new address range, a new ether_multi structure is allocated, initialized, and linked to the ac_multiaddrs list in the interface’s arpcom structure (Figure 12.8). If ENETRESET is returned by ether_addmulti, the device driver that called the function knows that the multicast list has changed and the hardware reception filter must be updated.
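
The ENETRESET convention splits the work between a generic list routine and the driver: the list routine reports whether the set of addresses changed, and the driver decides whether to reprogram the hardware. The sketch below models this with hypothetical names (struct filter, range_add, driver_addmulti) and a fixed-size array in place of the kernel's linked list. The reloads counter stands in for the call to the hardware reset routine.

```c
#include <errno.h>
#include <string.h>

#define MAX_RANGES 8            /* hypothetical fixed capacity */

struct range {                  /* reduced model of ether_multi */
    unsigned char lo[6], hi[6];
    unsigned refcount;
};

struct filter {
    struct range r[MAX_RANGES];
    int     n;                  /* ranges in use */
    int     reloads;            /* times the "hardware" was reprogrammed */
};

/* Returns 0 if the range was already present, ENETRESET if it was added. */
static int
range_add(struct filter *f, const unsigned char lo[6],
          const unsigned char hi[6])
{
    int     i;

    /* Both ends must have the Ethernet multicast bit set. */
    if ((lo[0] & 0x01) != 1 || (hi[0] & 0x01) != 1)
        return (EINVAL);
    for (i = 0; i < f->n; ++i) {
        if (memcmp(f->r[i].lo, lo, 6) == 0 &&
            memcmp(f->r[i].hi, hi, 6) == 0) {
            ++f->r[i].refcount;
            return (0);
        }
    }
    if (f->n == MAX_RANGES)
        return (ENOBUFS);
    memcpy(f->r[f->n].lo, lo, 6);
    memcpy(f->r[f->n].hi, hi, 6);
    f->r[f->n].refcount = 1;
    ++f->n;
    return (ENETRESET);
}

/* Driver side: reprogram the filter only when the list changed. */
int
driver_addmulti(struct filter *f, const unsigned char lo[6],
                const unsigned char hi[6])
{
    int     error = range_add(f, lo, hi);

    if (error == ENETRESET) {
        ++f->reloads;           /* a real driver would reset the hardware */
        error = 0;
    }
    return (error);
}
```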

Figure 12.34 shows the relationships between the ip_moptions, in_multi, and ether_multi structures after the LANCE Ethernet interface has joined the all-hosts group.

Figure 12.34. Overview of multicast data structures.

Leaving an IP Multicast Group

In general, the steps required to leave a group are the reverse of those required to join a group. The membership list in the ip_moptions structure is updated, the in_multi list for the IP interface is updated, and the ether_multi list for the device is updated. First, we return to ip_setmoptions and the IP_DROP_MEMBERSHIP case, which we show in Figure 12.35.

Figure 12.35. ip_setmoptions function: leaving a multicast group.

---------------------------------------------------------------------- ip_output.c
804     case IP_DROP_MEMBERSHIP:
805         /*
806          * Drop a multicast group membership.
807          * Group must be a valid IP multicast address.
808          */
809         if (m == NULL || m->m_len != sizeof(struct ip_mreq)) {
810             error = EINVAL;
811             break;
812         }
813         mreq = mtod(m, struct ip_mreq *);
814         if (!IN_MULTICAST(ntohl(mreq->imr_multiaddr.s_addr))) {
815             error = EINVAL;
816             break;
817         }
818         /*
819          * If an interface address was specified, get a pointer
820          * to its ifnet structure.
821          */
822         if (mreq->imr_interface.s_addr == INADDR_ANY)
823             ifp = NULL;
824         else {
825             INADDR_TO_IFP(mreq->imr_interface, ifp);
826             if (ifp == NULL) {
827                 error = EADDRNOTAVAIL;
828                 break;
829             }
830         }
831         /*
832          * Find the membership in the membership array.
833          */
834         for (i = 0; i < imo->imo_num_memberships; ++i) {
835             if ((ifp == NULL ||
836                  imo->imo_membership[i]->inm_ifp == ifp) &&
837                 imo->imo_membership[i]->inm_addr.s_addr ==
838                 mreq->imr_multiaddr.s_addr)
839                 break;
840         }
841         if (i == imo->imo_num_memberships) {
842             error = EADDRNOTAVAIL;
843             break;
844         }
845         /*
846          * Give up the multicast address record to which the
847          * membership points.
848          */
849         in_delmulti(imo->imo_membership[i]);
850         /*
851          * Remove the gap in the membership array.
852          */
853         for (++i; i < imo->imo_num_memberships; ++i)
854             imo->imo_membership[i - 1] = imo->imo_membership[i];
855         --imo->imo_num_memberships;
856         break;
---------------------------------------------------------------------- ip_output.c

Validation

804-830

The mbuf must contain an ip_mreq structure, imr_multiaddr within that structure must be a multicast group, and, unless imr_interface is INADDR_ANY, there must be an interface associated with the unicast address in imr_interface. If these conditions aren’t met, EINVAL or EADDRNOTAVAIL is posted and processing continues at the end of the switch.
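
The class D test on line 814 can be sketched in user space as follows. This is a minimal illustration rather than the kernel's macro: is_class_d is a hypothetical helper name, and it assumes the address has already been converted to host byte order, as the ntohl call in the listing does.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero if a host-byte-order IPv4
 * address is class D (224.0.0.0 through 239.255.255.255), that is,
 * if its high-order 4 bits are 1110.
 */
int
is_class_d(uint32_t host_order_addr)
{
    return (host_order_addr & 0xf0000000u) == 0xe0000000u;
}
```

An address that fails this test corresponds to the EINVAL case in the listing.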

Delete membership references

831-856

The for loop searches the group membership list for an in_multi structure with the requested {interface, group} pair. If a match isn’t found, EADDRNOTAVAIL is posted. Otherwise, in_delmulti updates the in_multi list and the second for loop removes the unused entry in the membership array by shifting subsequent entries to fill the gap. The size of the array is updated accordingly.
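
The search-and-compact logic can be tried outside the kernel with a sketch like the one below. The types membership and moptions and the function drop_membership are hypothetical stand-ins for in_multi, ip_moptions, and the IP_DROP_MEMBERSHIP case. A negative ifindex is our assumption for "any interface" and plays the role of INADDR_ANY.

```c
#define MAX_MEMBERSHIPS 20          /* mirrors IP_MAX_MEMBERSHIPS */

struct membership {                 /* hypothetical stand-in for in_multi */
    unsigned long group;            /* multicast group address */
    int           ifindex;          /* interface that joined the group */
};

struct moptions {                   /* hypothetical stand-in for ip_moptions */
    struct membership *memb[MAX_MEMBERSHIPS];
    int                num;         /* entries in use */
};

/*
 * Remove the first entry matching {ifindex, group}. A negative ifindex
 * matches any interface. Returns 0 on success or -1 if there is no such
 * membership (the kernel posts EADDRNOTAVAIL in that case).
 */
int
drop_membership(struct moptions *mo, int ifindex, unsigned long group)
{
    int i;

    for (i = 0; i < mo->num; ++i) {
        if ((ifindex < 0 || mo->memb[i]->ifindex == ifindex) &&
            mo->memb[i]->group == group)
            break;
    }
    if (i == mo->num)
        return -1;
    /* The kernel calls in_delmulti on the matching entry at this point. */
    for (++i; i < mo->num; ++i)     /* shift later entries down by one */
        mo->memb[i - 1] = mo->memb[i];
    --mo->num;
    return 0;
}
```

Because the array is kept dense, the order of the remaining memberships is preserved and the search loop never has to skip empty slots.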

in_delmulti Function

Since many processes may be receiving multicast datagrams, calling in_delmulti (Figure 12.36) results only in leaving the specified group when there are no more references to the in_multi structure.

Figure 12.36. in_delmulti function.

---------------------------------------------------------------------- in.c
534 int
535 in_delmulti(inm)
536 struct in_multi *inm;
537 {
538     struct in_multi **p;
539     struct ifreq ifr;
540     int     s = splnet();

541     if (--inm->inm_refcount == 0) {
542         /*
543          * No remaining claims to this record; let IGMP know that
544          * we are leaving the multicast group.
545          */
546         igmp_leavegroup(inm);
547         /*
548          * Unlink from list.
549          */
550         for (p = &inm->inm_ia->ia_multiaddrs;
551              *p != inm;
552              p = &(*p)->inm_next)
553             continue;
554         *p = (*p)->inm_next;
555         /*
556          * Notify the network driver to update its multicast reception
557          * filter.
558          */
559         ((struct sockaddr_in *) &(ifr.ifr_addr))->sin_family = AF_INET;
560         ((struct sockaddr_in *) &(ifr.ifr_addr))->sin_addr =
561             inm->inm_addr;
562         (*inm->inm_ifp->if_ioctl) (inm->inm_ifp, SIOCDELMULTI,
563                                    (caddr_t) & ifr);
564         free(inm, M_IPMADDR);
565     }
566     splx(s);
567 }
---------------------------------------------------------------------- in.c

Update in_multi structure

534-567

in_delmulti starts by decrementing the reference count of the in_multi structure and returning if the reference count is nonzero. If the reference count drops to 0, there are no longer any processes waiting for the multicast datagrams on the specified {interface, group} pair. igmp_leavegroup is called, but as we’ll see in Section 13.8, the function does nothing.

The for loop traverses the linked list of in_multi structures until it locates the matching structure.

The body of this for loop consists of the single continue statement. All the work is done by the expressions at the top of the loop. The continue is not required but stands out more clearly than a bare semicolon.

The ETHER_LOOKUP_MULTI macro in Figure 12.9 does not use the continue and the bare semicolon is almost undetectable.

After the loop, the matching in_multi structure is unlinked and in_delmulti issues the SIOCDELMULTI request to the interface so that any device-specific data structures can be updated. For Ethernet interfaces, this means the ether_multi list is updated. Finally, the in_multi structure is released.
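
The reference counting and the pointer-to-pointer unlink can be sketched in user space as shown below. The names group_rec, group_acquire, and group_release are hypothetical. group_acquire is included only so the example is complete; it loosely mirrors what joining a group does to the list and is not taken from the kernel source.

```c
#include <stdlib.h>

struct group_rec {                  /* hypothetical stand-in for in_multi */
    unsigned long     addr;
    unsigned int      refcount;
    struct group_rec *next;
};

/* Find addr on the list and add a reference, or push a new record. */
struct group_rec *
group_acquire(struct group_rec **head, unsigned long addr)
{
    struct group_rec *r;

    for (r = *head; r != NULL; r = r->next)
        if (r->addr == addr) {
            ++r->refcount;
            return r;
        }
    r = malloc(sizeof(*r));
    if (r == NULL)
        return NULL;
    r->addr = addr;
    r->refcount = 1;
    r->next = *head;
    *head = r;
    return r;
}

/*
 * Drop one reference. When the last reference goes away, unlink rec
 * from the list at *head and free it. Returns 1 if rec was freed and
 * 0 if it is still in use. rec must be on the list.
 */
int
group_release(struct group_rec **head, struct group_rec *rec)
{
    struct group_rec **p;

    if (--rec->refcount != 0)
        return 0;
    for (p = head; *p != rec; p = &(*p)->next)
        continue;                   /* all the work is in the loop header */
    *p = rec->next;
    free(rec);
    return 1;
}
```

Walking a pointer to the link field, rather than a pointer to the node, lets the same assignment unlink the first node and any later node without a special case for the head of the list.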

The SIOCDELMULTI case for the LANCE driver was included in Figure 12.31 where we also discussed the SIOCADDMULTI case.

ether_delmulti Function

When IP releases an in_multi structure associated with an Ethernet device, the device may be able to release the matching ether_multi structure. We say may because IP may be unaware of other software listening for IP multicasts. When the reference count for the ether_multi structure drops to 0, it can be released. Figure 12.37 shows the ether_delmulti function.

Figure 12.37. ether_delmulti function.

---------------------------------------------------------------------- if_ethersubr.c
445 int
446 ether_delmulti(ifr, ac)
447 struct ifreq *ifr;
448 struct arpcom *ac;
449 {
450     struct ether_multi *enm;
451     struct ether_multi **p;
452     struct sockaddr_in *sin;
453     u_char  addrlo[6];
454     u_char  addrhi[6];
455     int     s = splimp();

456     switch (ifr->ifr_addr.sa_family) {

457     case AF_UNSPEC:
458         bcopy(ifr->ifr_addr.sa_data, addrlo, 6);
459         bcopy(addrlo, addrhi, 6);
460         break;

461     case AF_INET:
462         sin = (struct sockaddr_in *) &(ifr->ifr_addr);
463         if (sin->sin_addr.s_addr == INADDR_ANY) {
464             /*
465              * An IP address of INADDR_ANY means stop listening
466              * to the range of Ethernet multicast addresses used
467              * for IP.
468              */
469             bcopy(ether_ipmulticast_min, addrlo, 6);
470             bcopy(ether_ipmulticast_max, addrhi, 6);
471         } else {
472             ETHER_MAP_IP_MULTICAST(&sin->sin_addr, addrlo);
473             bcopy(addrlo, addrhi, 6);
474         }
475         break;

476     default:
477         splx(s);
478         return (EAFNOSUPPORT);
479     }

480     /*
481      * Look up the address in our list.
482      */
483     ETHER_LOOKUP_MULTI(addrlo, addrhi, ac, enm);
484     if (enm == NULL) {
485         splx(s);
486         return (ENXIO);
487     }
488     if (--enm->enm_refcount != 0) {
489         /*
490          * Still some claims to this record.
491          */
492         splx(s);
493         return (0);
494     }
495     /*
496      * No remaining claims to this record; unlink and free it.
497      */
498     for (p = &enm->enm_ac->ac_multiaddrs;
499          *p != enm;
500          p = &(*p)->enm_next)
501         continue;
502     *p = (*p)->enm_next;
503     free(enm, M_IFMADDR);
504     ac->ac_multicnt--;
505     splx(s);
506     /*
507      * Return ENETRESET to inform the driver that the list has changed
508      * and its reception filter should be adjusted accordingly.
509      */
510     return (ENETRESET);
511 }
---------------------------------------------------------------------- if_ethersubr.c

445-479

ether_delmulti initializes the addrlo and addrhi arrays in the same way as ether_addmulti does.

Locate ether_multi structure

480-494

ETHER_LOOKUP_MULTI locates a matching ether_multi structure. If it isn’t found, ENXIO is returned. If the matching structure is found, the reference count is decremented and if the result is nonzero, ether_delmulti returns immediately. In this case, the structure cannot be released because another protocol has elected to receive the same multicast packets.

Delete ether_multi structure

495-511

The for loop searches the ether_multi list for the matching address range. The matching structure is unlinked from the list and released. Finally, the size of the list is updated and ENETRESET is returned so that the device driver can update its hardware reception filter.
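
The three possible outcomes (ENXIO, 0, and ENETRESET) and a driver's reaction to them can be sketched as follows. Everything here is a hypothetical user-space model: emulti, dev, emulti_del, and dev_delmulti are invented names, and filter_reloads merely counts how often a real driver would reprogram its hardware filter. We also assume that the driver reports success to its caller after handling ENETRESET.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct emulti {                     /* hypothetical stand-in for ether_multi */
    unsigned char  lo[6], hi[6];    /* address range */
    unsigned int   refcount;
    struct emulti *next;
};

struct dev {                        /* hypothetical per-device state */
    struct emulti *multiaddrs;
    int            multicnt;
    int            filter_reloads;  /* times the filter was "reprogrammed" */
};

/* Returns ENXIO, 0, or ENETRESET, following the description above. */
int
emulti_del(struct dev *d, const unsigned char lo[6], const unsigned char hi[6])
{
    struct emulti **p, *dead;

    for (p = &d->multiaddrs; *p != NULL; p = &(*p)->next)
        if (memcmp((*p)->lo, lo, 6) == 0 && memcmp((*p)->hi, hi, 6) == 0)
            break;
    if (*p == NULL)
        return ENXIO;               /* no such range on this device */
    if (--(*p)->refcount != 0)
        return 0;                   /* still claimed by someone else */
    dead = *p;
    *p = dead->next;
    free(dead);
    d->multicnt--;
    return ENETRESET;               /* list changed: filter is stale */
}

/* Driver side: ENETRESET is a request to reload the filter, not a failure. */
int
dev_delmulti(struct dev *d, const unsigned char lo[6], const unsigned char hi[6])
{
    int error = emulti_del(d, lo, hi);

    if (error == ENETRESET) {
        d->filter_reloads++;        /* a real driver reprograms hardware here */
        error = 0;
    }
    return error;
}
```

Unlike Figure 12.37, this sketch finds and unlinks the entry in a single pass over the list. The kernel code performs the lookup with ETHER_LOOKUP_MULTI and then walks the list a second time to unlink the structure.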

ip_getmoptions Function

Fetching the current option settings is considerably easier than setting them. All the work is done by ip_getmoptions, shown in Figure 12.38.

Figure 12.38. ip_getmoptions function.

---------------------------------------------------------------------- ip_output.c
876 int
877 ip_getmoptions(optname, imo, mp)
878 int     optname;
879 struct ip_moptions *imo;
880 struct mbuf **mp;
881 {
882     u_char *ttl;
883     u_char *loop;
884     struct in_addr *addr;
885     struct in_ifaddr *ia;

886     *mp = m_get(M_WAIT, MT_SOOPTS);

887     switch (optname) {

888     case IP_MULTICAST_IF:
889         addr = mtod(*mp, struct in_addr *);
890         (*mp)->m_len = sizeof(struct in_addr);
891         if (imo == NULL || imo->imo_multicast_ifp == NULL)
892             addr->s_addr = INADDR_ANY;
893         else {
894             IFP_TO_IA(imo->imo_multicast_ifp, ia);
895             addr->s_addr = (ia == NULL) ? INADDR_ANY
896                 : IA_SIN(ia)->sin_addr.s_addr;
897         }
898         return (0);

899     case IP_MULTICAST_TTL:
900         ttl = mtod(*mp, u_char *);
901         (*mp)->m_len = 1;
902         *ttl = (imo == NULL) ? IP_DEFAULT_MULTICAST_TTL
903             : imo->imo_multicast_ttl;
904         return (0);

905     case IP_MULTICAST_LOOP:
906         loop = mtod(*mp, u_char *);
907         (*mp)->m_len = 1;
908         *loop = (imo == NULL) ? IP_DEFAULT_MULTICAST_LOOP
909             : imo->imo_multicast_loop;
910         return (0);

911     default:
912         return (EOPNOTSUPP);
913     }
914 }
---------------------------------------------------------------------- ip_output.c

Copy the option data and return

876-914

The three arguments to ip_getmoptions are: optname, the option to fetch; imo, the ip_moptions structure; and mp, which points to a pointer to an mbuf. m_get allocates an mbuf to hold the option data. For each of the three options, a pointer (addr, ttl, and loop, respectively) is initialized to the data area of the mbuf and the length of the mbuf is set to the length of the option data.

For IP_MULTICAST_IF, the unicast address found by IFP_TO_IA is returned or INADDR_ANY is returned if no explicit multicast interface has been selected.

For IP_MULTICAST_TTL, imo_multicast_ttl is returned or if an explicit multicast TTL has not been selected, 1 (IP_DEFAULT_MULTICAST_TTL) is returned.

For IP_MULTICAST_LOOP, imo_multicast_loop is returned or if an explicit multicast loopback policy has not been selected, 1 (IP_DEFAULT_MULTICAST_LOOP) is returned.

Finally, EOPNOTSUPP is returned if the option isn’t recognized.
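
The fallback to default values when no ip_moptions structure exists can be sketched as shown below. The option codes, the moptions structure, and get_moption are hypothetical names chosen for this illustration. Only the default values of 1 and the EOPNOTSUPP error come from the discussion above.

```c
#include <errno.h>
#include <stddef.h>

enum { OPT_MCAST_TTL = 1, OPT_MCAST_LOOP = 2 };   /* hypothetical codes */

#define DEFAULT_MCAST_TTL   1       /* mirrors IP_DEFAULT_MULTICAST_TTL */
#define DEFAULT_MCAST_LOOP  1       /* mirrors IP_DEFAULT_MULTICAST_LOOP */

struct moptions {                   /* hypothetical stand-in for ip_moptions */
    unsigned char ttl;
    unsigned char loop;
};

/*
 * Store the requested option in *value. A null imo means the process
 * never set any multicast option, so the default applies. Returns 0 or
 * EOPNOTSUPP; *value is untouched on error.
 */
int
get_moption(int optname, const struct moptions *imo, unsigned char *value)
{
    switch (optname) {
    case OPT_MCAST_TTL:
        *value = (imo == NULL) ? DEFAULT_MCAST_TTL : imo->ttl;
        return 0;
    case OPT_MCAST_LOOP:
        *value = (imo == NULL) ? DEFAULT_MCAST_LOOP : imo->loop;
        return 0;
    default:
        return EOPNOTSUPP;
    }
}
```

The sketch writes into storage supplied by the caller instead of allocating an mbuf before the switch, as Figure 12.38 does, so the unrecognized-option path has nothing to release.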

Multicast Input Processing: ipintr Function

Now that we have described multicast addressing, group memberships, and the various data structures associated with IP and Ethernet multicasting, we can move on to multicast datagram processing.

In Figure 4.13 we saw that an incoming Ethernet multicast packet is detected by ether_input, which sets the M_MCAST flag in the mbuf header before placing an IP packet on the IP input queue (ipintrq). The ipintr function processes each packet in turn. The multicast processing code we omitted from the discussion of ipintr appears in Figure 12.39.

Figure 12.39. ipintr function: multicast input processing.

---------------------------------------------------------------------- ip_input.c
214     if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr))) {
215         struct in_multi *inm;
216         extern struct socket *ip_mrouter;

217         if (ip_mrouter) {
218             /*
219              * If we are acting as a multicast router, all
220              * incoming multicast packets are passed to the
221              * kernel-level multicast forwarding function.
222              * The packet is returned (relatively) intact; if
223              * ip_mforward() returns a non-zero value, the packet
224              * must be discarded, else it may be accepted below.
225              *
226              * (The IP ident field is put in the same byte order
227              * as expected when ip_mforward() is called from
228              * ip_output().)
229              */
230             ip->ip_id = htons(ip->ip_id);
231             if (ip_mforward(m, m->m_pkthdr.rcvif) != 0) {
232                 ipstat.ips_cantforward++;
233                 m_freem(m);
234                 goto next;
235             }
236             ip->ip_id = ntohs(ip->ip_id);

237             /*
238              * The process-level routing demon needs to receive
239              * all multicast IGMP packets, whether or not this
240              * host belongs to their destination groups.
241              */
242             if (ip->ip_p == IPPROTO_IGMP)
243                 goto ours;
244             ipstat.ips_forward++;
245         }
246         /*
247          * See if we belong to the destination multicast group on the
248          * arrival interface.
249          */
250         IN_LOOKUP_MULTI(ip->ip_dst, m->m_pkthdr.rcvif, inm);
251         if (inm == NULL) {
252             ipstat.ips_cantforward++;
253             m_freem(m);
254             goto next;
255         }
256         goto ours;
257     }
---------------------------------------------------------------------- ip_input.c

The code is from the section of ipintr that determines if a packet is addressed to the local system or if it should be forwarded. At this point, the packet has been checked for errors and any options have been processed. ip points to the IP header within the packet.

Forward packets if configured as multicast router

214-245

This entire section of code is skipped if the destination address is not an IP multicast group. If the address is a multicast group and the system is configured as an IP multicast router (ip_mrouter), ip_id is converted to network byte order (the form that ip_mforward expects), and the packet is passed to ip_mforward. If ip_mforward returns a nonzero value, an error was detected or the packet arrived through a multicast tunnel. The packet is discarded and ips_cantforward is incremented.

We describe multicast tunnels in Chapter 14. They transport multicast packets between multicast routers separated by standard IP routers. Packets that arrive through a tunnel must be processed by ip_mforward and not ipintr.

If ip_mforward returns 0, ip_id is converted back to host byte order and ipintr may continue processing the packet.

If ip points to an IGMP packet, it is accepted and execution continues at ours (ipintr, Figure 10.11). A multicast router must accept all IGMP packets irrespective of their individual destination groups or of the group memberships of the incoming interface. The IGMP packets contain announcements of membership changes.

246-257

The remaining code in Figure 12.39 is executed whether or not the system is configured as a multicast router. IN_LOOKUP_MULTI searches the list of multicast groups that the interface has joined. If a match is not found, the packet is discarded. This occurs when the hardware filter accepts unwanted packets or when a group associated with the interface and the destination group of the packet map to the same Ethernet multicast address.

If the packet is accepted, execution continues at the label ours in ipintr (Figure 10.11).
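
The accept-or-discard decision for an incoming multicast datagram can be condensed into a small pure function. This is a sketch of the control flow only: mcast_input_verdict and its arguments are hypothetical, and the forwarding result is passed in as a value instead of being produced by a call to ip_mforward.

```c
enum verdict { DISCARD, DELIVER };

/*
 * is_mrouter   nonzero if the system is a multicast router
 * mforward_rc  result of the forwarding step (consulted only on a router)
 * is_igmp      nonzero if the datagram carries IGMP
 * is_member    nonzero if the arrival interface has joined the group
 */
enum verdict
mcast_input_verdict(int is_mrouter, int mforward_rc, int is_igmp, int is_member)
{
    if (is_mrouter) {
        if (mforward_rc != 0)
            return DISCARD;         /* error, or arrived through a tunnel */
        if (is_igmp)
            return DELIVER;         /* routers accept every IGMP packet */
    }
    return is_member ? DELIVER : DISCARD;
}
```

Note that the IGMP shortcut applies only on a multicast router. On an ordinary host an IGMP packet is delivered only if the arrival interface belongs to the destination group.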

Multicast Output Processing: ip_output Function

When we discussed ip_output in Chapter 8, we postponed discussion of the imo argument to ip_output and the multicast processing code. In ip_output, if imo points to an ip_moptions structure, it overrides the default multicast output processing. The omitted code from ip_output appears in Figures 12.40 and 12.41. ip points to the outgoing packet, m points to the mbuf holding the packet, and ifp points to the interface selected by the routing tables for the destination group.

Figure 12.40. ip_output function: defaults and source address.

---------------------------------------------------------------------- ip_output.c
129     if (IN_MULTICAST(ntohl(ip->ip_dst.s_addr))) {
130         struct in_multi *inm;
131         extern struct ifnet loif;

132         m->m_flags |= M_MCAST;
133         /*
134          * IP destination address is multicast.  Make sure "dst"
135          * still points to the address in "ro".  (It may have been
136          * changed to point to a gateway address, above.)
137          */
138         dst = (struct sockaddr_in *) &ro->ro_dst;
139         /*
140          * See if the caller provided any multicast options
141          */
142         if (imo != NULL) {
143             ip->ip_ttl = imo->imo_multicast_ttl;
144             if (imo->imo_multicast_ifp != NULL)
145                 ifp = imo->imo_multicast_ifp;
146         } else
147             ip->ip_ttl = IP_DEFAULT_MULTICAST_TTL;
148         /*
149          * Confirm that the outgoing interface supports multicast.
150          */
151         if ((ifp->if_flags & IFF_MULTICAST) == 0) {
152             ipstat.ips_noroute++;
153             error = ENETUNREACH;
154             goto bad;
155         }
156         /*
157          * If source address not specified yet, use address
158          * of outgoing interface.
159          */
160         if (ip->ip_src.s_addr == INADDR_ANY) {
161             struct in_ifaddr *ia;

162             for (ia = in_ifaddr; ia; ia = ia->ia_next)
163                 if (ia->ia_ifp == ifp) {
164                     ip->ip_src = IA_SIN(ia)->sin_addr;
165                     break;
166                 }
167         }
---------------------------------------------------------------------- ip_output.c

Figure 12.41. ip_output function: loopback, forward, and send.

---------------------------------------------------------------------- ip_output.c
168         IN_LOOKUP_MULTI(ip->ip_dst, ifp, inm);
169         if (inm != NULL &&
170             (imo == NULL || imo->imo_multicast_loop)) {
171             /*
172              * If we belong to the destination multicast group
173              * on the outgoing interface, and the caller did not
174              * forbid loopback, loop back a copy.
175              */
176             ip_mloopback(ifp, m, dst);
177         } else {
178             /*
179              * If we are acting as a multicast router, perform
180              * multicast forwarding as if the packet had just
181              * arrived on the interface to which we are about
182              * to send.  The multicast forwarding function
183              * recursively calls this function, using the
184              * IP_FORWARDING flag to prevent infinite recursion.
185              *
186              * Multicasts that are looped back by ip_mloopback(),
187              * above, will be forwarded by the ip_input() routine,
188              * if necessary.
189              */
190             extern struct socket *ip_mrouter;
191             if (ip_mrouter && (flags & IP_FORWARDING) == 0) {
192                 if (ip_mforward(m, ifp) != 0) {
193                     m_freem(m);
194                     goto done;
195                 }
196             }
197         }
198         /*
199          * Multicasts with a time-to-live of zero may be looped-
200          * back, above, but must not be transmitted on a network.
201          * Also, multicasts addressed to the loopback interface
202          * are not sent -- the above call to ip_mloopback() will
203          * loop back a copy if this host actually belongs to the
204          * destination group on the loopback interface.
205          */
206         if (ip->ip_ttl == 0 || ifp == &loif) {
207             m_freem(m);
208             goto done;
209         }
210         goto sendit;
211     }
---------------------------------------------------------------------- ip_output.c

Establish defaults

129-155

The code in Figure 12.40 is executed only if the packet is destined for a multicast group. If so, ip_output sets M_MCAST in the mbuf and dst is reset to the final destination as it may have been set to the next-hop router earlier in ip_output (Figure 8.24).

If an ip_moptions structure was passed, ip_ttl and ifp are changed accordingly. Otherwise, ip_ttl is set to 1 (IP_DEFAULT_MULTICAST_TTL), which prevents the multicast from escaping to a remote network. The interface selected by consulting the routing tables or the interface specified within the ip_moptions structure must support multicasting. If it does not, ip_output discards the packet and returns ENETUNREACH.
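
The selection of the TTL and the outgoing interface can be sketched as follows. The iface and moptions structures, the IFACE_MULTICAST bit, and mcast_out_defaults are hypothetical stand-ins for ifnet, ip_moptions, IFF_MULTICAST, and the code on lines 142-155.

```c
#include <errno.h>
#include <stddef.h>

#define IFACE_MULTICAST 0x1         /* hypothetical; stands in for IFF_MULTICAST */

struct iface {                      /* hypothetical stand-in for ifnet */
    int flags;
};

struct moptions {                   /* hypothetical stand-in for ip_moptions */
    struct iface *ifp;              /* explicit interface, or NULL */
    unsigned char ttl;
};

/*
 * routed is the interface chosen by the routing tables. On success the
 * interface to use is stored in *ifp_out. The TTL is stored in *ttl_out
 * before the interface is checked, as in the kernel code.
 */
int
mcast_out_defaults(const struct moptions *imo, struct iface *routed,
                   struct iface **ifp_out, unsigned char *ttl_out)
{
    struct iface *ifp = routed;

    if (imo != NULL) {
        *ttl_out = imo->ttl;
        if (imo->ifp != NULL)
            ifp = imo->ifp;         /* explicit choice overrides the route */
    } else
        *ttl_out = 1;               /* default keeps the datagram local */
    if ((ifp->flags & IFACE_MULTICAST) == 0)
        return ENETUNREACH;
    *ifp_out = ifp;
    return 0;
}
```

An options structure that sets only the TTL leaves the routed interface in place, so the multicast check then applies to the interface chosen by the routing tables.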

Select source address

156-167

If the source address is unspecified, the for loop finds the Internet unicast address associated with the outgoing interface and fills in ip_src in the IP header.

Unlike a unicast packet, an outgoing multicast packet may be transmitted on more than one interface if the system is configured as a multicast router. Even if the system is not a multicast router, the outgoing interface may be a member of the destination group and may need to receive the packet. Finally, we need to consider the multicast loopback policy and the loopback interface itself. Taking all this into account, there are three questions to consider:

  • Should the packet be received on the outgoing interface?

  • Should the packet be forwarded to other interfaces?

  • Should the packet be transmitted on the outgoing interface?

Figure 12.41 shows the code from ip_output that answers these questions.

Loopback or not?

168-176

If IN_LOOKUP_MULTI determines that the outgoing interface is a member of the destination group and loopback has not been disabled (imo is null or imo_multicast_loop is nonzero), the packet is queued for input on the output interface by ip_mloopback. In this case, the original packet is not considered for forwarding, since the copy is forwarded during input processing if necessary.

Forward or not?

178-197

If the packet is not looped back, but the system is configured as a multicast router and the packet is eligible for forwarding, ip_mforward distributes copies to other multicast interfaces. If ip_mforward does not return 0, ip_output discards the packet and does not attempt to transmit it. This indicates an error with the packet.

To prevent infinite recursion between ip_mforward and ip_output, ip_mforward always turns on IP_FORWARDING before calling ip_output. A datagram originating on the system is eligible for forwarding because the transport protocols do not turn on IP_FORWARDING.

Transmit or not?

198-209

Packets with a TTL of 0 may be looped back, but they are never forwarded (ip_mforward discards them) and are never transmitted. If the TTL is 0 or if the output interface is the loopback interface, ip_output discards the packet since the TTL has expired or the packet has already been looped back by ip_mloopback.
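
The answers to the three questions can be summarized by a small decision function. This is a sketch under simplifying assumptions: out_plan and mcast_output_plan are hypothetical names, the caller passes loop_enabled as 1 when no ip_moptions structure exists, and forwarding is assumed to succeed (in the kernel, a nonzero return from ip_mforward discards the packet before the transmit test is reached).

```c
struct out_plan {
    int loopback;                   /* queue a copy for local input? */
    int forward;                    /* hand to the multicast forwarder? */
    int transmit;                   /* send on the outgoing interface? */
};

struct out_plan
mcast_output_plan(int is_member, int loop_enabled, int is_mrouter,
                  int already_forwarding, unsigned int ttl, int is_loopback_if)
{
    struct out_plan p = {0, 0, 0};

    if (is_member && loop_enabled)
        p.loopback = 1;             /* the copy is forwarded on input if needed */
    else if (is_mrouter && !already_forwarding)
        p.forward = 1;
    p.transmit = (ttl != 0 && !is_loopback_if);
    return p;
}
```

Loopback and forwarding are mutually exclusive in this function, which reflects the if and else branches of Figure 12.41. The transmit decision is independent of both.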

Send packet

210-211

If the packet has made it this far, it is ready to be physically transmitted on the output interface. The code at sendit (ip_output, Figure 8.25) may fragment the datagram before passing it (or the resulting fragments) to the interface’s if_output function. We’ll see in Section 21.10 that the Ethernet output function, ether_output, calls arpresolve, which calls ETHER_MAP_IP_MULTICAST to construct an Ethernet multicast destination address based on the IP multicast destination address.
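
The mapping from an IP multicast group to an Ethernet multicast address can be sketched in a few lines. map_ip_multicast is a hypothetical function written for illustration, not the ETHER_MAP_IP_MULTICAST macro itself, and it assumes the group address is supplied in host byte order.

```c
#include <stdint.h>

/*
 * Build the Ethernet multicast address for a class D group: the fixed
 * prefix 01:00:5e followed by the low-order 23 bits of the group.
 */
void
map_ip_multicast(uint32_t group, unsigned char mac[6])
{
    mac[0] = 0x01;
    mac[1] = 0x00;
    mac[2] = 0x5e;
    mac[3] = (group >> 16) & 0x7f;  /* high bit of this byte is always 0 */
    mac[4] = (group >> 8) & 0xff;
    mac[5] = group & 0xff;
}
```

Groups that differ only in the 5 discarded bits, such as 224.0.0.1 and 224.128.0.1, produce the same Ethernet address. This is one reason ipintr must check group membership again after the hardware has accepted a frame.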

ip_mloopback Function

ip_mloopback relies on looutput (Figure 5.27) to do its job. Instead of passing a pointer to the loopback interface to looutput, ip_mloopback passes a pointer to the output multicast interface. The ip_mloopback function is shown in Figure 12.42.

Figure 12.42. ip_mloopback function.

---------------------------------------------------------------------- ip_output.c
935 static void
936 ip_mloopback(ifp, m, dst)
937 struct ifnet *ifp;
938 struct mbuf *m;
939 struct sockaddr_in *dst;
940 {
941     struct ip *ip;
942     struct mbuf *copym;

943     copym = m_copy(m, 0, M_COPYALL);
944     if (copym != NULL) {
945         /*
946          * We don't bother to fragment if the IP length is greater
947          * than the interface's MTU.  Can this possibly matter?
948          */
949         ip = mtod(copym, struct ip *);
950         ip->ip_len = htons((u_short) ip->ip_len);
951         ip->ip_off = htons((u_short) ip->ip_off);
952         ip->ip_sum = 0;
953         ip->ip_sum = in_cksum(copym, ip->ip_hl << 2);
954         (void) looutput(ifp, copym, (struct sockaddr *) dst, NULL);
955     }
956 }
---------------------------------------------------------------------- ip_output.c

Duplicate and queue packet

929-956

Copying the packet isn’t enough; the packet must look as though it was received on the output interface, so ip_mloopback converts ip_len and ip_off to network byte order and computes the checksum for the packet. looutput takes care of putting the packet on the IP input queue.
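
The checksum step can be illustrated with a portable version of the Internet checksum applied to an IP header. header_cksum is a hypothetical helper, not the kernel's in_cksum: it works on a flat byte array rather than an mbuf chain, assumes an even length (an IP header is always a multiple of 4 bytes), and returns the result in host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * One's complement of the one's complement sum of the 16-bit words in
 * hdr, read in network (big-endian) order. The checksum field must be
 * zero when computing a new checksum. len must be even.
 */
uint16_t
header_cksum(const unsigned char *hdr, size_t len)
{
    uint32_t sum = 0;
    size_t   i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)hdr[i] << 8) | hdr[i + 1];
    while (sum >> 16)               /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}
```

Running the same function over a header whose checksum field has been filled in yields 0, which is the test the input side applies to the looped-back copy.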

Performance Considerations

The multicast implementation in Net/3 has several potential performance bottlenecks. Since many Ethernet cards do not support perfect filtering of multicast addresses, the operating system must be prepared to discard multicast packets that pass through the hardware filter. In the worst case, an Ethernet card may fall back to receiving all multicast packets, most of which must be discarded by ipintr when they are found not to contain a valid IP multicast group address.

IP uses a simple linear list and linear search to filter incoming IP datagrams. If the list grows to any appreciable length, a caching mechanism such as moving the most recently received address to the front of the list would help performance.
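
One way to realize the move-to-front idea is sketched below. This is not part of Net/3: group_rec and lookup_mtf are hypothetical, and the list is a simplified stand-in for the per-interface in_multi list.

```c
#include <stddef.h>

struct group_rec {                  /* hypothetical stand-in for in_multi */
    unsigned long     addr;
    struct group_rec *next;
};

/*
 * Linear search that moves a hit to the head of the list, so that a
 * group receiving a burst of datagrams is found on the first
 * comparison next time. Returns NULL if addr is not on the list.
 */
struct group_rec *
lookup_mtf(struct group_rec **head, unsigned long addr)
{
    struct group_rec **p, *hit;

    for (p = head; *p != NULL; p = &(*p)->next) {
        if ((*p)->addr == addr) {
            hit = *p;
            *p = hit->next;         /* unlink from current position */
            hit->next = *head;      /* relink at the front */
            *head = hit;
            return hit;
        }
    }
    return NULL;
}
```

The trade-off is that every successful lookup may now write to the list, so in a kernel the search would need the same protection against interrupts as the code that adds and deletes entries.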

Summary

In this chapter we described how a single host processes IP multicast datagrams. We looked at the format of an IP class D address and an Ethernet multicast address and the mapping between the two.

We discussed the in_multi and ether_multi structures, and we saw that each IP multicast interface maintains its own group membership list and that each Ethernet interface maintains a list of Ethernet multicast addresses.

During input processing, IP multicasts are accepted only if they arrive on an interface that is a member of their destination group, although they may be forwarded to other interfaces if the system is configured as a multicast router.

Systems configured as multicast routers must accept all multicast packets on every interface. This can be done quickly by issuing the SIOCADDMULTI command for the INADDR_ANY address.

The ip_moptions structure is the cornerstone of multicast output processing. It controls the selection of an output interface, the TTL field of the multicast datagram, and the loopback policy. It also holds references to the in_multi structures, which determine when an interface joins or leaves an IP multicast group.

We also discussed the two concepts implemented by the multicast TTL value: packet lifetime and packet scope.

Exercises

12.1

What is the difference between sending an IP broadcast packet to 255.255.255.255 and sending an IP multicast to the all-hosts group 224.0.0.1?

12.1

On an Ethernet, the IP broadcast address 255.255.255.255 translates to the Ethernet broadcast address ff:ff:ff:ff:ff:ff and is received by every Ethernet interface on the network. Systems that aren’t running IP software must actively receive and discard each of these broadcast packets.

A packet sent to the IP all-hosts multicast group 224.0.0.1 translates to the Ethernet multicast address 01:00:5e:00:00:01 and is received only by systems that have explicitly instructed their interfaces to receive IP multicast datagrams. Systems that aren’t running IP or that aren’t level-2 compliant never receive these datagrams, as they are discarded by the Ethernet interface hardware itself.

12.2

Why are interfaces identified by their IP unicast addresses in the multicasting code? What must be changed so that an interface could send and receive multicast datagrams but not have a unicast IP address?

12.2

One alternative would be to specify interfaces by their text name as with the ifreq structure and the ioctl commands for accessing interface information. ip_setmoptions and ip_getmoptions would have to call ifunit instead of INADDR_TO_IFP to locate the pointer to the interface’s ifnet structure.

12.3

In Section 12.3 we said that 32 IP groups are mapped to a single Ethernet address. Since 9 bits of a 32-bit address are not included in the mapping, why didn’t we say that 512 (2^9) IP groups mapped to a single Ethernet address?

12.3

The high-order 4 bits of a multicast group are always 1110, so only 5 significant bits are discarded by the mapping function.

12.4

Why do you think IP_MAX_MEMBERSHIPS is set to 20? Could it be set to a larger value? Hint: Consider the size of the ip_moptions structure (Figure 12.15).

12.4

The entire ip_moptions structure must fit within an mbuf, which limits the size of the structure to 108 bytes (remember the 20-byte mbuf header). IP_MAX_MEMBERSHIPS can be larger but must be less than or equal to 25. (4 + 1 + 1 + 2 + (4 × 25) = 108)

12.5

What happens when a multicast datagram is looped back by IP and is also received by the hardware interface on which it is transmitted (i.e., a nonsimplex interface)?

12.5

The datagram is duplicated and two copies appear on the IP input queue. A multicast application must be prepared to discard duplicate datagrams.

12.6

Draw a picture of a network with a multihomed host so that a multicast packet sent on one interface may be received on the other interface even if the host is not acting as a multicast router.

12.7

Trace the membership add request through the SLIP and loopback interfaces instead of the Ethernet interface.

12.8

How could a process request that the kernel join more than IP_MAX_MEMBERSHIPS groups?

12.8

The process could create a second socket and request another IP_MAX_MEMBERSHIPS through the second socket.

12.9

Computing the checksum on a looped back packet is superfluous. Design a method to avoid the checksum computation for loopback packets.

12.9

Define a new mbuf flag M_LOCAL for the m_flags member of the mbuf header. The flag can be set on loopback packets by ip_output instead of computing the checksum. ipintr can skip the checksum verification if the flag is on. SunOS 5.X has an option to do this (ip_local_cksum, page 531, Volume 1).

12.10

How many IP multicast groups could an interface join without reusing an Ethernet multicast address?

12.10

There are 2^23 − 1 (8,388,607) unique Ethernet IP multicast addresses. Remember that IP group 224.0.0.0 is reserved.

12.11

The careful reader might have noticed that in_delmulti assumes that the interface has defined an ioctl function when it issues the SIOCDELMULTI request. Why is this OK?

12.11

This assumption is correct since in_addmulti rejects all add requests if the interface does not have an ioctl function, and this implies that in_delmulti is never called if if_ioctl is null.

12.12

What happens to the mbuf allocated in ip_getmoptions if an unrecognized option is requested?

12.12

The mbuf is never released. It appears that ip_getmoptions contains a memory leak. ip_getmoptions is called from ip_ctloutput, which allows a call such as:

ip_getmoptions(IP_ADD_MEMBERSHIP, 0, mp)

which exercises the bug in ip_getmoptions.

12.13

Why is the group membership mechanism separate from the binding mechanism used to receive unicast and broadcast datagrams?
