Chapter 3. Interface Layer

Introduction

This chapter starts our discussion of Net/3 at the bottom of the protocol stack with the interface layer, which includes the hardware and software that send and receive packets on locally attached networks.

We use the term device driver to refer to the software that communicates with the hardware, and the term network interface (or just interface) to refer to the hardware and device driver for a particular network.

The Net/3 interface layer attempts to provide a hardware-independent programming interface between the network protocols and the drivers for the network devices connected to a system. The interface layer provides the following for all devices:

  • a well-defined set of interface functions,

  • a standard set of statistics and control flags,

  • a device-independent method of storing protocol addresses, and

  • a standard queueing method for outgoing packets.

There is no requirement that the interface layer provide reliable delivery of packets; only a best-effort service is required. Higher protocol layers must compensate for this lack of reliability.

This chapter describes the generic data structures maintained for all network interfaces. To illustrate the relevant data structures and algorithms, we refer to three particular network interfaces from Net/3:

  1. An AMD 7990 LANCE Ethernet interface: an example of a broadcast-capable local area network.

  2. A Serial Line IP (SLIP) interface: an example of a point-to-point network running over asynchronous serial lines.

  3. A loopback interface: a logical network that returns all outgoing packets as input packets.

Code Introduction

The generic interface structures and initialization code are found in three headers and two C files. The device-specific initialization code described in this chapter is found in three different C files. All eight files are listed in Figure 3.1.

Figure 3.1. Files discussed in this chapter.

File                 Description
sys/socket.h         address structure definitions
net/if.h             interface structure definitions
net/if_dl.h          link-level structure definitions
kern/init_main.c     system and interface initialization
net/if.c             generic interface code
net/if_loop.c        loopback device driver
net/if_sl.c          SLIP device driver
hp300/dev/if_le.c    LANCE Ethernet device driver

Global Variables

The global variables introduced in this chapter are described in Figure 3.2.

Figure 3.2. Global variables introduced in this chapter.

Variable       Data type            Description
pdevinit       struct pdevinit []   array of initialization parameters for pseudo-devices such as SLIP and loopback interfaces
ifnet          struct ifnet *       head of list of ifnet structures
ifnet_addrs    struct ifaddr **     array of pointers to link-level interface addresses
if_indexlim    int                  size of ifnet_addrs array
if_index       int                  index of the last configured interface
ifqmaxlen      int                  maximum size of interface output queues
hz             int                  the clock-tick frequency for this system (ticks/second)

SNMP Variables

The Net/3 kernel collects a wide variety of networking statistics. In most chapters we summarize the statistics and show how they relate to the standard TCP/IP information and statistics defined in the Simple Network Management Protocol Management Information Base (SNMP MIB-II). RFC 1213 [McCloghrie and Rose 1991] describes SNMP MIB-II, which is organized into the 10 distinct information groups shown in Figure 3.3.

Figure 3.3. SNMP groups in MIB-II.

SNMP Group            Description
System                general information about the system
Interfaces            network interface information
Address Translation   network-address-to-hardware-address-translation tables (deprecated)
IP                    IP protocol information
ICMP                  ICMP protocol information
TCP                   TCP protocol information
UDP                   UDP protocol information
EGP                   EGP protocol information
Transmission          media-specific information
SNMP                  SNMP protocol information

Net/3 does not include an SNMP agent. Instead, an SNMP agent for Net/3 is implemented as a process that accesses the kernel statistics in response to SNMP queries through the mechanism described in Section 2.2.

While most of the MIB-II variables are collected by Net/3 and may be accessed directly by an SNMP agent, others must be derived indirectly. MIB-II variables fall into three categories: (1) simple variables such as an integer value, a timestamp, or a byte string; (2) lists of simple variables such as an individual routing entry or an interface description entry; and (3) lists of lists such as the entire routing table and the list of all interface entries.

The ISODE package includes a sample SNMP agent for Net/3. See Appendix B for information about ISODE.

Figure 3.4 shows the one simple variable maintained for the SNMP interface group. We describe the SNMP interface table later in Figure 4.7.

Figure 3.4. Simple SNMP variable in the interface group.

SNMP variable   Net/3 variable   Description
ifNumber        if_index + 1     if_index is the index of the last interface in the system and starts at 0; 1 is added to get ifNumber, the number of interfaces in the system.

ifnet Structure

The ifnet structure contains information common to all interfaces. During system initialization, a separate ifnet structure is allocated for each network device. Every ifnet structure has a list of one or more protocol addresses associated with it. Figure 3.5 illustrates the relationship between an interface and its addresses.

Figure 3.5. Each ifnet structure has a list of associated ifaddr structures.

The interface in Figure 3.5 is shown with three protocol addresses stored in ifaddr structures. Although some network interfaces, such as SLIP, support only a single protocol, others, such as Ethernet, support multiple protocols and need multiple addresses. For example, a system may use a single Ethernet interface for both Internet and OSI protocols. A type field identifies the contents of each Ethernet frame, and since the Internet and OSI protocols employ different addressing schemes, the Ethernet interface must have an Internet address and an OSI address. All the addresses are connected by a linked list (the arrows on the right of Figure 3.5), and each contains a back pointer to the related ifnet structure (the arrows on the left of Figure 3.5).

It is also possible for a single network interface to support multiple addresses within a single protocol. For example, two Internet addresses may be assigned to a single Ethernet interface in Net/3.

This feature first appeared in Net/2. Having two IP addresses for an interface is useful when renumbering a network. During a transition period, the interface can accept packets addressed to the old and new addresses.

The ifnet structure is large so we describe it in five sections:

  • implementation information,

  • hardware information,

  • interface statistics,

  • function pointers, and

  • the output queue.

Figure 3.6 shows the implementation information contained in the ifnet structure.

Figure 3.6. ifnet structure: implementation information.

------------------------------------------------------------------------------ if.h
 80 struct ifnet {
 81     struct ifnet *if_next;      /* all struct ifnets are chained */
 82     struct ifaddr *if_addrlist; /* linked list of addresses per if */
 83     char   *if_name;            /* name, e.g. 'le' or 'lo' */
 84     short   if_unit;            /* sub-unit for lower level driver */
 85     u_short if_index;           /* numeric abbreviation for this if  */
 86     short   if_flags;           /* Figure 3.7 */
 87     short   if_timer;           /* time 'til if_watchdog called */
 88     int     if_pcount;          /* number of promiscuous listeners */
 89     caddr_t if_bpf;             /* packet filter structure */
------------------------------------------------------------------------------ if.h

80-82

if_next joins the ifnet structures for all the interfaces into a linked list. The if_attach function constructs the list during system initialization. if_addrlist points to the list of ifaddr structures for the interface (Figure 3.16). Each ifaddr structure holds addressing information for a protocol that expects to communicate through the interface.
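
The linkage just described can be exercised outside the kernel. The following is a minimal user-space sketch, not Net/3 code: the demo_ifnet and demo_ifaddr structures are cut-down stand-ins that keep only the link members, and demo_count_addrs is a hypothetical helper that walks both lists and checks each back pointer.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the link
 * members discussed in the text are kept. */
struct demo_ifnet;

struct demo_ifaddr {
    struct demo_ifaddr *ifa_next;    /* next address for this interface */
    struct demo_ifnet  *ifa_ifp;     /* back pointer to the interface */
};

struct demo_ifnet {
    struct demo_ifnet  *if_next;     /* next interface in the global list */
    struct demo_ifaddr *if_addrlist; /* first address of this interface */
};

/* Count every address on every interface reachable from head.
 * Returns -1 if an address does not point back at its interface. */
int demo_count_addrs(const struct demo_ifnet *head)
{
    int n = 0;
    const struct demo_ifnet *ifp;
    const struct demo_ifaddr *ifa;

    for (ifp = head; ifp != NULL; ifp = ifp->if_next) {
        for (ifa = ifp->if_addrlist; ifa != NULL; ifa = ifa->ifa_next) {
            if (ifa->ifa_ifp != ifp)
                return -1;
            n++;
        }
    }
    return n;
}
```

The outer loop follows if_next from interface to interface; the inner loop follows ifa_next through one interface's addresses.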

Common interface information

83-86

if_name is a short string that identifies the interface type, and if_unit identifies multiple instances of the same type. For example, if a system had two SLIP interfaces, both would have an if_name consisting of the 2 bytes "sl" and an if_unit of 0 for the first interface and 1 for the second. if_index uniquely identifies the interface within the kernel and is used by the sysctl system call (Section 19.14) as well as in the routing domain.

Sometimes an interface is not uniquely identified by a protocol address. For example, several SLIP connections can have the same local IP address. In these cases, if_index specifies the interface explicitly.
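
Administrators usually see if_name and if_unit joined into a single name such as sl0 or sl1. The sketch below shows one way to build that string in user space; demo_ifname is a hypothetical helper, not a Net/3 function, and we assume the conventional name-followed-by-unit form.

```c
#include <stdio.h>

/* Build the conventional display name ("sl0", "sl1", "le0") from a
 * driver name and unit number. Returns whatever snprintf returns:
 * the length the full name needs, excluding the terminating null. */
int demo_ifname(char *buf, size_t len, const char *if_name, short if_unit)
{
    return snprintf(buf, len, "%s%d", if_name, if_unit);
}
```

For the two SLIP interfaces in the example above, the calls would produce "sl0" and "sl1".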

if_flags specifies the operational state and properties of the interface. A process can examine all the flags but cannot change the flags marked in the “Kernel only” column in Figure 3.7. The flags are accessed with the SIOCGIFFLAGS and SIOCSIFFLAGS commands described in Section 4.4.

Figure 3.7. if_flags values.

if_flags          Kernel only   Description
IFF_BROADCAST     •             the interface is for a broadcast network
IFF_MULTICAST     •             the interface supports multicasting
IFF_POINTOPOINT   •             the interface is for a point-to-point network
IFF_LOOPBACK                    the interface is for a loopback network
IFF_OACTIVE       •             a transmission is in progress
IFF_RUNNING       •             resources are allocated for this interface
IFF_SIMPLEX       •             the interface cannot receive its own transmissions
IFF_LINK0         see text      defined by device driver
IFF_LINK1         see text      defined by device driver
IFF_LINK2         see text      defined by device driver
IFF_ALLMULTI                    the interface is receiving all multicast packets
IFF_DEBUG                       debugging is enabled for the interface
IFF_NOARP                       don’t use ARP on this interface
IFF_NOTRAILERS                  avoid using trailer encapsulation
IFF_PROMISC                     the interface receives all network packets
IFF_UP                          the interface is operating

The IFF_BROADCAST and IFF_POINTOPOINT flags are mutually exclusive.

The macro IFF_CANTCHANGE is a bitwise OR of all the flags in the “Kernel only” column.

The device-specific flags (IFF_LINKx) may or may not be modifiable by a process depending on the device. For example, Figure 3.29 shows how these flags are defined by the SLIP driver.
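
A mask such as IFF_CANTCHANGE lets the kernel accept a new flag word from a process while leaving the kernel-only bits untouched. The following sketch shows the masking technique only. The DEMO_ constants use illustrative bit values and a deliberately short kernel-only set, and demo_set_flags is a hypothetical function; the real definitions are in net/if.h.

```c
/* Illustrative bit values; the real definitions live in net/if.h. */
#define DEMO_IFF_UP         0x01
#define DEMO_IFF_BROADCAST  0x02
#define DEMO_IFF_DEBUG      0x04
#define DEMO_IFF_RUNNING    0x40

/* Shortened kernel-only set, for demonstration. */
#define DEMO_IFF_CANTCHANGE (DEMO_IFF_BROADCAST | DEMO_IFF_RUNNING)

/* Merge a process's requested flags into the current flags while
 * keeping every kernel-only bit exactly as it was. */
short demo_set_flags(short current, short requested)
{
    return (short)((current & DEMO_IFF_CANTCHANGE) |
                   (requested & ~DEMO_IFF_CANTCHANGE));
}
```

A request that omits DEMO_IFF_RUNNING does not clear it, and a request that includes DEMO_IFF_BROADCAST does not set it.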

Interface timer

87

if_timer is the time in seconds until the kernel calls the if_watchdog function for the interface. This function may be used by the device driver to collect interface statistics at regular intervals or to reset hardware that isn’t operating correctly.
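
A countdown of this kind can be driven by a routine that runs once per second. The sketch below is a user-space illustration under our own assumptions, not the Net/3 code: demo_slowtimo, demo_watchdog, and the demo_fired counter are hypothetical names, and we assume that a timer value of 0 means the timer is disabled.

```c
#include <stddef.h>

struct demo_ifnet {
    struct demo_ifnet *if_next;
    short  if_unit;
    short  if_timer;                  /* seconds until watchdog; 0 = off */
    int  (*if_watchdog)(int unit);
};

static int demo_fired;                /* number of watchdog calls so far */

/* Stand-in for a driver's watchdog routine. */
static int demo_watchdog(int unit)
{
    (void)unit;
    demo_fired++;
    return 0;
}

/* Called once per second: count each running timer down and call the
 * watchdog when a timer reaches zero. A timer of 0 is left alone. */
void demo_slowtimo(struct demo_ifnet *head)
{
    struct demo_ifnet *ifp;

    for (ifp = head; ifp != NULL; ifp = ifp->if_next) {
        if (ifp->if_timer == 0 || --ifp->if_timer != 0)
            continue;
        if (ifp->if_watchdog != NULL)
            (*ifp->if_watchdog)(ifp->if_unit);
    }
}
```

A driver that wants periodic service would reset if_timer inside its watchdog routine; the stand-in here does not, so it fires only once.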

BSD Packet Filter

88-89

The next two members, if_pcount and if_bpf, support the BSD Packet Filter (BPF). Through BPF, a process can receive copies of packets transmitted or received by an interface. As we discuss the device drivers, we also describe how packets are passed to BPF. BPF itself is described in Chapter 31.

The next section of the ifnet structure, shown in Figure 3.8, describes the hardware characteristics of the interface.

Figure 3.8. ifnet structure: interface characteristics.

------------------------------------------------------------------------------- if.h
 90     struct if_data {
 91 /* generic interface information */
 92         u_char  ifi_type;       /* Figure 3.9 */
 93         u_char  ifi_addrlen;    /* media address length */
 94         u_char  ifi_hdrlen;     /* media header length */
 95         u_long  ifi_mtu;        /* maximum transmission unit */
 96         u_long  ifi_metric;     /* routing metric (external only) */
 97         u_long  ifi_baudrate;   /* linespeed */
                                                                             
                              /* other ifnet members */                      
                                                                             
138 #define if_mtu      if_data.ifi_mtu
139 #define if_type     if_data.ifi_type
140 #define if_addrlen  if_data.ifi_addrlen
141 #define if_hdrlen   if_data.ifi_hdrlen
142 #define if_metric   if_data.ifi_metric
143 #define if_baudrate if_data.ifi_baudrate
------------------------------------------------------------------------------- if.h

Net/3 and this text use the short names provided by the #define statements on lines 138 through 143 to specify the ifnet members.
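
The effect of these #define statements is easy to see in a small example. The sketch below repeats two of them against a cut-down structure; demo_ifnet and demo_set_mtu are hypothetical names used only for illustration.

```c
/* Cut-down structure with a nested if_data, as in Figure 3.8. */
struct demo_ifnet {
    struct demo_if_data {
        unsigned long ifi_mtu;
        unsigned long ifi_metric;
    } if_data;
};

/* Short names that hide the nesting. */
#define if_mtu    if_data.ifi_mtu
#define if_metric if_data.ifi_metric

void demo_set_mtu(struct demo_ifnet *ifp, unsigned long mtu)
{
    ifp->if_mtu = mtu;   /* expands to ifp->if_data.ifi_mtu = mtu */
}
```

The preprocessor rewrites every use of the short name, so code written against if_mtu keeps working even though the member now lives inside if_data.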

Interface characteristics

90-92

if_type specifies the hardware address type supported by the interface. Figure 3.9 lists several common values from net/if_types.h.

Figure 3.9. if_type: data-link types.

if_type        Description
IFT_OTHER      unspecified
IFT_ETHER      Ethernet
IFT_ISO88023   IEEE 802.3 Ethernet (CSMA/CD)
IFT_ISO88025   IEEE 802.5 token ring
IFT_FDDI       Fiber Distributed Data Interface
IFT_LOOP       loopback interface
IFT_SLIP       serial line IP

93-94

if_addrlen is the length of the datalink address and if_hdrlen is the length of the header attached to any outgoing packet by the hardware. An Ethernet network, for example, has an address length of 6 bytes and a header length of 14 bytes (Figure 4.8).

95

if_mtu is the maximum transmission unit of the interface: the size in bytes of the largest unit of data that the interface can transmit in a single output operation. This is an important parameter that controls the size of packets created by the network and transport protocols. For Ethernet, the value is 1500.

96-97

if_metric is usually 0; a higher value makes routes through the interface less favorable. if_baudrate specifies the transmission speed of the interface. It is set only by the SLIP interface.

Interface statistics are collected by the next group of members in the ifnet structure shown in Figure 3.10.

Figure 3.10. ifnet structure: interface statistics.

---------------------------------------------------------------------------- if.h
 98 /* volatile statistics */
 99         u_long  ifi_ipackets;   /* #packets received on interface */
100         u_long  ifi_ierrors;    /* #input errors on interface */
101         u_long  ifi_opackets;   /* #packets sent on interface */
102         u_long  ifi_oerrors;    /* #output errors on interface */
103         u_long  ifi_collisions; /* #collisions on csma interfaces */
104         u_long  ifi_ibytes;     /* #bytes received */
105         u_long  ifi_obytes;     /* #bytes sent */
106         u_long  ifi_imcasts;    /* #packets received via multicast */
107         u_long  ifi_omcasts;    /* #packets sent via multicast */
108         u_long  ifi_iqdrops;    /* #packets dropped on input, for this
109                                    interface */
110         u_long  ifi_noproto;    /* #packets destined for unsupported
111                                    protocol */
112         struct timeval ifi_lastchange;  /* last updated */
113     } if_data;
                                                                             
                              /* other ifnet members */                      
                                                                             
144 #define if_ipackets if_data.ifi_ipackets
145 #define if_ierrors  if_data.ifi_ierrors
146 #define if_opackets if_data.ifi_opackets
147 #define if_oerrors  if_data.ifi_oerrors
148 #define if_collisions   if_data.ifi_collisions
149 #define if_ibytes   if_data.ifi_ibytes
150 #define if_obytes   if_data.ifi_obytes
151 #define if_imcasts  if_data.ifi_imcasts
152 #define if_omcasts  if_data.ifi_omcasts
153 #define if_iqdrops  if_data.ifi_iqdrops
154 #define if_noproto  if_data.ifi_noproto
155 #define if_lastchange   if_data.ifi_lastchange
---------------------------------------------------------------------------- if.h

Once again, Net/3 and this text use the short names provided by the #define statements on lines 144 through 155 to specify the ifnet members.

Interface statistics

98-111

Most of these statistics are self-explanatory. if_collisions is incremented when packet transmission is interrupted by another transmission on shared media such as Ethernet. if_noproto counts the number of packets that can’t be processed because the protocol is not supported by the system or the interface (e.g., an OSI packet that arrives at a system that supports only IP). The SLIP interface increments if_noproto if a non-IP packet is placed on its output queue.

These statistics were not part of the ifnet structure in Net/1. They were added to support the standard SNMP MIB-II variables for interfaces.

if_iqdrops is accessed only by the SLIP device driver. SLIP and the other network drivers increment if_snd.ifq_drops (Figure 3.13) when IF_DROP is called. ifq_drops was already in the BSD software when the SNMP statistics were added. The ISODE SNMP agent ignores if_iqdrops and uses if_snd.ifq_drops.

Change timestamp

112-113

if_lastchange records the last time any of the statistics were changed.

The next section of the ifnet structure, shown in Figure 3.11, contains pointers to the standard interface-layer functions, which isolate device-specific details from the network layer. Each network interface implements these functions as appropriate for the particular device.

Figure 3.11. ifnet structure: interface procedures.

----------------------------------------------------------------------------- if.h
114 /* procedure handles */
115     int     (*if_init)          /* init routine */
116             (int);
117     int     (*if_output)        /* output routine (enqueue) */
118             (struct ifnet *, struct mbuf *, struct sockaddr *,
119              struct rtentry *);
120     int     (*if_start)         /* initiate output routine */
121             (struct ifnet *);
122     int     (*if_done)          /* output complete routine */
123             (struct ifnet *);   /* (XXX not used; fake prototype) */
124     int     (*if_ioctl)         /* ioctl routine */
125             (struct ifnet *, int, caddr_t);
126     int     (*if_reset)
127             (int);              /* new autoconfig will permit removal */
128     int     (*if_watchdog)      /* timer routine */
129             (int);
----------------------------------------------------------------------------- if.h

Interface functions

114-129

Each device driver initializes its own ifnet structure, including the seven function pointers, at system initialization time. Figure 3.12 describes the generic functions.

Figure 3.12. ifnet structure: function pointers.

Function      Description
if_init       initialize the interface
if_output     queue outgoing packets for transmission
if_start      initiate transmission of packets
if_done       cleanup after transmission completes (not used)
if_ioctl      process I/O control commands
if_reset      reset the interface device
if_watchdog   periodic interface routine

We will see the comment /* XXX */ throughout Net/3. It is a warning to the reader that the code is obscure, contains nonobvious side effects, or is a quick solution to a more difficult problem. In this case, it indicates that if_done is not used in Net/3.

In Chapter 4 we look at the device-specific functions for the Ethernet, SLIP, and loopback interfaces, which the kernel calls indirectly through the pointers in the ifnet structure. For example, if ifp points to an ifnet structure,

(*ifp->if_start)(ifp)

calls the if_start function of the device driver associated with the interface.
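
The indirection can be tried in user space with a cut-down structure. In this sketch demo_le_start, demo_le_attach, and demo_kick are hypothetical names standing in for a driver's start routine, its initialization code, and a piece of generic code; the started member exists only so the effect of the call is visible.

```c
struct demo_ifnet {
    const char *if_name;
    int (*if_start)(struct demo_ifnet *);  /* set by the driver */
    int   started;                         /* demo-only bookkeeping */
};

/* A made-up driver start routine. */
static int demo_le_start(struct demo_ifnet *ifp)
{
    ifp->started++;
    return 0;
}

/* Driver initialization fills in the pointer. */
void demo_le_attach(struct demo_ifnet *ifp)
{
    ifp->if_name  = "le";
    ifp->if_start = demo_le_start;
    ifp->started  = 0;
}

/* Generic code calls through the pointer without knowing the driver. */
int demo_kick(struct demo_ifnet *ifp)
{
    return (*ifp->if_start)(ifp);
}
```

demo_kick never names demo_le_start; a different driver could store a different function in if_start and the generic code would not change.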

The remaining member of the ifnet structure is the output queue for the interface and is shown in Figure 3.13.

Figure 3.13. ifnet structure: the output queue.

---------------------------------------------------------------------------- if.h
130     struct ifqueue {
131         struct mbuf *ifq_head;
132         struct mbuf *ifq_tail;
133         int     ifq_len;        /* current length of queue */
134         int     ifq_maxlen;     /* maximum length of queue */
135         int     ifq_drops;      /* packets dropped because of full queue */
136     } if_snd;                   /* output queue */
137 };
---------------------------------------------------------------------------- if.h

130-137

if_snd is the queue of outgoing packets for the interface. Each interface has its own ifnet structure and therefore its own output queue. ifq_head points to the first packet on the queue (the next one to be output), ifq_tail points to the last packet on the queue, ifq_len is the number of packets currently on the queue, and ifq_maxlen is the maximum number of buffers allowed on the queue. This maximum is set to 50 (from the global integer ifqmaxlen, which is initialized at compile time from IFQ_MAXLEN) unless the driver changes it. The queue is implemented as a linked list of mbuf chains. ifq_drops counts the number of packets discarded because the queue was full. Figure 3.14 lists the macros and functions that access a queue.

Figure 3.14. ifqueue routines.

Function     Description

IF_QFULL     Is ifq full?
             int IF_QFULL(struct ifqueue *ifq);

IF_DROP      IF_DROP only increments the ifq_drops counter associated with ifq. The name is misleading; the caller drops the packet.
             void IF_DROP (struct ifqueue *ifq);

IF_ENQUEUE   Add the packet m to the end of the ifq queue. Packets are linked together by m_nextpkt in the mbuf header.
             void IF_ENQUEUE (struct ifqueue *ifq, struct mbuf *m);

IF_PREPEND   Insert the packet m at the front of the ifq queue.
             void IF_PREPEND (struct ifqueue *ifq, struct mbuf *m);

IF_DEQUEUE   Take the first packet off the ifq queue. m points to the dequeued packet or is null if the queue was empty.
             void IF_DEQUEUE (struct ifqueue *ifq, struct mbuf *m);

if_qflush    Discard all packets on the queue ifq, for example, when an interface is shut down.
             void if_qflush (struct ifqueue *ifq);

The first five routines are macros defined in net/if.h and the last routine, if_qflush, is a function defined in net/if.c. The macros often appear in sequences such as:

s = splimp();
if (IF_QFULL(inq))  {
    IF_DROP(inq);        /* queue is full, drop new packet */
    m_freem(m);
} else
    IF_ENQUEUE(inq, m);  /* there is room, add to end of queue */
splx(s);

This code fragment attempts to add a packet to the queue. If the queue is full, IF_DROP increments ifq_drops and the packet is discarded. Reliable protocols such as TCP will retransmit discarded packets. Applications using an unreliable protocol such as UDP must detect and handle the retransmission on their own.

Access to the queue is bracketed by splimp and splx to block network interrupts and to prevent the network interrupt service routines from accessing the queue while it is in an indeterminate state.

m_freem is called before splx because the mbuf code has a critical section that runs at splimp. It would be wasted effort to call splx before m_freem only to enter another critical section during m_freem (Section 2.5).
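
To make the queue discipline concrete, here is a user-space sketch of a queue with the same members as ifqueue. It is an illustration under our own assumptions, not the net/if.h source: the DEMO_ macros and demo_enqueue are hypothetical, demo_mbuf keeps only the m_nextpkt link, and interrupt blocking (splimp and splx) is left out because it has no user-space equivalent.

```c
#include <stddef.h>

struct demo_mbuf { struct demo_mbuf *m_nextpkt; };

struct demo_ifqueue {
    struct demo_mbuf *ifq_head, *ifq_tail;
    int ifq_len, ifq_maxlen, ifq_drops;
};

#define DEMO_IF_QFULL(q)  ((q)->ifq_len >= (q)->ifq_maxlen)
#define DEMO_IF_DROP(q)   ((q)->ifq_drops++)

/* Append m to the tail of the queue. */
#define DEMO_IF_ENQUEUE(q, m) do {                        \
    (m)->m_nextpkt = NULL;                                \
    if ((q)->ifq_tail == NULL)                            \
        (q)->ifq_head = (m);                              \
    else                                                  \
        (q)->ifq_tail->m_nextpkt = (m);                   \
    (q)->ifq_tail = (m);                                  \
    (q)->ifq_len++;                                       \
} while (0)

/* Remove the head packet into m; m is NULL if the queue is empty. */
#define DEMO_IF_DEQUEUE(q, m) do {                        \
    (m) = (q)->ifq_head;                                  \
    if ((m) != NULL) {                                    \
        if (((q)->ifq_head = (m)->m_nextpkt) == NULL)     \
            (q)->ifq_tail = NULL;                         \
        (m)->m_nextpkt = NULL;                            \
        (q)->ifq_len--;                                   \
    }                                                     \
} while (0)

/* Enqueue m, or count a drop if the queue is full.
 * Returns 1 if queued, 0 if dropped. */
int demo_enqueue(struct demo_ifqueue *q, struct demo_mbuf *m)
{
    if (DEMO_IF_QFULL(q)) {
        DEMO_IF_DROP(q);
        return 0;          /* caller still owns m and must free it */
    }
    DEMO_IF_ENQUEUE(q, m);
    return 1;
}
```

As with IF_DROP, the drop macro here only counts; disposing of the packet is left to the caller.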

ifaddr Structure

The next structure we look at is the interface address structure, ifaddr, shown in Figure 3.15. Each interface maintains a linked list of ifaddr structures because some data links, such as Ethernet, support more than one protocol. A separate ifaddr structure describes each address assigned to the interface, usually one address per protocol. Another reason to support multiple addresses is that many protocols, including TCP/IP, support multiple addresses assigned to a single physical interface. Although Net/3 supports this feature, many implementations of TCP/IP do not.

Figure 3.15. ifaddr structure.

---------------------------------------------------------------------------- if.h
217 struct ifaddr {
218         struct  ifaddr *ifa_next;       /* next address for interface */
219         struct  ifnet *ifa_ifp;         /* back-pointer to interface */
220         struct  sockaddr *ifa_addr;     /* address of interface */
221         struct  sockaddr *ifa_dstaddr;  /* other end of p-to-p link */
222 #define ifa_broadaddr   ifa_dstaddr     /* broadcast address interface */
223         struct  sockaddr *ifa_netmask;  /* used to determine subnet */
224         void    (*ifa_rtrequest)();     /* check or clean routes */
225         u_short ifa_flags;              /* mostly rt_flags for cloning */
226         short   ifa_refcnt;             /* references to this structure */
227         int     ifa_metric;             /* cost for this interface */
228 };
---------------------------------------------------------------------------- if.h

217-219

The ifaddr structure links all addresses assigned to an interface together by ifa_next and contains a pointer, ifa_ifp, back to the interface’s ifnet structure. Figure 3.16 shows the relationship between the ifnet structures and the ifaddr structures.

Figure 3.16. ifnet and ifaddr structures.

220

ifa_addr points to a protocol address for the interface and ifa_netmask points to a bit mask that selects the network portion of ifa_addr. Bits that represent the network portion of the address are set to 1 in the mask, and the host portion of the address is set to all 0 bits. Both addresses are stored as sockaddr structures (Section 3.5). Figure 3.38 shows an address and its related mask structure. For IP addresses, the mask selects the network and subnet portions of the IP address.
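
The masking operation itself is a bitwise AND. The sketch below applies it to a 32-bit IP address held in host byte order. demo_ip, demo_net_part, and demo_host_part are hypothetical helpers, and the address and mask in the usage note are arbitrary example values, not taken from any figure.

```c
#include <stdint.h>

/* Pack four dotted-decimal octets into a 32-bit value (host order). */
uint32_t demo_ip(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Bits set to 1 in the mask select the network (and subnet) portion. */
uint32_t demo_net_part(uint32_t addr, uint32_t mask)
{
    return addr & mask;
}

/* The remaining bits are the host portion. */
uint32_t demo_host_part(uint32_t addr, uint32_t mask)
{
    return addr & ~mask;
}
```

With the address 192.0.2.33 and the mask 255.255.255.224, the network portion is 192.0.2.32 and the host portion is 1.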

221-223

ifa_dstaddr (or its alias ifa_broadaddr) points to the protocol address of the interface at the other end of a point-to-point link or to the broadcast address assigned to the interface on a broadcast network such as Ethernet. The mutually exclusive flags IFF_BROADCAST and IFF_POINTOPOINT (Figure 3.7) in the interface’s ifnet structure specify the applicable name.

224-228

ifa_rtrequest, ifa_flags, and ifa_metric support routing lookups for the interface.

ifa_refcnt counts references to the ifaddr structure. The macro IFAFREE only releases the structure when the reference count drops to 0, such as when addresses are deleted with the SIOCDIFADDR ioctl command. The ifaddr structures are reference-counted because they are shared by the interface and routing data structures.

IFAFREE decrements the counter and returns if there are other references. This is the common case and avoids a function call overhead for all but the last reference. If this is the last reference, IFAFREE calls the function ifafree, which releases the structure.
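
The pattern of an inline decrement with an out-of-line release can be sketched as follows. This is an illustration, not the Net/3 macro: DEMO_IFAFREE, demo_ifafree, and demo_ifa_alloc are hypothetical, and we assume a convention in which ifa_refcnt counts every holder, so a value of 1 means the caller holds the last reference. The kernel's own counting convention may differ.

```c
#include <stdlib.h>

struct demo_ifaddr { short ifa_refcnt; };

static int demo_freed;   /* counts structures actually released */

/* Out-of-line release, reached only for the last reference. */
static void demo_ifafree(struct demo_ifaddr *ifa)
{
    free(ifa);
    demo_freed++;
}

/* Allocate an address structure held by one reference. */
struct demo_ifaddr *demo_ifa_alloc(void)
{
    struct demo_ifaddr *ifa = malloc(sizeof *ifa);

    if (ifa != NULL)
        ifa->ifa_refcnt = 1;
    return ifa;
}

/* Common case stays inline; only the final release pays for a call. */
#define DEMO_IFAFREE(ifa) do {          \
    if ((ifa)->ifa_refcnt <= 1)         \
        demo_ifafree(ifa);              \
    else                                \
        (ifa)->ifa_refcnt--;            \
} while (0)
```

If an interface list and a routing entry both hold the structure, the first DEMO_IFAFREE only decrements the count and the second one releases the memory.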

sockaddr Structure

Addressing information for an interface consists of more than a single host address. Net/3 maintains host addresses, broadcast addresses, and network masks in structures derived from a generic sockaddr structure. By using a generic structure, hardware and protocol-specific addressing details are hidden from the interface layer.

Figure 3.17 shows the current definition of the structure as well as the definition from earlier BSD releases—an osockaddr structure.

Figure 3.17. sockaddr and osockaddr structures.

------------------------------------------------------------------------- socket.h
120 struct sockaddr {
121     u_char  sa_len;             /* total length */
122     u_char  sa_family;          /* address family (Figure 3.19) */
123     char    sa_data[14];        /* actually longer; address value */
124 };

271 struct osockaddr {
272     u_short sa_family;          /* address family (Figure 3.19) */
273     char    sa_data[14];        /* up to 14 bytes of direct address */
274 };
------------------------------------------------------------------------- socket.h

Figure 3.18 illustrates the organization of these structures.

Figure 3.18. sockaddr and osockaddr structures (sa_ prefix dropped).

In many figures, we omit the common prefix in member names. In this case, we’ve dropped the sa_ prefix.

sockaddr structure

120-124

Every protocol has its own address format. Net/3 handles generic addresses in a sockaddr structure. sa_len specifies the length of the address (OSI and Unix domain protocols have variable-length addresses) and sa_family specifies the type of address. Figure 3.19 lists the address family constants that we encounter.

Figure 3.19. sa_family constants.

sa_family        Protocol
AF_INET          Internet
AF_ISO, AF_OSI   OSI
AF_UNIX          Unix
AF_ROUTE         routing table
AF_LINK          data link
AF_UNSPEC        (see text)

The contents of a sockaddr when AF_UNSPEC is specified depend on the context. In most cases, it contains an Ethernet hardware address.

The sa_len and sa_family members allow protocol-independent code to manipulate variable-length sockaddr structures from multiple protocol families. The remaining member, sa_data, contains the address in a protocol-dependent format. sa_data is defined to be an array of 14 bytes, but when the sockaddr structure overlays a larger area of memory sa_data may be up to 253 bytes long. sa_len is only a single byte, so the size of the entire address including sa_len and sa_family must be less than 256 bytes.

This is a common C technique that allows the programmer to consider the last member in a structure to have a variable length.
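
Because sa_len is the first byte of every address, protocol-independent code can step over addresses of different sizes without understanding their contents. The sketch below counts addresses packed back to back in a byte buffer. demo_count_sockaddrs is a hypothetical helper, and the packed layout with no padding between addresses is our assumption for the example.

```c
#include <stddef.h>

/* Count the addresses packed back to back in buf[0..buflen).
 * Each address starts with a one-byte total length (sa_len) followed
 * by a one-byte family (sa_family). Stops early on a length that is
 * too small to hold those two bytes or that overruns the buffer. */
int demo_count_sockaddrs(const unsigned char *buf, size_t buflen)
{
    int n = 0;
    size_t off = 0;

    while (off + 2 <= buflen) {
        unsigned char len = buf[off];      /* sa_len is the first byte */

        if (len < 2 || off + len > buflen)
            break;
        n++;
        off += len;                        /* step to the next address */
    }
    return n;
}
```

The loop never looks at sa_data, which is why the same code works for a 4-byte address and a 20-byte address alike.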

Each protocol defines a specialized sockaddr structure that duplicates the sa_len and sa_family members but defines the sa_data member as required for that protocol. The address stored in sa_data is a transport address; it contains enough information to identify multiple communication end points on the same host. In Chapter 6 we look at the Internet address structure sockaddr_in, which consists of an IP address and a port number.

osockaddr structure

271-274

The osockaddr structure is the definition of a sockaddr before the 4.3BSD Reno release. Since the length of an address was not explicitly available in this definition, it was not possible to write protocol-independent code to handle variable-length addresses. The desire to include the OSI protocols, which utilize variable-length addresses, motivated the change in the sockaddr definition seen in Net/3. The osockaddr structure is supported for binary compatibility with previously compiled programs.

We have omitted the binary compatibility code from this text.

ifnet and ifaddr Specialization

The ifnet and ifaddr structures contain general information applicable to all network interfaces and protocol addresses. To accommodate additional device and protocol-specific information, each driver defines and each protocol allocates a specialized version of the ifnet and ifaddr structures. These specialized structures always contain an ifnet or ifaddr structure as their first member so that the common information can be accessed without consideration for the additional specialized information.
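
Placing the generic structure first works because C guarantees that a pointer to a structure, suitably converted, points to its first member, and the reverse. The sketch below shows both directions. demo_softc, its sc_resets member, and the two view functions are hypothetical names used only for illustration.

```c
struct demo_ifnet {
    const char *if_name;
    short       if_unit;
};

/* Driver-private structure: the generic part must come first. */
struct demo_softc {
    struct demo_ifnet sc_if;      /* generic interface information */
    int               sc_resets;  /* made-up device-specific field */
};

/* Generic code sees only the embedded ifnet. */
struct demo_ifnet *demo_generic_view(struct demo_softc *sc)
{
    return &sc->sc_if;
}

/* The driver recovers its own structure from the generic pointer. */
struct demo_softc *demo_driver_view(struct demo_ifnet *ifp)
{
    return (struct demo_softc *)ifp;  /* valid: sc_if is the first member */
}
```

The cast in demo_driver_view is only safe when the ifnet pointer really came from a demo_softc; the structure layout does not enforce that.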

Most device drivers handle multiple interfaces of the same type by allocating an array of their specialized ifnet structures, but others (such as the loopback driver) handle only one interface. Figure 3.20 shows the arrangement of specialized ifnet structures for our sample interfaces.

Figure 3.20. Arrangement of ifnet structures within device-dependent structures.

Notice that each device’s structure begins with an ifnet structure, followed by all the device-dependent data. The loopback interface declares only an ifnet structure, since it doesn’t require any device-dependent data. We show the Ethernet and SLIP drivers’ softc structures with the array index of 0 in Figure 3.20 since both drivers support multiple interfaces. The maximum number of interfaces of any given type is limited by a configuration parameter when the kernel is built.

The arpcom structure (Figure 3.26) is common to all Ethernet drivers and contains information for the Address Resolution Protocol (ARP) and Ethernet multicasting. The le_softc structure (Figure 3.25) contains additional information unique to the LANCE Ethernet device driver.

Each protocol stores addressing information for each interface in a list of specialized ifaddr structures. The Internet protocols use an in_ifaddr structure (Section 6.5) and the OSI protocols an iso_ifaddr structure. In addition to protocol addresses, the kernel assigns each interface a link-level address when the interface is initialized, which identifies the interface within the kernel.

The kernel constructs the link-level address by allocating memory for an ifaddr structure and two sockaddr_dl structures—one for the link-level address itself and one for the link-level address mask. The sockaddr_dl structures are accessed by OSI, ARP, and the routing algorithms. Figure 3.21 shows an Ethernet interface with a link-level address, an Internet address, and an OSI address. The construction and initialization of the link-level address (the ifaddr and the two sockaddr_dl structures) is described in Section 3.11.

Figure 3.21. An interface address list containing link-level, Internet, and OSI addresses.

Network Initialization Overview

All the structures we have described are allocated and attached to each other during kernel initialization. In this section we give a broad overview of the initialization steps. In later sections we describe the specific device- and protocol-initialization steps.

Some devices, such as the SLIP and loopback interfaces, are implemented entirely in software. These pseudo-devices are represented by a pdevinit structure (Figure 3.22) stored in the global pdevinit array. The array is constructed during kernel configuration. For example:

    struct pdevinit pdevinit[] = {
        { slattach, 1 },
        { loopattach, 1 },
        { 0, 0 }
    };

Table 3.22. pdevinit structure.

------------------------------------------------------------------- device.h
120 struct pdevinit {
121     void    (*pdev_attach) (int);   /* attach function */
122     int     pdev_count;         /* number of devices */
123 };
------------------------------------------------------------------- device.h

120-123

In the pdevinit structures for the SLIP and loopback interfaces, pdev_attach is set to slattach and loopattach, respectively. When the attach function is called, pdev_count is passed as the only argument and specifies the number of devices to create. Only one loopback device is created, but multiple SLIP devices may be created if the administrator configures the SLIP entry accordingly.
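The table-driven dispatch described above can be sketched as a small, self-contained program fragment. The attach functions, the counter, and the table contents here are hypothetical; only the structure shape and the sentinel-terminated loop follow the text.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as the pdevinit structure in device.h. */
struct pdev_entry {
    void (*pdev_attach)(int);  /* attach function */
    int    pdev_count;         /* number of devices */
};

int devices_created;           /* hypothetical counter, for illustration */

/* Hypothetical attach functions: each "creates" n devices. */
void fake_slattach(int n)   { devices_created += n; }
void fake_loopattach(int n) { devices_created += n; }

/* Sentinel-terminated table, in the style produced by kernel configuration. */
struct pdev_entry pdev_table[] = {
    { fake_slattach, 2 },
    { fake_loopattach, 1 },
    { NULL, 0 }
};

/* Call every attach function once; return the number of entries visited. */
int attach_pseudo_devices(void)
{
    struct pdev_entry *pdev;
    int visited = 0;

    for (pdev = pdev_table; pdev->pdev_attach != NULL; pdev++) {
        (*pdev->pdev_attach)(pdev->pdev_count);
        visited++;
    }
    return visited;
}
```

The null function pointer in the last entry ends the loop, so the table needs no separate length field.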

The network initialization functions from main are shown in Figure 3.23.

Table 3.23. main function: network initialization.

--------------------------------------------------------------------- init_main.c
 70 main(framep)
 71 void   *framep;
 72 {
                                                                             
                              /* nonnetwork code */                          
                                                                             
 96     cpu_startup();              /* locate and initialize devices */
                                                                             
                              /* nonnetwork code */                          
                                                                             
172     /* Attach pseudo-devices. (e.g., SLIP and loopback interfaces) */
173     for (pdev = pdevinit; pdev->pdev_attach != NULL; pdev++)
174         (*pdev->pdev_attach) (pdev->pdev_count);

175     /*
176      * Initialize protocols.  Block reception of incoming packets
177      * until everything is ready.
178      */
179     s = splimp();
180     ifinit();                   /* initialize network interfaces */
181     domaininit();               /* initialize protocol domains */
182     splx(s);
                                                                             
                              /* nonnetwork code */                          
                                                                             
231     /* The scheduler is an infinite loop. */
232     scheduler();
233     /* NOTREACHED */
234 }
--------------------------------------------------------------------- init_main.c

70-96

cpu_startup locates and initializes all the hardware devices connected to the system, including any network interfaces.

97-174

After the kernel initializes the hardware devices, it calls each of the pdev_attach functions contained within the pdevinit array.

175-234

ifinit and domaininit finish the initialization of the network interfaces and protocols, and scheduler starts the kernel process scheduler. ifinit and domaininit are described in Chapter 7.

In the following sections we describe the initialization of the Ethernet, SLIP, and loopback interfaces.

Ethernet Initialization

As part of cpu_startup, the kernel locates any attached network devices. The details of this process are beyond the scope of this text. Once a device is identified, a device-specific initialization function is called. Figure 3.24 shows the initialization functions for our three sample interfaces.

Table 3.24. Network interface initialization functions.

Device            Initialization Function
LANCE Ethernet    leattach
SLIP              slattach
loopback          loopattach

Each device driver for a network interface initializes a specialized ifnet structure and calls if_attach to insert the structure into the linked list of interfaces. The le_softc structure shown in Figure 3.25 is the specialized ifnet structure for our sample Ethernet driver (Figure 3.20).

Table 3.25. le_softc structure.

----------------------------------------------------------------------- if_le.c
 69 struct le_softc {
 70     struct arpcom sc_ac;        /* common Ethernet structures */
 71 #define sc_if   sc_ac.ac_if     /* network-visible interface */
 72 #define sc_addr sc_ac.ac_enaddr /* hardware Ethernet address */
                                                                             
                              /* device-specific members */                  
                                                                             
 95 } le_softc[NLE];
----------------------------------------------------------------------- if_le.c

le_softc structure

69-95

An array of le_softc structures (with NLE elements) is declared in if_le.c. Each structure starts with sc_ac, an arpcom structure common to all Ethernet interfaces, followed by device-specific members. The sc_if and sc_addr macros simplify access to the ifnet structure and Ethernet address within the arpcom structure, sc_ac, shown in Figure 3.26.

Table 3.26. arpcom structure.

------------------------------------------------------------------------ if_ether.h
 95 struct arpcom {
 96     struct ifnet ac_if;         /* network-visible interface */
 97     u_char  ac_enaddr[6];       /* ethernet hardware address */
 98     struct in_addr ac_ipaddr;   /* copy of ip address - XXX  */
 99     struct ether_multi *ac_multiaddrs;  /* list of ether multicast addrs */
100     int     ac_multicnt;        /* length of ac_multiaddrs list */
101 };
------------------------------------------------------------------------ if_ether.h

arpcom structure

95-101

The first member of the arpcom structure, ac_if, is an ifnet structure as shown in Figure 3.20. ac_enaddr is the Ethernet hardware address copied by the LANCE device driver from the hardware when the kernel locates the device during cpu_startup. For our sample driver, this occurs in the leattach function (Figure 3.27). ac_ipaddr is the last IP address assigned to the device. We discuss address assignment in Section 6.6, where we’ll see that an interface can have several IP addresses. See also Exercise 6.3. ac_multiaddrs is a list of Ethernet multicast addresses represented by ether_multi structures. ac_multicnt counts the entries in the list. The multicast list is discussed in Chapter 12.

Table 3.27. leattach function.

----------------------------------------------------------------------------- if_le.c
106 leattach(hd)
107 struct hp_device *hd;
108 {
109     struct lereg0 *ler0;
110     struct lereg2 *ler2;
111     struct lereg2 *lemem = 0;
112     struct le_softc *le = &le_softc[hd->hp_unit];
113     struct ifnet *ifp = &le->sc_if;
114     char   *cp;
115     int     i;
                                                                             
                              /* device-specific code */                     
                                                                             
126     /*
127      * Read the ethernet address off the board, one nibble at a time.
128      */
129     cp = (char *) (lestd[3] + (int) hd->hp_addr);
130     for (i = 0; i < sizeof(le->sc_addr); i++) {
131         le->sc_addr[i] = (*++cp & 0xF) << 4;
132         cp++;
133         le->sc_addr[i] |= *++cp & 0xF;
134         cp++;
135     }
136     printf("le%d: hardware address %s\n", hd->hp_unit,
137            ether_sprintf(le->sc_addr));
                                                                             
                              /* device-specific code */                     
                                                                             
150     ifp->if_unit = hd->hp_unit;
151     ifp->if_name = "le";
152     ifp->if_mtu = ETHERMTU;
153     ifp->if_init = leinit;
154     ifp->if_reset = lereset;
155     ifp->if_ioctl = leioctl;
156     ifp->if_output = ether_output;
157     ifp->if_start = lestart;
158     ifp->if_flags = IFF_BROADCAST | IFF_SIMPLEX | IFF_MULTICAST;
159     bpfattach(&ifp->if_bpf, ifp, DLT_EN10MB, sizeof(struct ether_header));
160     if_attach(ifp);
161     return (1);
162 }
----------------------------------------------------------------------------- if_le.c

106-115

Figure 3.27 shows the initialization code for the LANCE Ethernet driver.

The kernel calls leattach once for each LANCE card it finds in the system.

The single argument points to an hp_device structure, which contains HP-specific information since this driver is written for an HP workstation.

le points to the specialized ifnet structure for the card (Figure 3.20) and ifp points to the first member of that structure, sc_if, a generic ifnet structure. The device-specific initializations are not included in Figure 3.27 and are not discussed in this text.

Copy the hardware address from the device

126-137

For the LANCE device, the Ethernet address assigned by the manufacturer is copied from the device to sc_addr (which is sc_ac.ac_enaddr—see Figure 3.26) one nibble (4 bits) at a time in this for loop.

lestd is a device-specific table of offsets to locate information relative to hp_addr, which points to LANCE-specific information.

The complete address is output to the console by the printf statement to indicate that the device exists and is operational.
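The nibble-at-a-time copy can be restated with explicit array indexes. This is a sketch, not the driver code: read_nibbled_addr is a hypothetical helper, and the layout it assumes (nibbles in the odd locations, even locations skipped) is inferred from the pointer arithmetic in the for loop of Figure 3.27.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assemble n address bytes from a nibble-per-location ROM image.
 * Layout assumed here (mirroring the pointer arithmetic in leattach):
 * byte i takes its high nibble from rom[4*i + 1] and its low nibble
 * from rom[4*i + 3]; the even locations are skipped.
 */
void read_nibbled_addr(const unsigned char *rom, unsigned char *addr, size_t n)
{
    size_t i;

    assert(rom != NULL && addr != NULL);
    for (i = 0; i < n; i++) {
        addr[i]  = (unsigned char) ((rom[4 * i + 1] & 0xF) << 4);
        addr[i] |= (unsigned char) (rom[4 * i + 3] & 0xF);
    }
}
```

Only the low 4 bits of each location contribute to the result; whatever the upper bits hold is masked off by the & 0xF.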

Initialize the ifnet structure

150-157

leattach copies the device unit number from the hp_device structure into if_unit to identify multiple interfaces of the same type. if_name is "le" for this device; if_mtu is 1500 bytes (ETHERMTU), the maximum transmission unit for Ethernet; if_init, if_reset, if_ioctl, if_output, and if_start all point to device-specific implementations of the generic functions that control the network interface. Section 4.1 describes these functions.

158

All Ethernet devices support IFF_BROADCAST. The LANCE device does not receive its own transmissions, so IFF_SIMPLEX is set. The driver and hardware support multicasting, so IFF_MULTICAST is also set.

159-162

bpfattach registers the interface with BPF and is described with Figure 31.8. The if_attach function inserts the initialized ifnet structure into the linked list of interfaces (Section 3.11).

SLIP Initialization

The SLIP interface relies on a standard asynchronous serial device initialized within the call to cpu_startup. The SLIP pseudo-device is initialized when main calls slattach indirectly through the pdev_attach pointer in SLIP’s pdevinit structure.

Each SLIP interface is described by an sl_softc structure shown in Figure 3.28.

Table 3.28. sl_softc structure.

------------------------------------------------------------------------- if_slvar.h
 43 struct sl_softc {
 44     struct ifnet sc_if;         /* network-visible interface */
 45     struct ifqueue sc_fastq;    /* interactive output queue */
 46     struct tty *sc_ttyp;        /* pointer to tty structure */
 47     u_char *sc_mp;              /* pointer to next available buf char */
 48     u_char *sc_ep;              /* pointer to last available buf char */
 49     u_char *sc_buf;             /* input buffer */
 50     u_int   sc_flags;           /* Figure 3.29 */
 51     u_int   sc_escape;          /* =1 if last char input was FRAME_ESCAPE */
 52     struct slcompress sc_comp;  /* tcp compression data */
 53     caddr_t sc_bpf;             /* BPF data */
 54 };
------------------------------------------------------------------------- if_slvar.h

43-54

As with all interface structures, sl_softc starts with an ifnet structure followed by device-specific information.

In addition to the output queue found in the ifnet structure, a SLIP device maintains a separate queue, sc_fastq, for packets requesting low-delay service—typically generated by interactive applications.

sc_ttyp points to the associated terminal device. The two pointers sc_buf and sc_ep point to the first and last bytes of the buffer for an incoming SLIP packet. sc_mp points to the location for the next incoming byte and is advanced as additional bytes arrive.
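A buffer tracked by three pointers, in the manner just described for sc_buf, sc_ep, and sc_mp, might look like the following sketch. All names are hypothetical, the end pointer follows the description above (it addresses the last byte), and the overflow behavior shown here is an assumption; the SLIP driver's own handling is not covered in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input buffer tracked by three pointers, as in sl_softc. */
struct in_buf {
    unsigned char *b_buf;  /* first byte of the buffer (like sc_buf) */
    unsigned char *b_ep;   /* last byte of the buffer (like sc_ep) */
    unsigned char *b_mp;   /* where the next byte goes (like sc_mp) */
};

void in_buf_init(struct in_buf *b, unsigned char *storage, size_t len)
{
    assert(len > 0);
    b->b_buf = storage;
    b->b_ep  = storage + len - 1;
    b->b_mp  = storage;
}

/* Store one byte; return 0 on success, -1 if the buffer is full. */
int in_buf_put(struct in_buf *b, unsigned char c)
{
    if (b->b_mp > b->b_ep)
        return -1;
    *b->b_mp++ = c;
    return 0;
}

/* Number of bytes received so far. */
size_t in_buf_len(const struct in_buf *b)
{
    return (size_t) (b->b_mp - b->b_buf);
}
```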

The four flags defined by the SLIP driver are shown in Figure 3.29.

Table 3.29. SLIP if_flags and sc_flags values.

Constant       sl_softc member    Description
SC_COMPRESS    sc_if.if_flags     IFF_LINK0; compress TCP traffic
SC_NOICMP      sc_if.if_flags     IFF_LINK1; suppress ICMP traffic
SC_AUTOCOMP    sc_if.if_flags     IFF_LINK2; auto-enable TCP compression
SC_ERROR       sc_flags           error detected; discard incoming frame

SLIP defines the three interface flags reserved for the device driver in the ifnet structure and one additional flag defined in the sl_softc structure.

sc_escape is used by the IP encapsulation mechanism for serial lines (Section 5.3), while TCP header compression (Section 29.13) information is kept in sc_comp.

The BPF information for the SLIP device is pointed to by sc_bpf.

The sl_softc structure is initialized by slattach, shown in Figure 3.30.

Table 3.30. slattach function.

--------------------------------------------------------------------------- if_sl.c
135 void
136 slattach()
137 {
138     struct sl_softc *sc;
139     int     i = 0;

140     for (sc = sl_softc; i < NSL; sc++) {
141         sc->sc_if.if_name = "sl";
142         sc->sc_if.if_next = NULL;
143         sc->sc_if.if_unit = i++;
144         sc->sc_if.if_mtu = SLMTU;
145         sc->sc_if.if_flags =
146             IFF_POINTOPOINT | SC_AUTOCOMP | IFF_MULTICAST;
147         sc->sc_if.if_type = IFT_SLIP;
148         sc->sc_if.if_ioctl = slioctl;
149         sc->sc_if.if_output = sloutput;
150         sc->sc_if.if_snd.ifq_maxlen = 50;
151         sc->sc_fastq.ifq_maxlen = 32;
152         if_attach(&sc->sc_if);
153         bpfattach(&sc->sc_bpf, &sc->sc_if, DLT_SLIP, SLIP_HDRLEN);
154     }
155 }
--------------------------------------------------------------------------- if_sl.c

135-152

Unlike leattach, which initializes only one interface at a time, the kernel calls slattach once and slattach initializes all the SLIP interfaces. Hardware devices are initialized as they are discovered by the kernel during cpu_startup, while pseudo-devices are initialized all at once when main calls the pdev_attach function for the device. if_mtu for a SLIP device is 296 bytes (SLMTU). This accommodates the standard 20-byte IP header, the standard 20-byte TCP header, and 256 bytes of user data (Section 5.3).

A SLIP network consists of two interfaces, one at each end of a serial communication line. slattach turns on IFF_POINTOPOINT, SC_AUTOCOMP, and IFF_MULTICAST in if_flags.

The SLIP interface limits the length of its output packet queue, if_snd, to 50 and its own internal queue, sc_fastq, to 32. Figure 3.42 shows that the length of the if_snd queue defaults to 50 (ifqmaxlen) if the driver does not select a length, so the initialization here is redundant.

The Ethernet driver doesn’t set its output queue length explicitly and relies on ifinit (Figure 3.42) to set it to the system default.

if_attach expects a pointer to an ifnet structure so slattach passes the address of sc_if, an ifnet structure and the first member of the sl_softc structure.

A special program, slattach, is run (from the /etc/netstart initialization file) after the kernel has been initialized and joins the SLIP interface and an asynchronous serial device by opening the serial device and issuing ioctl commands (Section 5.3).

153-155

For each SLIP device, slattach calls bpfattach to register the interface with BPF.

Loopback Initialization

Finally, we show the initialization for the single loopback interface. The loopback interface places any outgoing packets back on an appropriate input queue. There is no hardware device associated with the interface. The loopback pseudo-device is initialized when main calls loopattach indirectly through the pdev_attach pointer in the loopback’s pdevinit structure. Figure 3.31 shows the loopattach function.

Table 3.31. Loopback interface initialization.

--------------------------------------------------------------------------- if_loop.c
 41 void
 42 loopattach(n)
 43 int     n;
 44 {
 45     struct ifnet *ifp = &loif;

 46     ifp->if_name = "lo";
 47     ifp->if_mtu = LOMTU;
 48     ifp->if_flags = IFF_LOOPBACK | IFF_MULTICAST;
 49     ifp->if_ioctl = loioctl;
 50     ifp->if_output = looutput;
 51     ifp->if_type = IFT_LOOP;
 52     ifp->if_hdrlen = 0;
 53     ifp->if_addrlen = 0;
 54     if_attach(ifp);
 55     bpfattach(&ifp->if_bpf, ifp, DLT_NULL, sizeof(u_int));
 56 }
--------------------------------------------------------------------------- if_loop.c

41-56

The loopback if_mtu is set to 1536 bytes (LOMTU). In if_flags, IFF_LOOPBACK and IFF_MULTICAST are set. A loopback interface has no link header or hardware address, so if_hdrlen and if_addrlen are set to 0. if_attach finishes the initialization of the ifnet structure and bpfattach registers the loopback interface with BPF.

The loopback MTU should be at least 1576 (40 + 3 x 512) to leave room for a standard TCP/IP header. Solaris 2.3, for example, sets the loopback MTU to 8232 (40 + 8 x 1024). These calculations are biased toward the Internet protocols; other protocols may have default headers larger than 40 bytes.

if_attach Function

The three interface initialization functions shown earlier each call if_attach to complete initialization of the interface’s ifnet structure and to insert the structure on the list of previously configured interfaces. Also, in if_attach, the kernel initializes and assigns each interface a link-level address. Figure 3.32 illustrates the data structures constructed by if_attach.

Figure 3.32. ifnet list.

In Figure 3.32, if_attach has been called three times: from leattach with an le_softc structure, from slattach with an sl_softc structure, and from loopattach with a generic ifnet structure. Each time it is called it adds another ifnet structure to the ifnet list, creates a link-level ifaddr structure for the interface (which contains two sockaddr_dl structures, Figure 3.33), and initializes an entry in the ifnet_addrs array.

Table 3.33. sockaddr_dl structure.

--------------------------------------------------------------------------- if_dl.h
 55 struct sockaddr_dl {
 56     u_char  sdl_len;            /* Total length of sockaddr */
 57     u_char  sdl_family;         /* AF_LINK */
 58     u_short sdl_index;          /* if != 0, system given index for
 59                                    interface */
 60     u_char  sdl_type;           /* interface type (Figure 3.9) */
 61     u_char  sdl_nlen;           /* interface name length, no trailing 0
 62                                    reqd. */
 63     u_char  sdl_alen;           /* link level address length */
 64     u_char  sdl_slen;           /* link layer selector length */
 65     char    sdl_data[12];       /* minimum work area, can be larger;
 66                                    contains both if name and ll address */
 67 };

 68 #define LLADDR(s) ((caddr_t)((s)->sdl_data + (s)->sdl_nlen))
--------------------------------------------------------------------------- if_dl.h

The structures contained within le_softc[0] and sl_softc[0] are nested as shown in Figure 3.20.

After this initialization, the interfaces are configured only with link-level addresses. IP addresses, for example, are not configured until much later by the ifconfig program (Section 6.6).

The link-level address contains a logical address for the interface and a hardware address if supported by the network (e.g., a 48-bit Ethernet address for le0). The hardware address is used by ARP and the OSI protocols, while the logical address within a sockaddr_dl contains a name and numeric index for the interface within the kernel, which supports a table lookup for converting between an interface index and the associated ifaddr structure (ifa_ifwithnet, Figure 6.32).

The sockaddr_dl structure is shown in Figure 3.33.

55-57

Recall from Figure 3.18 that sdl_len specifies the length of the entire address and sdl_family specifies the address family, in this case AF_LINK.

58

sdl_index identifies the interface within the kernel. In Figure 3.32 the Ethernet interface would have an index of 1, the SLIP interface an index of 2, and the loopback interface an index of 3. The global integer if_index contains the last index assigned by the kernel.

60

sdl_type is initialized from the if_type member of the ifnet structure associated with this datalink address.

61-68

In addition to a numeric index, each interface has a text name formed from the if_name and if_unit members of the ifnet structure. For example, the first SLIP interface is called "sl0" and the second is called "sl1". The text name is stored at the front of the sdl_data array, and sdl_nlen is the length of this name in bytes (3 in our SLIP example).

The datalink address is also stored in the structure. The macro LLADDR converts a pointer to a sockaddr_dl structure into a pointer to the first byte beyond the text name. sdl_alen is the length of the hardware address. For an Ethernet device, the 48-bit hardware address appears in the sockaddr_dl structure beyond the text name. Figure 3.38 shows an initialized sockaddr_dl structure.
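The way the name and hardware address share sdl_data can be tried out in user space. The sketch below copies the field layout from Figure 3.33 into a hypothetical struct dl_addr; dl_set and DL_LLADDR are illustrative names, and the bounds check is an addition for safety rather than something the kernel code is shown to do.

```c
#include <assert.h>
#include <string.h>

/* Simplified copy of the sockaddr_dl layout from Figure 3.33. */
struct dl_addr {
    unsigned char  sdl_len;
    unsigned char  sdl_family;
    unsigned short sdl_index;
    unsigned char  sdl_type;
    unsigned char  sdl_nlen;     /* length of the text name */
    unsigned char  sdl_alen;     /* length of the hardware address */
    unsigned char  sdl_slen;
    char           sdl_data[12]; /* name, then hardware address */
};

/* Same idea as LLADDR: first byte beyond the text name. */
#define DL_LLADDR(s) ((s)->sdl_data + (s)->sdl_nlen)

/* Store a name and hardware address; return 0, or -1 if they do not fit. */
int dl_set(struct dl_addr *s, const char *name,
           const unsigned char *hw, size_t hwlen)
{
    size_t nlen = strlen(name);

    if (nlen + hwlen > sizeof(s->sdl_data))
        return -1;
    memcpy(s->sdl_data, name, nlen);          /* no trailing '\0' stored */
    s->sdl_nlen = (unsigned char) nlen;
    memcpy(DL_LLADDR(s), hw, hwlen);
    s->sdl_alen = (unsigned char) hwlen;
    return 0;
}
```

For the name "le0" and a 6-byte Ethernet address, the name occupies sdl_data[0] through sdl_data[2] and the address occupies sdl_data[3] through sdl_data[8].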

Net/3 does not use sdl_slen.

if_attach updates two global variables. The first, if_index, holds the index of the last interface in the system and the second, ifnet_addrs, points to an array of ifaddr pointers. Each entry in the array points to the link-level address of an interface. The array provides quick access to the link-level address for every interface in the system.

The if_attach function is long and consists of several tricky assignment statements. We describe it in four parts, starting with Figure 3.34.

Table 3.34. if_attach function: assign interface index.

------------------------------------------------------------------------------ if.c
 59 void
 60 if_attach(ifp)
 61 struct ifnet *ifp;
 62 {
 63     unsigned socksize, ifasize;
 64     int     namelen, unitlen, masklen, ether_output();
 65     char    workbuf[12], *unitname;
 66     struct ifnet **p = &ifnet;  /* head of interface list */
 67     struct sockaddr_dl *sdl;
 68     struct ifaddr *ifa;
 69     static int if_indexlim = 8; /* size of ifnet_addrs array */
 70     extern void link_rtrequest();

 71     while (*p)                  /* find end of interface list */
 72         p = &((*p)->if_next);
 73     *p = ifp;
 74     ifp->if_index = ++if_index; /* assign next index */

 75     /* resize ifnet_addrs array if necessary */
 76     if (ifnet_addrs == 0 || if_index >= if_indexlim) {
 77         unsigned n = (if_indexlim <<= 1) * sizeof(ifa);
 78         struct ifaddr **q = (struct ifaddr **)
 79                     malloc(n, M_IFADDR, M_WAITOK);

 80         if (ifnet_addrs) {
 81             bcopy((caddr_t) ifnet_addrs, (caddr_t) q, n / 2);
 82             free((caddr_t) ifnet_addrs, M_IFADDR);
 83         }
 84         ifnet_addrs = q;
 85     }
------------------------------------------------------------------------------ if.c

59-74

if_attach has a single argument, ifp, a pointer to the ifnet structure that has been initialized by a network device driver. Net/3 keeps all the ifnet structures on a linked list headed by the global pointer ifnet. The while loop locates the end of the list and saves the address of the null pointer at the end of the list in p. After the loop, the new ifnet structure is attached to the end of the ifnet list, if_index is incremented, and the new index is assigned to ifp->if_index.
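The pointer-to-pointer idiom used by the while loop is worth isolating. In this sketch struct node, list_head, last_index, and list_attach are hypothetical stand-ins for struct ifnet, the global ifnet pointer, if_index, and the first few lines of if_attach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node standing in for struct ifnet. */
struct node {
    struct node *n_next;
    int          n_index;
};

struct node *list_head;   /* plays the role of the global ifnet pointer */
int          last_index;  /* plays the role of if_index */

/* Append np to the list and give it the next index. */
void list_attach(struct node *np)
{
    struct node **p = &list_head;

    while (*p)                  /* walk to the null pointer at the end */
        p = &(*p)->n_next;
    *p = np;                    /* overwrite that null pointer */
    np->n_next = NULL;          /* added here; the sketch does not rely on the caller */
    np->n_index = ++last_index;
}
```

Because p always holds the address of a pointer, the empty-list case and the general case are handled by the same assignment, with no special test for a null head.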

C Language Note: Notice that the same name, ifnet, is used for the variable and the type (in this case a structure name) of the variable. This is legal C and we’ll see it a lot in Net/3.

Resize ifnet_addrs array if necessary

75-85

The first time through if_attach, the ifnet_addrs array doesn’t exist so space for 16 entries (16 = 8 << 1) is allocated. When the array becomes full, a new array of twice the size is allocated and the entries from the old array are copied to the new array.

if_indexlim is a static variable private to if_attach. if_indexlim is updated by the <<= operator.

The malloc and free functions in Figure 3.34 are not the standard C library functions of the same name. The second argument in the kernel versions specifies a type, which is used by optional diagnostic code in the kernel to detect programming errors. If the third argument to malloc is M_WAITOK, the function blocks the calling process if it needs to wait for free memory to become available. If the third argument is M_DONTWAIT, the function does not block and returns a null pointer when no memory is available.
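The doubling strategy can be reproduced with the standard C library. This sketch differs from Figure 3.34 in ways that should be kept in mind: it uses calloc so that unused slots start out null, it reports failure instead of blocking, and slot_table, slot_limit, and slot_reserve are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

void  **slot_table;        /* plays the role of ifnet_addrs */
size_t  slot_limit = 8;    /* plays the role of if_indexlim */

/*
 * Make sure slot_table can hold an entry for index (1-based).
 * Returns 0 on success, -1 if memory is exhausted.
 */
int slot_reserve(size_t index)
{
    while (slot_table == NULL || index >= slot_limit) {
        size_t  newlimit = slot_limit << 1;
        void  **q = calloc(newlimit, sizeof(*q));

        if (q == NULL)
            return -1;
        if (slot_table != NULL) {
            /* copy the old entries, then release the old array */
            memcpy(q, slot_table, slot_limit * sizeof(*q));
            free(slot_table);
        }
        slot_table = q;
        slot_limit = newlimit;
    }
    return 0;
}
```

As in if_attach, the first call allocates twice the initial limit (16 slots), and each later resize doubles the array again.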

The next section of if_attach, shown in Figure 3.35, prepares a text name for the interface and computes the size of the link-level address.

Table 3.35. if_attach function: compute size of link-level address.

----------------------------------------------------------------------------- if.c
 86     /* create a Link Level name for this device */
 87     unitname = sprint_d((u_int) ifp->if_unit, workbuf, sizeof(workbuf));
 88     namelen = strlen(ifp->if_name);
 89     unitlen = strlen(unitname);

 90     /* compute size of sockaddr_dl structure for this device */
 91 #define _offsetof(t, m) ((int)((caddr_t)&((t *)0)->m))
 92     masklen = _offsetof(struct sockaddr_dl, sdl_data[0]) +
 93             unitlen + namelen;
 94     socksize = masklen + ifp->if_addrlen;
 95 #define ROUNDUP(a) (1 + (((a) - 1) | (sizeof(long) - 1)))
 96     socksize = ROUNDUP(socksize);
 97     if (socksize < sizeof(*sdl))
 98         socksize = sizeof(*sdl);
 99     ifasize = sizeof(*ifa) + 2 * socksize;
----------------------------------------------------------------------------- if.c

Create link-level name and compute size of link-level address

86-99

if_attach constructs the name of the interface from if_unit and if_name. The function sprint_d converts the numeric value of if_unit to a string stored in workbuf. masklen is the number of bytes occupied by the information before sdl_data in the sockaddr_dl structure plus the size of the text name for the interface (namelen + unitlen). The function rounds socksize, which is masklen plus the hardware address length (if_addrlen), up to the boundary of a long integer (ROUNDUP). If this is less than the size of a sockaddr_dl structure, the standard sockaddr_dl size is used. ifasize is the size of an ifaddr structure plus two times socksize, so it can hold both sockaddr_dl structures.
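The size computation is easier to follow with the platform-dependent pieces turned into parameters. In this sketch round_up and link_sockaddr_size are hypothetical helpers; the alignment is passed in rather than taken from sizeof(long), and the caller supplies the offset of sdl_data and the size of the standard structure (8 and 20 bytes for the layout in Figure 3.33, assuming no padding).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round a up to a multiple of align (a power of two), as ROUNDUP does. */
size_t round_up(size_t a, size_t align)
{
    assert(a > 0 && align > 0 && (align & (align - 1)) == 0);
    return 1 + ((a - 1) | (align - 1));
}

/*
 * Size needed for one link-level sockaddr.
 *   hdrlen  - bytes that precede the name (offset of sdl_data)
 *   name    - full interface name, e.g. "le0"
 *   addrlen - hardware address length
 *   minsize - size of the standard structure, used as a lower bound
 *   align   - alignment boundary in bytes (a power of two)
 */
size_t link_sockaddr_size(size_t hdrlen, const char *name,
                          size_t addrlen, size_t minsize, size_t align)
{
    size_t masklen  = hdrlen + strlen(name);
    size_t socksize = round_up(masklen + addrlen, align);

    return socksize < minsize ? minsize : socksize;
}
```

With an 8-byte header, the name "le0", and a 4-byte alignment, masklen is 11 and the rounded size is 12, which is below the 20-byte minimum, so the standard size is used.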

In the next section, if_attach allocates and links the structures together, as shown in Figure 3.36.

Figure 3.36. The link-level address and mask assigned during if_attach.

In Figure 3.36 there is a gap between the ifaddr structure and the two sockaddr_dl structures to illustrate that they are allocated in a contiguous area of memory but that they are not defined by a single C structure.

The organization shown in Figure 3.36 is repeated in the in_ifaddr structure; the pointers in the generic ifaddr portion of the structure point to specialized sockaddr structures allocated in the device-specific portion of the structure, in this case, sockaddr_dl structures. Figure 3.37 shows the initialization of these structures.

Table 3.37. if_attach function: allocate and initialize link-level address.

----------------------------------------------------------------------------- if.c
100     if (ifa = (struct ifaddr *) malloc(ifasize, M_IFADDR, M_WAITOK)) {
101         bzero((caddr_t) ifa, ifasize);

102         /* First: initialize the sockaddr_dl address */
103         sdl = (struct sockaddr_dl *) (ifa + 1);
104         sdl->sdl_len = socksize;
105         sdl->sdl_family = AF_LINK;
106         bcopy(ifp->if_name, sdl->sdl_data, namelen);
107         bcopy(unitname, namelen + (caddr_t) sdl->sdl_data, unitlen);
108         sdl->sdl_nlen = (namelen += unitlen);
109         sdl->sdl_index = ifp->if_index;
110         sdl->sdl_type = ifp->if_type;
111         ifnet_addrs[if_index - 1] = ifa;
112         ifa->ifa_ifp = ifp;
113         ifa->ifa_next = ifp->if_addrlist;
114         ifa->ifa_rtrequest = link_rtrequest;
115         ifp->if_addrlist = ifa;
116         ifa->ifa_addr = (struct sockaddr *) sdl;

117         /* Second: initialize the sockaddr_dl mask */
118         sdl = (struct sockaddr_dl *) (socksize + (caddr_t) sdl);
119         ifa->ifa_netmask = (struct sockaddr *) sdl;
120         sdl->sdl_len = masklen;
121         while (namelen != 0)
122             sdl->sdl_data[--namelen] = 0xff;
123     }
----------------------------------------------------------------------------- if.c

The address

100-116

If enough memory is available, bzero fills the new structure with 0s and sdl points to the first sockaddr_dl just after the ifaddr structure. If no memory is available, the code is skipped.

sdl_len is set to the length of the sockaddr_dl structure, and sdl_family is set to AF_LINK. A text name is constructed within sdl_data from if_name and unitname, and the length is saved in sdl_nlen. The interface’s index is copied into sdl_index and the interface type into sdl_type. The allocated structure is inserted into the ifnet_addrs array and linked to the ifnet structure by ifa_ifp and if_addrlist. Finally, the sockaddr_dl structure is connected to the ifaddr structure with ifa_addr. Ethernet interfaces replace the default function, link_rtrequest, with arp_rtrequest. The loopback interface installs loop_rtrequest. We describe ifa_rtrequest and arp_rtrequest in Chapters 19 and 21, and link_rtrequest in Chapter 18. loop_rtrequest is left for readers to investigate on their own. This completes the initialization of the first sockaddr_dl structure.

The mask

117-123

The second sockaddr_dl structure is a bit mask that selects the text name that appears in the first structure. ifa_netmask from the ifaddr structure points to the mask structure (which in this case selects the interface text name and not a network mask). The while loop turns on the bits in the bytes corresponding to the name.
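To see what it means for the mask to select the text name, consider the following sketch. make_name_mask repeats the while loop from Figure 3.37 on a bare byte array; masked_equal is a hypothetical comparison added only to show the effect of the mask and is not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DATA_LEN 12   /* size of sdl_data in Figure 3.33 */

/* Turn on every bit in the first namelen bytes, as the while loop does. */
void make_name_mask(unsigned char mask[DATA_LEN], size_t namelen)
{
    assert(namelen <= DATA_LEN);
    memset(mask, 0, DATA_LEN);
    while (namelen != 0)
        mask[--namelen] = 0xff;
}

/* Compare two data areas, looking only at the bits the mask selects. */
int masked_equal(const unsigned char *a, const unsigned char *b,
                 const unsigned char *mask)
{
    size_t i;

    for (i = 0; i < DATA_LEN; i++)
        if ((a[i] & mask[i]) != (b[i] & mask[i]))
            return 0;
    return 1;
}
```

Under a 3-byte mask, two data areas that both begin with "le0" compare equal no matter which hardware address bytes follow the name.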

Figure 3.38 shows the two initialized sockaddr_dl structures for our example Ethernet interface, where if_name is "le", if_unit is 0, and if_index is 1.

Figure 3.38. The initialized Ethernet sockaddr_dl structures (sdl_ prefix omitted).

In Figure 3.38, the address is shown after ether_ifattach has done additional initialization of the structure (Figure 3.41).

Figure 3.39 shows the structures after the first interface has been attached by if_attach.

Figure 3.39. The ifnet and sockaddr_dl structures after if_attach is called for the first time.

At the end of if_attach, the ether_ifattach function is called for Ethernet devices, as shown in Figure 3.40.

Figure 3.40. if_attach function: Ethernet initialization.

----------------------------------------------------------------------------- if.c
124     /* XXX -- Temporary fix before changing 10 ethernet drivers */
125     if (ifp->if_output == ether_output)
126         ether_ifattach(ifp);
127 }
----------------------------------------------------------------------------- if.c

124-127

ether_ifattach isn’t called earlier (from leattach, for example) because it copies the Ethernet hardware address into the sockaddr_dl allocated by if_attach.

The XXX comment indicates that the author found it easier to insert the code here once than to modify all the Ethernet drivers.

ether_ifattach function

The ether_ifattach function performs the ifnet structure initialization common to all Ethernet devices.

Figure 3.41. ether_ifattach function.

----------------------------------------------------------------------- if_ethersubr.c
338 void
339 ether_ifattach(ifp)
340 struct ifnet *ifp;
341 {
342     struct ifaddr *ifa;
343     struct sockaddr_dl *sdl;

344     ifp->if_type = IFT_ETHER;
345     ifp->if_addrlen = 6;
346     ifp->if_hdrlen = 14;
347     ifp->if_mtu = ETHERMTU;
348     for (ifa = ifp->if_addrlist; ifa; ifa = ifa->ifa_next)
349         if ((sdl = (struct sockaddr_dl *) ifa->ifa_addr) &&
350             sdl->sdl_family == AF_LINK) {
351             sdl->sdl_type = IFT_ETHER;
352             sdl->sdl_alen = ifp->if_addrlen;
353             bcopy((caddr_t) ((struct arpcom *) ifp)->ac_enaddr,
354                   LLADDR(sdl), ifp->if_addrlen);
355             break;
356         }
357 }
----------------------------------------------------------------------- if_ethersubr.c

338-357

For an Ethernet device, if_type is IFT_ETHER, the hardware address is 6 bytes long, the entire Ethernet header is 14 bytes in length, and the Ethernet MTU is 1500 (ETHERMTU).

The MTU was already assigned by leattach, but other Ethernet device drivers may not have performed this initialization.

Section 4.3 discusses the Ethernet frame organization in more detail. The for loop locates the link-level address for the interface and then initializes the Ethernet hardware address information in the sockaddr_dl structure. The Ethernet address that was copied into the arpcom structure during system initialization is now copied into the link-level address.
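To make the address copy concrete, here is a small user-space sketch of how the hardware address lands immediately after the text name inside the data area. The names demo_link_addr, DEMO_LLADDR, and demo_set_hwaddr are hypothetical; DEMO_LLADDR is modeled on our understanding of the kernel's LLADDR macro (the start of the data area plus the name length), and the structure is reduced to the fields the example needs.

```c
#include <assert.h>
#include <string.h>

#define DEMO_ETHER_ADDR_LEN 6   /* Ethernet hardware addresses are 6 bytes */

/* Hypothetical, reduced stand-in for sockaddr_dl. */
struct demo_link_addr {
    unsigned char nlen;      /* plays the role of sdl_nlen */
    unsigned char alen;      /* plays the role of sdl_alen */
    unsigned char data[12];  /* plays the role of sdl_data */
};

/* The link-level address starts right after the text name. */
#define DEMO_LLADDR(s) ((s)->data + (s)->nlen)

/* Copy a 6-byte hardware address into place, as ether_ifattach does. */
void demo_set_hwaddr(struct demo_link_addr *sdl, const unsigned char *enaddr)
{
    /* The name and the address must both fit in the data area. */
    assert((size_t)sdl->nlen + DEMO_ETHER_ADDR_LEN <= sizeof sdl->data);

    sdl->alen = DEMO_ETHER_ADDR_LEN;
    memcpy(DEMO_LLADDR(sdl), enaddr, DEMO_ETHER_ADDR_LEN);
}
```

With a three-byte name such as "le0" already stored, the six address bytes occupy data[3] through data[8], and the name itself is left untouched.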

ifinit Function

After the interface structures are initialized and linked together, main (Figure 3.23) calls ifinit, shown in Figure 3.42.

Figure 3.42. ifinit function.

--------------------------------------------------------------------------- if.c
 43 void
 44 ifinit()
 45 {
 46     struct ifnet *ifp;

 47     for (ifp = ifnet; ifp; ifp = ifp->if_next)
 48         if (ifp->if_snd.ifq_maxlen == 0)
 49             ifp->if_snd.ifq_maxlen = ifqmaxlen;     /* set default length */
 50     if_slowtimo(0);
 51 }
--------------------------------------------------------------------------- if.c

43-51

The for loop traverses the interface list and sets the maximum size of each interface output queue to 50 (ifqmaxlen) if it hasn’t already been set by the interface’s attach function.

An important consideration for the size of the output queue is the number of packets required to send a maximum-sized datagram. For Ethernet, if a process calls sendto with 65,507 bytes of data, it is fragmented into 45 fragments and each fragment is put onto the interface output queue. If the queue were much smaller, the process could never send that large a datagram, as the queue wouldn’t have room.
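The fragment count quoted above can be checked with a short calculation. The sketch below assumes a 20-byte IP header without options and an 8-byte UDP header; demo_fragment_count and the DEMO_ constants are hypothetical names introduced for illustration.

```c
#include <assert.h>

#define DEMO_IP_HDR_LEN  20  /* IP header without options (assumption) */
#define DEMO_UDP_HDR_LEN 8   /* UDP header */

/*
 * Number of IP fragments needed to send udp_data_len bytes of UDP data
 * over a link with the given MTU.
 */
int demo_fragment_count(int udp_data_len, int mtu)
{
    int per_frag, payload;

    /* Each fragment must carry at least one 8-byte unit of data. */
    assert(mtu >= DEMO_IP_HDR_LEN + 8);
    assert(udp_data_len >= 0);

    /* Fragment data must be a multiple of 8 bytes (except in the last). */
    per_frag = ((mtu - DEMO_IP_HDR_LEN) / 8) * 8;

    /* The UDP header travels as part of the IP payload. */
    payload = udp_data_len + DEMO_UDP_HDR_LEN;

    return (payload + per_frag - 1) / per_frag;   /* round up */
}
```

For an Ethernet MTU of 1500, each fragment carries 1480 bytes, so 65,507 bytes of data plus the 8-byte UDP header (65,515 bytes) require 45 fragments, which fits within the default queue limit of 50.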

if_slowtimo starts the interface watchdog timers. When an interface timer expires, the kernel calls the watchdog function for the interface. An interface can reset the timer periodically to prevent the watchdog function from being called, or set if_timer to 0 if the watchdog function is not needed. Figure 3.43 shows the if_slowtimo function.

Figure 3.43. if_slowtimo function.

---------------------------------------------------------------------------  if.c
338 void
339 if_slowtimo(arg)
340 void   *arg;
341 {
342     struct ifnet *ifp;
343     int     s = splimp();

344     for (ifp = ifnet; ifp; ifp = ifp->if_next) {
345         if (ifp->if_timer == 0 || --ifp->if_timer)
346             continue;
347         if (ifp->if_watchdog)
348             (*ifp->if_watchdog) (ifp->if_unit);
349     }
350     splx(s);
351     timeout(if_slowtimo, (void *) 0, hz / IFNET_SLOWHZ);
352 }
---------------------------------------------------------------------------  if.c

338-343

The single argument, arg, is not used but is required by the prototype for the slow timeout functions (Section 7.4).

344-352

if_slowtimo ignores interfaces with if_timer equal to 0. If if_timer does not equal 0, if_slowtimo decrements it and calls the if_watchdog function associated with the interface when the timer reaches 0. Packet processing is blocked by splimp during if_slowtimo. Before returning, if_slowtimo calls timeout to schedule a call to itself in hz/IFNET_SLOWHZ clock ticks. hz is the number of clock ticks that occur in 1 second (often 100). It is set at system initialization and remains constant thereafter. Since IFNET_SLOWHZ is defined to be 1, the kernel calls if_slowtimo once every hz clock ticks, which is once per second.
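The countdown logic can be exercised outside the kernel. The following sketch reproduces only the loop body of if_slowtimo for a hypothetical demo_ifnet list. It omits splimp, splx, and the rescheduling through timeout, and all demo_ names are assumptions introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct ifnet. */
struct demo_ifnet {
    struct demo_ifnet *if_next;     /* next interface in the list */
    int    if_unit;                 /* unit number passed to the watchdog */
    short  if_timer;                /* seconds until the watchdog fires */
    void (*if_watchdog)(int unit);  /* called when if_timer reaches 0 */
};

int demo_watchdog_calls;            /* counts watchdog invocations */

void demo_watchdog(int unit)
{
    (void)unit;
    demo_watchdog_calls++;
}

/* One pass over the list, equivalent to one tick of if_slowtimo. */
void demo_slowtimo_pass(struct demo_ifnet *list)
{
    struct demo_ifnet *ifp;

    for (ifp = list; ifp; ifp = ifp->if_next) {
        assert(ifp->if_timer >= 0);
        /* Skip disabled timers and timers that have not yet expired. */
        if (ifp->if_timer == 0 || --ifp->if_timer)
            continue;
        if (ifp->if_watchdog)
            (*ifp->if_watchdog)(ifp->if_unit);
    }
}
```

An interface with if_timer set to 2 is skipped on the first pass (the timer drops to 1), has its watchdog called on the second pass (the timer reaches 0), and is ignored on later passes until the driver sets the timer again.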

The functions scheduled by the timeout function are called back by the kernel’s callout function. See [Leffler et al. 1989] for additional details.

Summary

In this chapter we have examined the ifnet and ifaddr structures that are allocated for each network interface found at system initialization time. The ifnet structures are linked into the ifnet list. The link-level address for each interface is initialized, attached to the ifnet structure’s address list, and entered into the ifnet_addrs array.

We discussed the generic sockaddr structure and its sa_family and sa_len members, which specify the type and length of every address. We also looked at the initialization of the sockaddr_dl structure for a link-level address.

In this chapter, we introduced the three example network interfaces that we use throughout the book.

Exercises

3.1

The netstat program on many Unix systems lists network interfaces and their configuration. Try netstat -i on a system you have access to. What are the names (if_name) and maximum transmission units (if_mtu) of the network interfaces?

3.2

In if_slowtimo (Figure 3.43) the splimp and splx calls appear outside the loop. What are the advantages and disadvantages of this arrangement compared with placing the calls within the loop?

3.3

Why is SLIP’s interactive queue shorter than SLIP’s standard output queue?

Answer: A large interactive queue would defeat the purpose of the queue by delaying new interactive traffic behind the existing interactive data.

3.4

Why aren’t if_hdrlen and if_addrlen initialized in slattach?

Answer: Since the sl_softc structures are all declared as global variables, they are initialized to 0 when the kernel starts.

3.5

Draw a picture similar to Figure 3.38 for the SLIP and loopback devices.
