Appendix C. RFC 1122 Compliance

This appendix summarizes the compliance of the Net/3 implementation with RFC 1122 [Braden 1989a]. The RFC groups its requirements into four categories:

  • link layer

  • internet layer

  • UDP

  • TCP

We have chosen to present these requirements in the same breakdown and order as the chapters of this text.

Link-Layer Requirements

This section summarizes the link-layer requirements from Section 2.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • May support trailer encapsulation.

    Partially: Net/3 does not send IP datagrams with trailer encapsulation but some Net/3 device drivers may be able to receive such datagrams. We have omitted all the trailer encapsulation code in this text. Interested readers are referred to RFC 893 and Section 11.8 of [Leffler et al. 1989] for additional details.

  • Must not send trailers by default without negotiation.

    Not applicable: Net/2 would negotiate the use of trailers but Net/3 ignores requests to send trailers and does not request trailers itself.

  • Must be able to send and receive RFC 894 Ethernet encapsulation.

    Yes: Net/3 supports RFC 894 Ethernet encapsulation.

  • Should be able to receive RFC 1042 (IEEE 802) encapsulation.

    No: Net/3 processes packets received with 802.3 encapsulation but only for use with OSI protocols. IP packets that arrive with 802.3 encapsulation are discarded by ether_input (Figure 4.13).

  • May send RFC 1042 encapsulation, in which case there must be a software configuration switch to select the encapsulation method and RFC 894 must be the default.

    No: Net/3 does not send IP packets in RFC 1042 encapsulation.

  • Must report link-layer broadcasts to the IP layer.

    Yes: The link layer reports link-layer broadcasts by setting the M_BCAST flag (or the M_MCAST flag for multicasts) in the mbuf packet header.

  • Must pass the IP TOS value to the link layer.

    Yes: The TOS value is not passed explicitly, but is part of the IP header available to the link layer.
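
To make the difference between the two Ethernet encapsulations concrete, the following sketch classifies a received frame by its 2-byte type/length field. It assumes the usual convention that a value no larger than the 1500-byte Ethernet MTU is an 802.3 length and that larger values are RFC 894 type codes. The function name and its return strings are hypothetical; this is an illustration, not the Net/3 ether_input code.

```python
import struct

ETHERMTU = 1500          # largest value that can be an 802.3 length
ETHERTYPE_IP = 0x0800    # RFC 894 type code for IP
ETHERTYPE_ARP = 0x0806   # RFC 894 type code for ARP


def classify_frame(frame: bytes) -> str:
    """Classify an Ethernet frame by its type/length field (hypothetical helper)."""
    if len(frame) < 14:
        return "too short"
    # Bytes 12-13 follow the 6-byte destination and 6-byte source addresses.
    (type_or_len,) = struct.unpack("!H", frame[12:14])
    if type_or_len == ETHERTYPE_IP:
        return "RFC 894 IP"
    if type_or_len == ETHERTYPE_ARP:
        return "RFC 894 ARP"
    if type_or_len <= ETHERMTU:
        # A length, not a type: 802.3 encapsulation with an LLC header.
        return "802.3 (length field)"
    return "unknown type"


addresses = bytes(6) + bytes(6)   # dummy destination and source addresses
print(classify_frame(addresses + struct.pack("!H", 0x0800) + b"payload"))
print(classify_frame(addresses + struct.pack("!H", 46) + bytes(46)))
```

Given the behavior described above, an IP packet arriving in a frame of the second kind would be discarded rather than passed to the IP layer.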

IP Requirements

This section summarizes the IP requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Must implement IP and ICMP.

    Yes: inetsw[0] implements the IP protocol and inetsw[4] implements ICMP.

  • Must handle remote multihoming in application layer.

    Yes: The kernel is unaware of communication to remote multihomed hosts and neither hinders nor supports such communication by an application.

  • May support local multihoming.

    Yes: Net/3 supports multiple IP interfaces with the ifnet list and multiple addresses per IP interface with the ifaddr list for each ifnet structure.

  • Must meet router specifications if forwarding datagrams.

    Partially: See Chapter 18 for a discussion of the router requirements.

  • Must provide configuration switch for embedded router functionality. The switch must default to host operation.

    Yes: The ipforwarding variable defaults to false and controls the IP packet forwarding mechanism in Net/3.

  • Must not enable routing based on number of interfaces.

    Yes: The if_attach function does not modify ipforwarding according to the number of interfaces configured at system initialization time.

  • Should log discarded datagrams, including the contents of the datagram, and record the event in a statistics counter.

    Partially: Net/3 does not provide a mechanism for logging the contents of discarded datagrams but maintains a variety of statistics counters.

  • Must silently discard datagrams that arrive with an IP version other than 4.

    Yes: ipintr implements this requirement.

  • Must verify IP checksum and silently discard an invalid datagram.

    Yes: ipintr calls in_cksum and implements this requirement.

  • Must support subnet addressing (RFC 950).

    Yes: Every IP address has an associated subnet mask in the in_ifaddr structure.

  • Must transmit packets with host’s own IP address as the source address.

    Partially: When the transport layer sends an IP datagram with all-0 bits as the source address, IP inserts the IP address of the outgoing interface in its place. A process can, however, bind one of the local IP broadcast addresses to the local socket, and IP will then transmit the datagram with that invalid source address.

  • Must silently discard datagrams not destined for the host.

    Yes: If the system is not configured as a router, ipintr discards datagrams that arrive with a bad destination address (i.e., an unrecognized unicast, broadcast, or multicast address).

  • Must silently discard datagrams with bad source address (nonunicast address).

    No: ipintr does not examine the source address of incoming datagrams before delivering the datagram to the transport protocols.

  • Must support reassembly.

    Yes: ip_reass implements reassembly.

  • May retain same ID field in identical datagrams.

    No: ip_output assigns a new ID to every outgoing datagram and does not allow the ID to be specified by the transport protocols. See Chapter 32.

  • Must allow the transport layer to set TOS.

    Yes: ip_output accepts any TOS value set in the IP header by the transport protocols. The transport layer must default TOS to all 0s. The TOS value for a particular datagram or connection may be set by the application through the IP_TOS socket option.

  • Must pass received TOS up to transport layer.

    Yes: Net/3 preserves the TOS field during input processing. The entire IP header is made available to the transport layer when IP calls the pr_input function for the receiving protocol. Unfortunately, the UDP and TCP transport layers ignore it.

  • Should not use RFC 795 [Postel 1981d] link-layer mappings for TOS.

    Yes: Net/3 does not use these mappings.

  • Must not send packet with TTL of 0.

    Partially: The IP layer (ip_output) in Net/3 does not check this requirement and relies on the transport layers not to construct an IP header with a TTL of 0. UDP, TCP, ICMP, and IGMP all select a nonzero TTL default value. The default value can be overridden by the IP_TTL option.

  • Must not discard received packets with a TTL less than 2.

    Yes: If the system is the final destination of the packet, ipintr accepts it regardless of the TTL value. The TTL is examined only when the packet is being forwarded.

  • Must allow transport layer to set TTL.

    Yes: The transport layer must set TTL before calling ip_output.

  • Must enable configuration of a fixed TTL.

    Yes: The default TTL is specified by the global integer ip_defttl, which defaults to 64 (IPDEFTTL). Both UDP and TCP use this value unless the IP_TTL socket option has specified a different value for a particular socket. ip_defttl can be modified through the IPCTL_DEFTTL name for sysctl.
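
Several of the entries above mention the IP_TOS and IP_TTL socket options. As a minimal sketch of how a process might use them, the following fragment sets both on a UDP socket and reads them back. It uses the Python standard socket module on a modern system, not Net/3 itself, and the values 32 and 0x10 are chosen only for illustration.

```python
import socket

# A UDP socket is enough to exercise the IP-level options.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Override the system default TTL for this socket only.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, 32)

# Request the "minimize delay" TOS value (0x10) for this socket's datagrams.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, 0x10)

# Read both values back to confirm that they were stored per socket.
ttl = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print("TTL:", ttl, "TOS:", hex(tos))
sock.close()
```

Other sockets on the same host are unaffected and continue to use the system default TTL.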

Multihoming

  • Should select, as the source address for a reply, the specific address received as the destination address of the request.

    Yes: Responses generated by the kernel (ICMP reply messages) include the correct source address (Section C.5). Responses generated by the transport protocols are described in their respective chapters.

  • Must allow application to choose local IP address.

    Yes: An application can bind a socket to a specific local IP address (Section 15.8).

  • May silently discard datagrams addressed to an interface other than the one on which it is received.

    No: Net/3 implements the weak end system model and ipintr accepts such packets.

  • May require packets to exit the system through the interface with an IP address that corresponds to the source address of the packet. This requirement pertains only to packets that are not source routed.

    No: Net/3 allows packets to exit the system through any interface, another weak end system characteristic.
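
The difference between the weak and strong end system models on input can be sketched as follows. The interface names, the addresses (taken from the documentation address ranges), and the accepts function are all hypothetical; the weak branch only mirrors the behavior attributed to ipintr above.

```python
from ipaddress import IPv4Address

# Hypothetical host with two interfaces, one address each.
INTERFACES = {
    "le0": IPv4Address("192.0.2.1"),
    "sl0": IPv4Address("198.51.100.1"),
}


def accepts(dst: str, rcvif: str, strong: bool) -> bool:
    """Return True if a unicast datagram to dst arriving on rcvif is accepted."""
    addr = IPv4Address(dst)
    if strong:
        # Strong model: dst must be the address of the receiving interface.
        return INTERFACES[rcvif] == addr
    # Weak model: dst may be any address configured on the host.
    return addr in INTERFACES.values()


# A datagram addressed to the sl0 address arrives on le0.
print(accepts("198.51.100.1", "le0", strong=False))  # weak model: True
print(accepts("198.51.100.1", "le0", strong=True))   # strong model: False
```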

Broadcast

  • Must not select an IP broadcast address as a source address.

    Partially: If an application explicitly selects a source address, the IP layer does not override the selection. Otherwise, IP selects as a source address the specific IP address associated with the outgoing interface.

  • Should accept an all-0s or all-1s broadcast address.

    Yes: ipintr accepts packets sent to either address.

  • May support a configurable option to send all 0s or all 1s as the broadcast address on an interface. If provided, the configurable broadcast address must default to all 1s.

    No: A process must explicitly send to either the all-0s (INADDR_ANY) or all-1s broadcast address (INADDR_BROADCAST). There is no configurable default.

  • Must recognize all broadcast address formats.

    Yes: ipintr recognizes the limited (all-1s and all-0s) and the network-directed and subnet-directed broadcast addresses.

  • Must use an IP broadcast or IP multicast destination address in a link-layer broadcast.

    Yes: ip_output enables the link-layer multicast or broadcast flags only when the destination is an IP multicast or broadcast address.

  • Should silently discard link-layer broadcasts when the packet does not specify an IP broadcast address as its destination.

    No: There is no explicit test for the M_BCAST or M_MCAST flags on incoming packets in Net/3, but ip_forward will discard these packets before forwarding them.

  • Should use limited broadcast address for connected networks.

    Partially: The decision to use the limited broadcast address (versus a subnet-directed or network-directed broadcast) is left to the application level by Net/3.
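
The broadcast address formats named in this section can be told apart with a few comparisons. The sketch below does this for a single interface using the Python ipaddress module. The helper names and the example interface 172.16.13.33/27 are assumptions made for illustration, and the code does not reproduce the tests made by the kernel.

```python
from ipaddress import IPv4Address, IPv4Interface, IPv4Network


def classful_network(addr: IPv4Address) -> IPv4Network:
    """Return the class A, B, or C network containing addr (hypothetical helper).

    Interface addresses are assumed to be class A, B, or C.
    """
    first = int(addr) >> 24
    prefix = 8 if first < 128 else 16 if first < 192 else 24
    return IPv4Network(f"{addr}/{prefix}", strict=False)


def broadcast_kind(dst: str, iface: str) -> str:
    """Classify dst relative to an interface given as 'address/prefixlen'."""
    d = IPv4Address(dst)
    ifc = IPv4Interface(iface)
    if int(d) == 0xFFFFFFFF:
        return "limited (all 1s)"
    if int(d) == 0:
        return "limited (all 0s)"
    if d == ifc.network.broadcast_address:
        return "subnet-directed"
    if d == classful_network(ifc.ip).broadcast_address:
        return "network-directed"
    return "not a broadcast"


IFACE = "172.16.13.33/27"   # class B address with a 27-bit subnet mask
for dst in ("255.255.255.255", "0.0.0.0", "172.16.13.63",
            "172.16.255.255", "172.16.13.34"):
    print(dst, "->", broadcast_kind(dst, IFACE))
```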

IP Interface

  • Must allow transport layer to use all IP mechanisms (e.g., IP options, TTL, TOS).

    Yes: All the IP mechanisms are available to the transport layer in Net/3.

  • Must pass interface identification up to transport layer.

    Yes: The m_pkthdr.rcvif member of each mbuf containing an incoming packet points to the ifnet structure of the interface that received the packet.

  • Must pass all IP options to transport layer.

    Yes: The entire IP header, including options, is present in the packet passed to the pr_input function of the receiving transport protocol by ipintr.

  • Must allow transport layer to send ICMP port unreachable and any of the ICMP query messages.

    Yes: The transport layer may send any ICMP error messages by calling icmp_error or may format and send any type of IP datagram by calling the ip_output function.

  • Must pass the following ICMP messages to the transport layer: destination unreachable, source quench, echo reply, timestamp reply, and time exceeded.

    Yes: These messages are distributed by ICMP to other transport protocols or to any waiting processes using the raw IP socket mechanism.

  • Must include contents of ICMP message (IP header plus the data bytes present) in ICMP message passed to the transport layer.

    Yes: icmp_input passes the portion of the original IP packet contained within the ICMP message to the transport layers.

  • Should be able to leap tall buildings at a single bound.

    No: The next version of IP may meet this requirement.

IP Options Requirements

This section summarizes the IP option processing requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Must allow transport layer to send IP options.

    Yes: The second argument to ip_output is a list of IP options to include in the outgoing IP datagram.

  • Must pass all IP options received to higher layer.

    Yes: The IP header and options are passed to the pr_input function of the receiving transport protocol.

  • Must silently ignore unknown options.

    Yes: The default case in ip_dooptions skips over unknown options.

  • May support the security option.

    No: Net/3 does not support the IP security option.

  • Should not send the stream identifier option and must ignore it in received datagrams.

    Yes: Net/3 does not support the stream identifier option and ignores it on incoming datagrams.

  • May support the record route option.

    Yes: Net/3 supports the record route option.

  • May support the timestamp option.

    Partially: Net/3 supports the timestamp option but does not implement it exactly as specified. The originating host does not insert a timestamp when required but the destination host records a timestamp before passing the datagram to the transport layer. The timestamp value follows the rules regarding standard values as specified in Section 3.2.2.8 of RFC 1122 for the ICMP timestamp message.

  • Must support originating a source route and must be able to act as the final destination of a source route.

    Yes: A source route may be included in the options passed to ip_output, and ip_dooptions correctly terminates a source route and saves it for use in constructing return routes.

  • Must pass a datagram with completed source route up to the transport layer.

    Yes: The source route option is passed up with any other options that may have appeared in the datagram.

  • Must build correct (nonredundant) return route.

    No: Net/3 blindly reverses the source route and does not check or correct for a route that was built incorrectly with a redundant hop for the original source host.

  • Must not send multiple source route options in one header.

    No: The IP layer in Net/3 does not prohibit a transport protocol from constructing and sending multiple source route options in a single datagram.
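
The requirement to silently ignore unknown options depends on the option format: apart from the 1-byte end-of-list and no-operation options, every option carries a length byte, so a receiver can step over an option whose type it does not recognize. The following sketch walks an options field in this way. The function name and the table of known options are hypothetical, and a malformed length raises a Python exception here where a real implementation would report a parameter problem.

```python
IPOPT_EOL = 0   # end of option list
IPOPT_NOP = 1   # no operation (1-byte padding)

# Option types this sketch recognizes; everything else counts as unknown.
KNOWN = {7: "record route", 68: "timestamp",
         131: "loose source route", 137: "strict source route"}


def walk_options(opts: bytes) -> list:
    """Return the names of recognized options, skipping unknown ones silently."""
    found = []
    i = 0
    while i < len(opts):
        opt = opts[i]
        if opt == IPOPT_EOL:
            break
        if opt == IPOPT_NOP:
            i += 1
            continue
        # Every other option is type, length, data; length covers all three.
        if i + 1 >= len(opts):
            raise ValueError("truncated option")
        optlen = opts[i + 1]
        if optlen < 2 or i + optlen > len(opts):
            raise ValueError("bad option length")
        if opt in KNOWN:
            found.append(KNOWN[opt])
        # An unknown type is not an error: step over it using its length.
        i += optlen
    return found


# NOP, stream identifier (type 136, unknown here), record route, EOL padding.
options = bytes([1]) + bytes([136, 4, 0, 1]) + bytes([7, 7, 4, 0, 0, 0, 0]) + bytes(4)
print(walk_options(options))   # ['record route']
```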

Source Route Forwarding

  • May support packet forwarding with the source route option.

    Yes: Net/3 supports the source route options. ip_dooptions does all the work.

  • Must obey corresponding router rules while processing source routes.

    Yes: Net/3 follows the router rules whether or not the packet contains a source route.

  • Must update TTL according to gateway rules.

    Yes: ip_forward implements this requirement.

  • Must generate ICMP error codes 4 and 5 (fragmentation required and source route failed).

    Yes: ip_output is able to generate a fragmentation required message, and ip_dooptions is able to generate the source route failed message.

  • Must allow the IP source address of a source routed packet to not be an IP address of the forwarding host.

    Yes: ip_output transmits such packets.

    RFC 1122 lists this as a may requirement because the addresses may be different, which must be allowed.

  • Must update timestamp and record route options.

    Yes: ip_dooptions processes these options for source routed packets.

  • Must support a configurable switch for nonlocal source routing. The switch must default to off.

    No: Net/3 always allows nonlocal source routing and does not provide a switch to disable this function. Nonlocal source routing is routing packets between two different interfaces instead of receiving and sending the packet on the same interface.

  • Must satisfy gateway access rules for nonlocal source routing.

    Yes: Net/3 follows the forwarding rules for nonlocal source routing.

  • Should send an ICMP destination unreachable error (source route failed) if a source routed packet cannot be forwarded (except for ICMP error messages).

    Yes: ip_dooptions sends the ICMP destination unreachable error. icmp_error discards it if the original datagram was an ICMP error message.

IP Fragmentation and Reassembly Requirements

This section summarizes the IP fragmentation and reassembly requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Must be able to reassemble incoming datagrams of at least 576 bytes.

    Yes: ip_reass supports reassembly of datagrams of indefinite size.

  • Should support a configurable or indefinite maximum size for incoming datagrams.

    Yes: Net/3 supports an indefinite maximum size for incoming datagrams.

  • Must provide a mechanism for the transport layer to learn the maximum datagram size to receive.

    Not applicable: Net/3 has an indefinite limit based on available memory.

  • Must send ICMP time exceeded error on reassembly timeout.

    No: Net/3 does not send an ICMP time exceeded error. See Figure 10.30 and Exercise 10.1.

  • Should support a fixed reassembly timeout value. The remaining TTL value in a received IP fragment should not be used as a reassembly timeout value.

    Yes: Net/3 uses a compile-time value of 30 seconds (IPFRAGTTL is 60 slow-timeout intervals, which equals 30 seconds).

  • Must provide the MMS_S (maximum message size to send) to higher layers.

    Partially: TCP derives the MMS_S from the MTU found in the route entry for the destination or from the MTU of the outgoing interface. A UDP application does not have access to this information.

  • May support local fragmentation of outgoing packets.

    Yes: ip_output fragments an outgoing packet if it is too large for the selected interface.

  • Must not allow transport layer to send a message larger than MMS_S if local fragmentation is not supported.

    Not applicable: This is a transport-level requirement that does not apply to Net/3 since local fragmentation is supported.

  • Should not send messages larger than 576 bytes to a remote destination in the absence of other information regarding the path MTU to the destination.

    Partially: Net/3 TCP defaults to a segment size of 552 (512 data bytes + 40 header bytes). Net/3 UDP applications cannot determine if a destination is local or remote and so they often restrict their messages to 540 bytes (512 + 20 + 8). There is no kernel mechanism that prohibits sending larger messages.

  • May support an all-subnets-MTU configuration flag.

    Yes: The global integer subnetsarelocal defaults to true. TCP uses this flag to select a larger segment size (the size of the outgoing interface’s MTU) instead of the default segment size for destinations on a subnet of the local network.
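
Two pieces of arithmetic from this section can be checked with a short sketch: the 30-second reassembly timeout, and the way a datagram is divided when it is too large for the outgoing interface. The fragment_sizes function is hypothetical and only computes sizes; it assumes a 20-byte IP header without options and rounds the data in each fragment down to a multiple of 8 bytes, because the fragment offset field counts 8-byte units.

```python
PR_SLOWHZ = 2     # slow-timeout ticks per second (one every 500 ms)
IPFRAGTTL = 60    # reassembly lifetime, in slow-timeout ticks

print("reassembly timeout:", IPFRAGTTL / PR_SLOWHZ, "seconds")


def fragment_sizes(total_len: int, mtu: int, hlen: int = 20) -> list:
    """Return (offset, data_len, more_fragments) for each fragment.

    total_len is the length of the whole datagram, including its IP header.
    """
    data_len = total_len - hlen
    if total_len <= mtu:
        return [(0, data_len, False)]      # fits: no fragmentation needed
    per_frag = (mtu - hlen) & ~7           # data bytes per fragment
    if per_frag < 8:
        raise ValueError("MTU too small to fragment")
    frags = []
    off = 0
    while off < data_len:
        n = min(per_frag, data_len - off)
        frags.append((off, n, off + n < data_len))
        off += n
    return frags


# A 1500-byte datagram sent through an interface with a 576-byte MTU.
for frag in fragment_sizes(1500, 576):
    print(frag)
```

With these inputs each of the first two fragments carries 552 bytes of data and the last carries the remaining 376 bytes.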

ICMP Requirements

This section summarizes the ICMP requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Must silently discard ICMP messages with unknown type.

    Partially: icmp_input ignores these messages and passes them to rip_input, which delivers the message to any waiting processes or silently discards the message if no process is prepared to receive the message.

  • May include more than 8 bytes of the original datagram.

    No: The icmp_error function returns only a maximum of 8 bytes of the original datagram in the ICMP error message (Exercise 11.9).

  • Must return the header and data unchanged from the received datagram.

    Partially: Net/3 converts the ID, offset, and length fields of an IP packet from network byte order to host byte order in ipintr. This facilitates processing the packet, but Net/3 neglects to return the offset and length fields to network byte order before including the header in an ICMP error message. If the system operates with the same byte ordering as the network, this error is harmless. If it operates with a different ordering, the IP header contained within the ICMP error message has incorrect offset and length values.

    The authors found that an Intel implementation of SVR4 and AIX 3.2 (Net/2 based) both return the length byte-swapped. Implementations other than Net/2 or Net/3 that were tried (Cisco, NetBlazer, VM, and Solaris 2.3) did not have this bug.

    Another error occurs when an ICMP port unreachable error is sent from the UDP code: the header length of the received datagram is changed incorrectly (Section 23.7). The authors found this error in Net/2 and Net/3 implementations. Net/1, however, did not have the bug.

  • Must demultiplex received ICMP error message to transport protocol.

    Yes: icmp_input uses the protocol field from the original header to select the appropriate transport protocol to respond to the error.

  • Should send ICMP error messages with a TOS field of 0.

    Yes: All ICMP error messages are constructed with a TOS of 0 by icmp_error.

  • Must not send an ICMP error message caused by a previous ICMP error message.

    Partially: icmp_error sends an error for an ICMP redirect message, which Section 3.2.2 of RFC 1122 classifies as an ICMP error message.

  • Must not send an ICMP error message caused by an IP broadcast or IP multicast datagram.

    No: icmp_error does not check for this case.

    The icmp_error function from the original Deering multicast code for BSD checks for this case.

  • Must not send an ICMP error message caused by a link-layer broadcast.

    Yes: icmp_error discards ICMP messages in response to packets that arrived as link-layer broadcasts or multicasts.

  • Must not send an ICMP error message caused by a noninitial fragment.

    Yes: icmp_error discards errors generated in this case.

  • Must not send an ICMP error message caused by a datagram with nonunique source address.

    Yes: icmp_reflect checks for experimental and multicast addresses. ip_output discards messages sent from a broadcast address.

  • Must return ICMP error messages when not prohibited.

    Partially: In general, Net/3 sends appropriate ICMP error messages. It fails to send an ICMP reassembly timeout message at the appropriate time (Exercise 10.1).

  • Should generate ICMP destination unreachable (protocol and port).

    Partially: Datagrams for unsupported protocols are delivered to rip_input where they are silently discarded if there are no processes registered to accept the datagrams. UDP generates an ICMP port unreachable error.

  • Must pass ICMP destination unreachable to higher layer.

    Yes: icmp_input passes the message to the pr_ctlinput function defined for the protocol (udp_ctlinput and tcp_ctlinput for UDP and TCP, respectively).

  • Should respond to destination unreachable error.

    See Sections 23.9 and 27.6.

  • Must interpret destination unreachable as only a hint, as it may indicate a transient condition.

    See Sections 23.9 and 27.6.

  • Must not send an ICMP redirect when configured as a host.

    Yes: ip_forward, the only function that detects and sends redirects, is not called unless the system is configured as a router.

  • Must update route cache when an ICMP redirect is received.

    Yes: ipintr calls rtredirect to process the message.

  • Must handle both host and network redirects. Furthermore, network redirects must be treated as host redirects.

    Yes: ipintr calls rtredirect for both types of messages.

  • Should discard illegal redirects.

    Yes: rtredirect discards illegal redirects (Section 19.7).

  • May send source quench if memory is unavailable.

    Yes: ip_forward sends a source quench if ip_output returns ENOBUFS. This occurs when there is a shortage of mbufs or when an interface output queue is full.

  • Must pass source quench to higher layer.

    Yes: icmp_input passes source quench errors to the transport layers.

  • Should respond to source quench in higher layer.

    See Sections 23.9 and 27.6 for UDP and TCP processing. Neither ICMP nor IGMP accepts ICMP error messages (they don’t define a pr_ctlinput function), in which case they are discarded by IP.

  • Must pass time exceeded error to transport layer.

    Yes: icmp_input passes this message to the transport layers.

  • Should send parameter problem errors.

    Yes: ip_dooptions complains about incorrectly formed options.

  • Must pass parameter problem errors to transport layer.

    Yes: icmp_input passes parameter problem errors to the transport layer.

  • May report parameter problem errors to process.

    See Sections 23.9 and 27.6 for UDP and TCP processing. Neither ICMP nor IGMP accepts ICMP error messages.

  • Must support an echo server and should support an echo client.

    Yes: icmp_input implements the echo server and the ping program implements the echo client using a raw IP socket.

  • May discard echo requests to a broadcast address.

    No: The reply is sent by icmp_reflect.

  • May discard echo request to multicast address.

    No: Net/3 responds to multicast echo requests. Both icmp_reflect and ip_output permit multicast destination addresses.

  • Must use specific destination address as echo reply source.

    Yes: icmp_reflect converts a broadcast or multicast destination to the specific address of the receiving interface and uses the result as the source address for the echo reply.

  • Must return echo request data in echo reply.

    Yes: The data portion of the echo request is not altered by icmp_reflect.

  • Must pass echo reply to higher layer.

    Yes: ICMP echo replies are passed to rip_input for receipt by registered processes.

  • Must reflect record route and timestamp options in ICMP echo request message.

    Yes: icmp_reflect includes the record route and timestamp options in the echo reply message.

  • Must reverse and reflect source route option.

    Yes: icmp_reflect retrieves the reversed source route with ip_srcroute and includes it in the outgoing echo reply.

  • Should not support the ICMP information request or reply.

    Partially: The kernel does not generate or respond to either message, but a process may send or receive the messages through the raw IP mechanism.

  • May implement the ICMP timestamp request and timestamp reply messages.

    Yes: icmp_input implements the timestamp server functionality. The timestamp client may be implemented through the raw IP mechanism.

  • Must minimize timestamp delay variability (if implementing the timestamp messages).

    Partially: The receive timestamp is applied after the message is taken off the IP input queue and the transmit timestamp is applied before the message is placed in the interface output queue.

  • May silently discard broadcast timestamp request.

    No: icmp_input responds to broadcast timestamp requests.

  • May silently discard multicast timestamp requests.

    No: icmp_input responds to multicast timestamp requests.

  • Must use specific destination address as timestamp reply source address.

    Yes: icmp_reflect converts a broadcast or multicast destination to the specific address of the receiving interface and uses the result as the source address for the timestamp reply.

  • Should reflect record route and timestamp options in an ICMP timestamp request.

    Yes: icmp_reflect includes the record route and timestamp options in the timestamp reply message.

  • Must reverse and reflect source route option in ICMP timestamp request.

    Yes: icmp_reflect retrieves the reversed source route with ip_srcroute and includes it in the outgoing timestamp reply.

  • Must pass timestamp reply to higher layer.

    Yes: ICMP timestamp replies are passed to rip_input for receipt by registered processes.

  • Must obey rules for standard timestamp value.

    Yes: icmp_input calls iptime, which returns a standard time value.

  • Must provide a configurable method for selecting the address mask selection method for an interface.

    No: Net/3 supports only static configuration of address masks through the ifconfig program.

  • Must support static configuration of address mask.

    Yes: This is accomplished indirectly by specifying static information when the ifconfig program configures an interface during system initialization, typically in the /etc/netstart start-up script.

  • May get address mask dynamically during system initialization.

    No: Net/3 does not support the use of BOOTP or DHCP to acquire address mask information.

  • May get address mask with ICMP address mask request and reply messages.

    No: Net/3 does not support the use of ICMP messages to acquire address mask information.

  • Must retransmit address mask request if no reply.

    Not applicable: Not required since this method is not implemented by Net/3.

  • Should assume default mask if no reply is received.

    Not applicable: Not required since this method is not implemented by Net/3.

  • Must update address mask from first reply only.

    Not applicable: Not required since this method is not implemented by Net/3.

  • Should perform reasonableness check on any installed address mask.

    No: Net/3 performs no reasonableness check on address masks.

  • Must not send unauthorized address mask reply messages and must be explicitly configured to be agent.

    Yes: icmp_input only responds to address mask requests if icmpmaskrepl is nonzero (it defaults to 0).

  • Should support an associated address mask authority flag with each static address mask configuration.

    No: Net/3 consults a global authority flag (icmpmaskrepl) to determine if it should send address mask replies for any interface.

  • Must broadcast address mask reply when initialized.

    No: Net/3 does not broadcast an address mask reply when an interface is configured.
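
The byte-ordering problem described earlier in this section, under the requirement to return the header unchanged, can be demonstrated outside the kernel. The sketch below uses the Python struct module to imitate what happens when host-order length and offset values are copied into an outgoing header without being converted back to network byte order. The field values are hypothetical.

```python
import struct

ip_len = 84       # total length of a hypothetical received datagram
ip_off = 0x2000   # flags/offset field with only the more-fragments bit set

# The two fields as they arrive on the wire (network order is big-endian).
wire = struct.pack("!HH", ip_len, ip_off)

# Input processing converts them to integers in host byte order.
host_len, host_off = struct.unpack("!HH", wire)

# On a little-endian host, copying those integers into the returned header
# without converting them back stores the bytes of each field reversed.
buggy = struct.pack("<HH", host_len, host_off)
seen_len, seen_off = struct.unpack("!HH", buggy)
print(seen_len, hex(seen_off))   # 21504 0x20

# On a big-endian host, host order is network order and no harm is done.
harmless = struct.pack(">HH", host_len, host_off)
print(harmless == wire)          # True
```

A receiver of the ICMP error therefore sees a length of 21504 (0x5400) in place of 84 (0x0054).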

Multicasting Requirements

This section summarizes the IP multicast requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Should support local IP multicasting (RFC 1112).

    Yes: Net/3 supports IP multicasting.

  • Should join the all-hosts group at start-up.

    Yes: in_ifinit joins the all-hosts group while initializing an interface.

  • Should provide a mechanism for higher layers to discover an interface’s IP multicast capability.

    Yes: The IFF_MULTICAST flag in the interface’s ifnet structure is available directly to kernel code and by the SIOCGIFFLAGS command for processes.

IGMP Requirements

This section summarizes the IGMP requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • May support IGMP (RFC 1112).

    Yes: Net/3 supports IGMP.

Routing Requirements

This section summarizes the routing requirements from Section 3.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements. Be aware that the requirements of this RFC apply to a host and not necessarily the kernel implementation. Some items are not explicitly handled by the kernel routing function in Net/3, but they are expected to be provided by a routing daemon such as routed or gated.

  • Must use address mask in determining whether a datagram’s destination is on a connected network.

    Yes: When an interface for a connected network such as an Ethernet is configured, its address mask is specified (or a default is chosen based on the class of IP address) and stored in the routing table entry. This mask is used by rn_match when it checks a leaf for a network match.

  • Must operate correctly in a minimal environment when there are no routers (all networks are directly connected).

    Yes: The system administrator must not configure a default route in this case.

  • Must keep a “route cache” of mappings to next-hop routers.

    Yes: The routing table is the cache.

  • Should treat a received network redirect the same as a host redirect.

    Yes, as described in Section 19.7.

  • Must use a default router when no entry exists for the destination in the routing table.

    Yes, if a default route has been entered into the routing table.

  • Must support multiple default routers.

    Multiple defaults are not supported by the kernel. Instead, this should be provided by a routing daemon.

  • May implement a table of static routes.

    Yes: These can be created at system initialization time with the route command.

  • May include a flag with each static route specifying whether or not the route can be overridden by a redirect.

    No.

  • May allow the routing table key to be a complete host address and not just a network address.

    Yes: Host routes take priority over a network route to the same network.

  • Should include the TOS in the routing table entry.

    No: There is a TOS field in the sockaddr_inarp that we describe in Chapter 21, but it is not currently used.

  • Must be able to detect the failure of a next-hop router that appears as the gateway field in the routing table and be able to choose an alternate next-hop router.

    Negative advice, the RTM_LOSING message generated by in_losing, is passed to any processes reading from a routing socket, which allows the process (e.g., a routing daemon) to handle this event.

  • Should not assume that a route is good forever.

    Yes: There are no timeouts on routing table entries in the kernel other than those created by ARP. Again, the standard Unix routing daemons time out routes and replace them with alternatives when possible.

  • Must not ping routers continuously (ICMP echo request).

    Yes: The Net/3 kernel does not do this. The routing daemons don’t generate ICMP echo requests either.

  • Must use pinging of a router only when traffic is being sent to that router.

    The Net/3 kernel never generates pings to a next-hop router.

  • Should allow higher and lower layers to give positive and negative advice.

    Partially: The only information passed by other layers to the Net/3 routing functions is by in_losing, which is called only from TCP. The only action performed by the routing layer is to generate the RTM_LOSING message.

  • Must switch to another default router when the existing default fails.

    Yes, although the Net/3 kernel does not do this, it is supported by the routing daemons.

  • Must allow the following information to be configured manually in the routing table: IP address, network mask, list of defaults.

    Yes, but only one default is supported in the kernel.
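
The address-mask test in the first item of this section can be sketched in a few lines. This is only an illustration of the comparison, not the Net/3 code (rn_match performs it while walking the radix tree); the function name on_connected_network and the sample addresses are hypothetical.

```python
import ipaddress


def on_connected_network(dst: str, if_addr: str, mask: str) -> bool:
    """Return True if dst falls on the network given by if_addr and mask."""
    d = int(ipaddress.IPv4Address(dst))
    a = int(ipaddress.IPv4Address(if_addr))
    m = int(ipaddress.IPv4Address(mask))
    # Two addresses are on the same network when their masked bits agree.
    return (d & m) == (a & m)
```

With an interface address of 140.252.13.33 and a mask of 255.255.255.224, the destination 140.252.13.35 matches and 140.252.13.66 does not.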

ARP Requirements

This section summarizes the ARP requirements from Section 2.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Must provide a mechanism to flush out-of-date ARP entries. If this mechanism involves a timeout, it should be configurable.

    Yes and yes: arptimer provides this mechanism. The timeout is configurable (the arpt_prune and arpt_keep globals) but the only ways to change their values are to recompile the kernel or modify the kernel with a debugger.

  • Must include a mechanism to prevent ARP flooding.

    Yes, as we described with Figure 21.24.

  • Should save (rather than discard) at least one (the latest) packet of each set of packets destined to the same unresolved IP address, and transmit the saved packet when the address has been resolved.

    Yes: This is the purpose of the la_hold member of the llinfo_arp structure.
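
The "save the latest packet" behavior in the last item above can be modeled roughly as follows. This is a simplified Python sketch, not the kernel structure: the class ArpEntry, its methods, and the send callback are hypothetical, and the transmission of the ARP request itself is omitted.

```python
class ArpEntry:
    """Hypothetical stand-in for an ARP entry that holds one pending packet."""

    def __init__(self):
        self.hw_addr = None   # hardware address, unknown until resolved
        self.la_hold = None   # most recent packet awaiting resolution

    def output(self, packet, send):
        if self.hw_addr is not None:
            send(self.hw_addr, packet)
        else:
            # Keep only the latest packet; any earlier held packet is lost.
            self.la_hold = packet

    def resolved(self, hw_addr, send):
        self.hw_addr = hw_addr
        if self.la_hold is not None:
            # Transmit the saved packet now that the address is known.
            send(hw_addr, self.la_hold)
            self.la_hold = None
```

Holding a single packet is the minimum the requirement asks for: if two packets are queued before the reply arrives, only the second is transmitted.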

UDP Requirements

This section summarizes the UDP requirements from Section 4.1.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

  • Should send ICMP port unreachable.

    Yes: udp_input does this.

  • Must pass received IP options to application.

    No: The code to do this is commented out in udp_input. This means that a process that receives a UDP datagram with a source route option cannot send a reply using the reversed route.

  • Must allow application to specify IP options to send.

    Yes: The IP_OPTIONS socket option does this. The options are saved in the PCB and placed into the outgoing IP datagram by ip_output.

  • Must pass IP options down to IP layer.

    Yes: As mentioned above, IP places the options into the IP datagram.

  • Must pass received ICMP messages to application.

    Yes: We must look at the exact wording from the RFC: “A UDP-based application that wants to receive ICMP error messages is responsible for maintaining the state necessary to demultiplex these messages when they arrive; for example, the application may keep a pending receive operation for this purpose.” The state required by Berkeley-derived systems is that the socket be connected to the foreign address and port. As the comments at the beginning of Figure 23.26 indicate, some applications create both a connected and an unconnected socket for a given foreign port, using the connected socket to receive asynchronous errors.

  • Must be able to generate and verify UDP checksum.

    Yes: This is done by udp_input, based on the global integer udpcksum.

  • Must silently discard datagrams with bad checksum.

    Yes: This is done only if udpcksum is nonzero. As we mentioned earlier, this variable controls both the sending of checksums and the verification of received checksums. If this variable is 0, the kernel does not verify a received nonzero checksum.

  • May allow sending application to specify whether outgoing checksum is calculated, but must default to on.

    No: The application has no control over UDP checksums. Regarding the default, UDP checksums are generated unless the kernel is compiled with 4.2BSD compatibility defined, or unless the administrator has disabled UDP checksums using sysctl(8).

  • May allow receiving application to specify whether received UDP datagrams without a checksum (i.e., the received checksum is 0) are discarded or passed to the application.

    No: Received datagrams with a checksum field of 0 are passed to the receiving process.

  • Must pass destination IP address to application.

    Yes: The application must call recvmsg and specify the IP_RECVDSTADDR socket option. Also recall our discussion following Figure 23.25 noting that 4.4BSD broke this option when the destination address is a multicast or broadcast address.

  • Must allow application to specify local IP address to be used when sending a UDP datagram.

    Yes: The application can call bind to set the local IP address. Recall our discussion at the end of Section 22.8 about the difference between the source IP address and the IP address of the outgoing interface. Net/3 does not allow the application to choose the outgoing interface; that is done by ip_output, based on the route to the destination IP address.

  • Must allow application to specify wildcard local IP address.

    Yes: If the IP address INADDR_ANY is specified in the call to bind, the local IP address is chosen by in_pcbconnect, based on the route to the destination.

  • Should allow application to learn of the local address that was chosen.

    Yes: The application must call connect. When a datagram is sent on an unconnected socket with a wildcard local address, ip_output chooses the outgoing interface, which also becomes the source address. The inp_laddr member of the PCB, however, is restored to the wildcard address at the end of udp_output before sendto returns. Therefore, getsockname cannot return the value. But the application can connect a UDP socket to the destination, causing in_pcbconnect to determine the local interface and store the address in the PCB. The application can then call getsockname to fetch the IP address of the local interface.

  • Must silently discard a received UDP datagram with an invalid source IP address (broadcast or multicast).

    No: A received UDP datagram with an invalid source address is delivered to a socket, if a socket is bound to the destination port.

  • Must send a valid IP source address.

    Yes: If the local IP address is set by bind, it checks the validity of the address. If the local IP address is wildcarded, ip_output chooses the local address.

  • Must provide the full IP interface from Section 3.4 of RFC 1122.

    Refer to Section C.2.

  • Must allow application to specify TTL, TOS, and IP options for output datagrams.

    Yes: The application can use the IP_TTL, IP_TOS, and IP_OPTIONS socket options.

  • May pass received TOS to application.

    No: There is no way for the application to receive this value from the IP header. Notice that a getsockopt of IP_TOS returns the value used in outgoing datagrams, not the value from a received datagram. The received ip_tos value is available to udp_input, but is discarded along with the entire IP header.
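
The checksum items above combine into a small decision rule on input. The following Python sketch illustrates that rule under stated assumptions: the function names are hypothetical, the UDP pseudo-header is left out of the checksum computation for brevity, and udpcksum is modeled as a plain integer argument rather than a kernel global.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over data (pseudo-header omitted)."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole 16-bit word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def accept_datagram(cksum_field: int, cksum_valid: bool, udpcksum: int) -> bool:
    """Decide whether a received datagram is passed on or silently discarded."""
    if cksum_field == 0:
        return True        # sender did not compute a checksum
    if udpcksum == 0:
        return True        # verification disabled: nonzero checksum not checked
    return cksum_valid     # otherwise discard on a bad checksum
```

The second branch is the case the text warns about: with udpcksum set to 0, a datagram carrying a bad nonzero checksum is still delivered.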

TCP Requirements

This section summarizes the TCP requirements from Section 4.2.5 of RFC 1122 and the compliance of the Net/3 code that we’ve examined to those requirements.

PSH Flag

  • May aggregate data sent by the user without the PSH flag.

    Yes and no: Net/3 does not give the process a way to specify the PSH flag with a write operation, but Net/3 does aggregate data sent by the user in separate write operations.

  • May queue data received without the PSH flag.

    No: The absence or presence of a PSH flag in a received segment makes no difference. Received data is placed onto the socket’s receive queue when it is processed.

  • Sender should collapse successive PSH flags when it packetizes data.

    No.

  • May implement PSH flag on write calls.

    No: This is not part of the sockets API.

  • Since the PSH flag is not part of the write calls, must not buffer data indefinitely and must set the PSH flag in the last buffered segment.

    Yes: This is the method used by Berkeley-derived implementations.

  • May pass received PSH flag to application.

    No: This is not part of the sockets API.

  • Should send maximum-sized segment whenever possible, to improve performance.

    Yes.

Window

  • Must treat window size as an unsigned number. Should treat window size as 32-bit value.

    Yes: All the window sizes in Figure 24.13 are unsigned longs, which is also required by the window scale option of RFC 1323.

  • Receiver must not shrink the window (move the right edge to the left).

    Yes, in Figure 26.29.

  • Sender must be robust against window shrinking.

    Yes, in Figure 29.15.

  • May keep offered receive window closed indefinitely.

    Yes.

  • Sender must probe a zero window.

    Yes, this is the purpose of the persist timer.

  • Should send first zero-window probe when the window has been closed for the RTO.

    No: Net/3 sets a lower bound for the persist timer of 5 seconds, which is normally greater than the RTO.

  • Should exponentially increase the interval between successive probes.

    Yes, as shown in Figure 25.14.

  • Must allow peer’s window to stay closed indefinitely.

    Yes, TCP never gives up probing a closed window.

  • Sender must not timeout a connection just because the other end keeps advertising a zero window.

    Yes.
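
The rule that a receiver must not shrink its window amounts to never advertising less than what has already been offered. A minimal Python sketch of that check follows; the function name is hypothetical, rcv_nxt and rcv_adv are used here in the sense of "next expected sequence number" and "highest advertised sequence number", and 32-bit sequence wraparound is handled explicitly.

```python
def offered_window(free_space: int, rcv_nxt: int, rcv_adv: int) -> int:
    """Window to advertise without moving the right edge to the left."""
    # Amount of window already promised to the peer, as a signed 32-bit
    # difference so that sequence-number wraparound is handled.
    outstanding = (rcv_adv - rcv_nxt) & 0xFFFFFFFF
    if outstanding >= 1 << 31:
        outstanding -= 1 << 32
    # Never offer less than what the peer has already been told.
    return max(free_space, outstanding)
```

For example, if 2000 bytes of a previous advertisement are still outstanding but only 1000 bytes of buffer are free, the function returns 2000.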

Urgent Data

  • Must have urgent pointer point to last byte of urgent data.

    No: Berkeley-derived implementations continue to interpret the urgent pointer as pointing just beyond the last byte of urgent data.

  • Must support a sequence of urgent data of any length.

    Yes, with the bug fix discussed in Exercise 26.6.

  • Must inform the receiving process (1) when TCP receives an urgent pointer and there was no previously pending urgent data, or (2) when the urgent pointer advances in the data stream.

    Yes, in Figure 29.17.

  • Must be a way for the process to determine how much urgent data remains, or at least whether more urgent data remains to be read.

    Yes, this is the purpose of the out-of-band mark, the SIOCATMARK ioctl.
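
The first item in this section is an off-by-one disagreement, which the following Python sketch makes concrete. The function names are hypothetical; seq is the sequence number of the segment and urp is its urgent pointer field.

```python
SEQ_MOD = 1 << 32   # TCP sequence numbers wrap at 2**32


def last_urgent_byte_rfc1122(seq: int, urp: int) -> int:
    """RFC 1122: the urgent pointer points at the last byte of urgent data."""
    return (seq + urp) % SEQ_MOD


def last_urgent_byte_berkeley(seq: int, urp: int) -> int:
    """Berkeley-derived: the pointer is one beyond the last urgent byte."""
    return (seq + urp - 1) % SEQ_MOD
```

For the same segment, the two interpretations always differ by exactly one byte, so two ends that disagree will place the out-of-band mark at different positions in the stream.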

TCP Options

  • Must be able to receive TCP options in any segment.

    Yes.

  • Must ignore any options not supported.

    Yes, in Section 28.3.

  • Must cope with an illegal option length.

    Yes, in Section 28.3.

  • Must implement both sending and receiving the MSS option.

    Yes, a received MSS option is handled in Figure 28.10, and Figure 26.23 always sends an MSS option with a SYN.

  • Should send an MSS option in every SYN when its receive MSS differs from 536, and may send it always.

    Yes, as mentioned earlier, an MSS option is always sent by Net/3 with a SYN.

  • If an MSS option is not received with a SYN, must assume a default MSS of 536.

    No: The default MSS is 512, not 536.

    This is probably a historical artifact: VAXes had a physical page size of 512 bytes, and trailer protocols worked only with data whose length was a multiple of 512 bytes.

  • Must calculate the “effective send MSS.”

    Yes, in Section 27.5.
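
The option-handling requirements above (skip what is not understood, survive a bad length, fall back to the default MSS) can be illustrated with a short parser. This Python sketch is not a transcription of the Net/3 code; the function name parse_mss is hypothetical, and its length check is somewhat stricter than a bare test for a nonpositive length.

```python
TCPOPT_EOL, TCPOPT_NOP, TCPOPT_MSS = 0, 1, 2
DEFAULT_MSS = 512   # the default described in the text (RFC 1122 says 536)


def parse_mss(options: bytes) -> int:
    """Return the MSS announced in a SYN's option bytes, or the default."""
    mss = DEFAULT_MSS
    i = 0
    while i < len(options):
        kind = options[i]
        if kind == TCPOPT_EOL:
            break                       # end of option list
        if kind == TCPOPT_NOP:
            i += 1                      # one-byte padding option
            continue
        if i + 1 >= len(options):
            break                       # no room for a length byte
        optlen = options[i + 1]
        if optlen < 2 or i + optlen > len(options):
            break                       # illegal length: stop parsing
        if kind == TCPOPT_MSS and optlen == 4:
            mss = int.from_bytes(options[i + 2:i + 4], "big")
        # Any other option kind is skipped using its length.
        i += optlen
    return mss
```

An unknown option followed by a valid MSS option still yields the announced MSS, while an option with a zero length ends parsing and leaves the default in place.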

TCP Checksums

  • Must generate a TCP checksum in outgoing segments and must verify received checksums.

    Yes, TCP checksums are always calculated and verified.

Initial Sequence Number Selection

  • Must use the specified clock-driven selection from RFC 793.

    No: RFC 793 specifies a clock that changes by 125,000 every half-second, whereas the Net/3 ISN (the global variable tcp_iss) is incremented by 64,000 every half-second, about one-half the specified rate.
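
The two rates quoted above can be compared with a little arithmetic. The Python sketch below uses only the figures in the text; the constant and function names are hypothetical, and the wrap time counts only the timer-driven increments.

```python
RFC793_PER_HALF_SECOND = 125_000   # one increment every 4 microseconds
NET3_PER_HALF_SECOND = 64_000      # increment applied to tcp_iss


def wrap_hours(per_half_second: int) -> float:
    """Hours for a 32-bit ISN clock to wrap at the given rate."""
    per_second = 2 * per_half_second
    return 2**32 / per_second / 3600


# Net/3 rate as a fraction of the RFC 793 rate.
ratio = NET3_PER_HALF_SECOND / RFC793_PER_HALF_SECOND
```

The ratio is 0.512, which is the "about one-half" in the text, so the Net/3 clock takes roughly twice as long to cycle through the sequence space.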

Opening Connections

  • Must support simultaneous open attempts.

    Yes, although Berkeley-derived systems prior to 4.4BSD did not support this, as described in Section 28.9.

  • Must keep track of whether it reached the SYN_RCVD state from the LISTEN or SYN_SENT states.

    Yes, same result, different technique. The purpose of this requirement is to allow a passive open that receives an RST to return to the LISTEN state (as shown in Figure 24.15), but force an active open that ends up in SYN_RCVD and then receives an RST to be aborted. This is described following Figure 28.36.

  • A passive open must not affect previously created connections.

    Yes.

  • Must allow a listening socket with a given local port at the same time that another socket with the same local port is in the SYN_SENT or SYN_RCVD state.

    Yes: The stated purpose of this requirement is to allow a given application to accept multiple connection attempts at about the same time. This is done in Berkeley-derived implementations by cloning new connections from the socket in the LISTEN state when the incoming SYN arrives.

  • Must ask IP to select a local IP address to be used as the source IP address when the source IP address is not specified by the process performing an active open on a multihomed host.

    Yes, done by in_pcbconnect.

  • Must continue to use the same source IP address for all segments sent on a connection.

    Yes: Once in_pcbconnect selects the source address, it doesn’t change.

  • Must not allow an active open for a broadcast or multicast foreign address.

    Yes and no: TCP will not send segments to a broadcast address because the call to ip_output in Figure 26.32 does not specify the SO_BROADCAST option. Net/3, however, allows connection attempts to multicast addresses.

  • Must ignore incoming SYNs with an invalid source address.

    Yes: The code in Figure 28.16 checks for these invalid source addresses.

Closing Connections

  • Should allow an RST to contain data.

    No: The RST processing in Figure 28.36 ends up jumping to drop, which skips the processing of any segment data in Figure 29.22.

  • Must inform process whether other end closed the connection normally (e.g., sent a FIN) or aborted the connection with an RST.

    Yes: The read system calls return 0 (end-of-file) when the FIN is processed, but return -1 with an error of ECONNRESET when an RST is received.

  • May implement a half-close.

    Yes: The process calls shutdown with a second argument of 1 to send a FIN. The process can still read from the connection.

  • If the process completely closes a connection (i.e., not a half-close) and received data is still pending in TCP, or if new data arrives after the close, TCP should send an RST to indicate data was lost.

    No and yes: If a process calls close and unread data is in the socket’s receive buffer, an RST is not sent. But if data arrives after a socket is closed, an RST is returned to the sender.

  • Must linger in TIME_WAIT state for twice the MSL.

    Yes, although the Net/3 MSL of 30 seconds is much smaller than the RFC 793 recommended value of 2 minutes.

  • May accept a new SYN from a peer to reopen a connection directly from the TIME_WAIT state.

    Yes, as shown in Figure 28.29.
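
The half-close described above can be seen from a process's point of view with a few socket calls. This Python sketch assumes a connected stream socket pair from socket.socketpair (typically Unix domain rather than TCP, chosen only so the example is self-contained); the function name is hypothetical. The constant socket.SHUT_WR has the value 1, the second argument to shutdown mentioned in the text.

```python
import socket


def half_close_demo():
    """Close one direction of a connection and keep reading on the other."""
    a, b = socket.socketpair()
    with a, b:
        a.sendall(b"request")
        a.shutdown(socket.SHUT_WR)   # stop sending; receiving still works
        request = b.recv(1024)       # peer reads the data ...
        eof = b.recv(1024)           # ... then sees end-of-file (empty read)
        b.sendall(b"reply")          # peer can still send in its direction
        reply = a.recv(1024)         # and the half-closed end can still read
    return request, eof, reply
```

The empty read on the peer corresponds to the return value of 0 that signals a normal close of that direction.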

Retransmissions

  • Must implement Van Jacobson’s slow start and congestion avoidance.

    Yes.

  • May reuse the same IP identifier field when a retransmission is identical to the original packet.

    No: The IP identifier is assigned by ip_output from the global variable ip_id, which increments each time an IP datagram is sent. It is not assigned by TCP.

  • Must implement Jacobson’s algorithm for calculating the RTO and Karn’s algorithm for selecting the RTT measurements.

    Yes, but realize that when RFC 1323 timestamps are present, the retransmission ambiguity problem is gone, obviating half of Karn’s algorithm, as we discussed with Figure 29.6.

  • Must include an exponential backoff for successive RTO values.

    Yes, as described with Figure 25.22.

  • Retransmission of SYN segments should use the same algorithm as data segments.

    Yes, as shown in Figure 25.15.

  • Should initialize estimation parameters to calculate an initial RTO of 3 seconds.

    No: The initial value of t_rxtcur calculated by tcp_newtcpcb is 6 seconds. This is also seen in Figure 25.15.

  • Should have a lower bound on the RTO measured in fractions of a second and an upper bound of twice the MSL.

    No: The lower bound is 1 second and the upper bound is 64 seconds (Figure 25.3).

Generating ACKs

  • Should queue out-of-order segments.

    Yes, done by tcp_reass.

  • Must process all queued segments before sending any ACKs.

    Yes, but only for in-order segments. ipintr calls tcp_input for each queued datagram that is a TCP segment. For in-order segments, tcp_input schedules a delayed ACK and returns to ipintr. If there are additional TCP segments on IP’s input queue, tcp_input is called by ipintr for each one. Only when ipintr finds no more IP datagrams on its input queue and returns can tcp_fasttimo be called to generate a delayed ACK. This ACK will contain the highest acknowledgment number in all the segments processed by tcp_input.

    The problem is with out-of-order segments: tcp_input calls tcp_output itself, before returning to ipintr, to generate the ACK for the out-of-order segment. If there are additional segments on IP’s input queue that would have made the out-of-order segment be in order, they are processed after the immediate ACK is sent.

  • May generate an immediate ACK for an out-of-order segment.

    Yes, this is needed for the fast retransmit and fast recovery algorithms (Section 29.4).

  • Should implement delayed ACKs and the delay must be less than 0.5 seconds.

    Yes: The TF_DELACK flag is checked by the tcp_fasttimo function every 200 ms.

  • Should send an ACK for at least every second segment.

    Yes, the code in Figure 26.9 generates an ACK for every second segment. We also discussed that this happens only if the process receiving the data reads the data as it arrives, since the calls to tcp_output that cause every other segment to be acknowledged are driven by the PRU_RCVD request.

  • Must include silly window syndrome avoidance in the receiver.

    Yes, as seen in Figure 26.29.

Sending Data

  • The TTL value for TCP segments must be configurable.

    Yes: The TTL is initialized to 64 (IPDEFTTL) by tcp_newtcpcb, but can then be changed by a process using the IP_TTL socket option.

  • Must include sender silly window syndrome avoidance.

    Yes, in Figure 26.8.

  • Should implement the Nagle algorithm.

    Yes, in Figure 26.8.

  • Must allow a process to disable the Nagle algorithm on a given connection.

    Yes, with the TCP_NODELAY socket option.
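
From a process, the two configurable items in this section are ordinary socket options. The following Python sketch sets them on an unconnected TCP socket and reads them back; the helper name configure and the TTL value used in the example are hypothetical.

```python
import socket


def configure(sock: socket.socket, ttl: int, nodelay: bool = True):
    """Set the TTL and the Nagle behavior, then report what is in effect."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    # A nonzero TCP_NODELAY value disables the Nagle algorithm.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if nodelay else 0)
    return (
        sock.getsockopt(socket.IPPROTO_IP, socket.IP_TTL),
        sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0,
    )
```

Both options apply to a single connection, which is what the requirements ask for: neither changes a system-wide default.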

Connection Failures

  • Must pass negative advice to IP when the number of retransmissions for a given segment exceeds some value R1.

    Yes: The value of R1 is 4, and in Figure 25.26, when the number of retransmissions exceeds 4, in_losing is called.

  • Must close a connection when the number of retransmissions for a given segment exceeds some value R2.

    Yes: The value of R2 is 12 (Figure 25.26).

  • Must allow process to set the value of R2.

    No: The value 12 is hardcoded in Figure 25.26.

  • Should inform the process when R1 is reached and before R2 is reached.

    No.

  • Should default R1 to at least 3 retransmissions and R2 to at least 100 seconds.

    Yes: R1 is 4 retransmissions, and with a minimum RTO of 1 second, the tcp_backoff array (Section 25.9) guarantees a minimum value of R2 of over 500 seconds.

  • Must handle SYN retransmissions in the same general way as data retransmissions.

    Yes, but R1 is normally not reached for the retransmission of a SYN (Figure 25.15).

  • Must set R2 to at least 3 minutes for a SYN.

    No: R2 for a SYN is limited to 75 seconds by the connection-establishment timer (Figure 25.15).
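
The "over 500 seconds" figure for R2 can be checked with a short calculation. The Python sketch below rests on assumptions that should be verified against Section 25.9: the multipliers in TCP_BACKOFF are recalled from the Net/3 source rather than taken from this appendix, and the model counts one timeout for the original transmission plus one for each of the 12 retransmissions. The RTO bounds of 1 and 64 seconds are those given in the Retransmissions section.

```python
# Assumed contents of the tcp_backoff array (see the lead-in above).
TCP_BACKOFF = [1, 2, 4, 8, 16, 32, 64, 64, 64, 64, 64, 64, 64]
RTO_MIN, RTO_MAX = 1, 64   # seconds


def seconds_until_drop(base_rto: float = RTO_MIN) -> float:
    """Total time spent in retransmission timeouts before giving up."""
    total = 0
    for multiplier in TCP_BACKOFF:
        # Each timeout is the backed-off RTO, clamped to the RTO bounds.
        total += min(max(base_rto * multiplier, RTO_MIN), RTO_MAX)
    return total
```

With the minimum RTO of 1 second this gives 511 seconds, consistent with the lower bound quoted above; a larger base RTO only increases the total.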

Keepalive Packets

  • May provide keepalives.

    Yes, they are provided.

  • Must allow process to turn keepalives on or off, and must default to off.

    Yes: The default is off, and a process must turn them on with the SO_KEEPALIVE socket option.

  • Must send keepalives only when connection is idle for a given period.

    Yes.

  • Must allow the keepalive interval to be configurable and must default to no less than 2 hours.

    No and yes: The idle time before sending keepalive probes is not easily configurable, but it defaults to 2 hours. If the default idle time is changed (by changing the global variable tcp_keepidle), it affects all users of the keepalive option on the host; it cannot be configured on a per-connection basis, as many users would like.

  • Must not interpret the failure to respond to any given probe as a dead connection.

    Yes: Nine probes are sent before the connection is considered dead.
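
The on/off requirement above maps directly onto one socket option. This Python sketch checks the default on a newly created TCP socket and then enables keepalives; the helper name enable_keepalive is hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket):
    """Turn keepalives on and report the option's state before and after."""
    was_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    now_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    return was_on, now_on
```

The option only switches probing on for this socket; the idle time before the first probe remains the host-wide value discussed above.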

IP Options

  • Must ignore received IP options it doesn’t understand.

    Yes: This is done by the IP layer.

  • May support the timestamp and record route options in received segments.

    No: Net/3 only reflects these options for ICMP packets that are reflected back to the sender (icmp_reflect). tcp_input discards any received IP options by calling ip_stripoptions in Figure 28.2.

  • Must allow process to specify a source route when a connection is actively opened, and this route must take precedence over a source route received for this connection.

    Yes: The source route is specified with the IP_OPTIONS socket option. tcp_input never looks at a received source route when the connection is actively opened.

  • Must save a received source route in a connection that is passively opened and use the return route for all segments sent on this connection. If a different source route arrives in a later segment, the later route should override the earlier one.

    Yes and no: Figure 28.7 calls ip_srcroute, but only when the SYN arrives for a listening socket. If a different source route arrives later, it is not used.

Receiving ICMP Messages from IP

  • Receipt of an ICMP source quench should trigger slow start.

    Yes: The function tcp_quench is called by tcp_ctlinput.

  • Receipt of a network unreachable, host unreachable, or source route failed must not cause TCP to abort the connection and the process should be informed.

    Yes and no: As described following Figure 27.12, Net/3 now completely ignores host unreachable and network unreachable errors for an established connection.

  • Receipt of a protocol unreachable, port unreachable, or fragmentation required and DF bit set should abort an existing connection.

    No: tcp_notify records these ICMP errors in t_softerror, which is reported to the process if the connection is eventually dropped.

  • Should handle time exceeded and parameter problem errors the same as required previously for network and host unreachable.

    Yes: ICMP parameter problem errors are just recorded in t_softerror by tcp_notify. ICMP time exceeded errors are ignored by tcp_ctlinput. Neither type of ICMP error causes the connection to be aborted.

Application Programming Interface

  • Must be a method for reporting soft errors to the process, normally in an asynchronous fashion.

    No: Soft errors are returned to the process if the connection is aborted.

  • Must allow process to specify TOS for segments sent on a connection. Should let application change this during a connection’s lifetime.

    Yes to both, with the IP_TOS socket option.

  • May pass most recently received TOS to process.

    No: There is no way to do this with the sockets API. Calling getsockopt for IP_TOS returns only the current value being sent; it does not return the most recently received value.

  • May implement a “flush” call.

    No: TCP sends the data from the process as quickly as it can.

  • Must allow process to specify local IP address before either an active open or a passive open.

    Yes: This is done by calling bind before either connect or accept.
