Chapter 2. Delivering the Data

Chapter 1 touched on the basic architecture and design of the TCP/IP protocols. From that discussion, we know that TCP/IP is a hierarchy of four layers. This chapter explores in finer detail how data moves between the protocol layers and the systems on the network. We examine the structure of Internet addresses, including how addresses route data to its final destination and how address structure is locally redefined to create subnets. We also look at the protocol and port numbers used to deliver data to the correct applications. These additional details move us from an overview of TCP/IP to the specific implementation details that affect your system’s configuration.

Addressing, Routing, and Multiplexing

To deliver data between two Internet hosts, it is necessary to move the data across the network to the correct host, and within that host to the correct application or process. TCP/IP uses three schemes to accomplish these tasks:

Addressing: IP addresses, which uniquely identify every host on the network, deliver data to the correct host.
Routing: Gateways deliver data to the correct network.
Multiplexing: Protocol and port numbers deliver data to the correct software module within the host.

Each of these functions—addressing between hosts, routing between networks, and multiplexing between layers—is necessary to send data between two cooperating applications across the Internet. Let’s examine each of these functions in detail.

The IP Address

An IPv4 address is a 32-bit value that uniquely identifies every device attached to a TCP/IP Internet. Some of the bits identify a specific network within the Internet, and are referred to as network bits, or the network number. Other bits identify the device on the network and are called host bits, or the host number. We’ll talk much more about the structure of the IP address in the next section.

IP addresses are usually written as four decimal numbers separated by dots (periods) in a format called dotted decimal notation.^[*] Each decimal number represents a byte (8 bits) of the 32-bit address, and each of the four numbers is in the range of 0 through 255 (the decimal values possible in a single byte).
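The relationship between the 32-bit value and its dotted decimal form can be sketched in a few lines of Python. This is only an illustration; the function names to_dotted_decimal and from_dotted_decimal are invented for this example and are not part of any TCP/IP implementation.

```python
def to_dotted_decimal(value: int) -> str:
    """Render a 32-bit integer as four decimal bytes separated by dots."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IPv4 address must fit in 32 bits")
    # Peel off one byte at a time, most significant byte first.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def from_dotted_decimal(text: str) -> int:
    """Parse dotted decimal notation back into a 32-bit integer."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four decimal numbers separated by dots")
    value = 0
    for part in parts:
        octet = int(part)
        if not 0 <= octet <= 255:
            raise ValueError("each number must be in the range 0 through 255")
        # Shift the bytes collected so far and append the new one.
        value = (value << 8) | octet
    return value
```

For example, the 32-bit value 0xAC100C01 corresponds to 172.16.12.1, because the bytes AC, 10, 0C, and 01 are 172, 16, 12, and 1 in decimal.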

IP addresses are often called host addresses. Although this is common usage, it is slightly misleading. IP addresses are assigned to network interfaces, not to computer systems. A gateway has a different address for each network to which it is connected. The gateway is known to other devices by the address associated with the network that it shares with those devices.

Systems can be addressed in three different ways. Individual systems are directly addressed by a host address, which is called a unicast address. A unicast packet is addressed to one individual host. Groups of systems can be addressed using a multicast address, e.g., 224.0.0.2. Routers along the path from the source to destination recognize the special address and route copies of the packet to each member of the multicast group.^[†] All systems on a network are addressed using the broadcast address, e.g., 172.16.255.255. The broadcast address depends on the broadcast capabilities of the underlying physical network.

The broadcast address is a good example of the fact that not all network addresses or host addresses can be assigned to a network device. Some host addresses are reserved for special uses. On all networks, the host numbers with all host bits set to 0 and with all host bits set to 1 are reserved (0 and 255 when the host part is a single byte). An IP address with all host bits set to 1 is a broadcast address. The broadcast address for network 172.16 is 172.16.255.255. A datagram sent to this address is delivered to every individual host on network 172.16. An IP address with all host bits set to 0 identifies the network itself. For example, 10.0.0.0 refers to network 10, and 172.16.0.0 refers to network 172.16. Addresses in this form are used in routing tables to refer to entire networks.
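The reserved addresses can be checked with Python's standard ipaddress module. The following is a minimal sketch; the function name role_in_network is made up for this illustration, and the /16 prefix on network 172.16 is written out explicitly (address masks and prefixes are covered in the next section).

```python
import ipaddress


def role_in_network(address: str, network: str) -> str:
    """Report whether an address is the network, broadcast, or a host address."""
    net = ipaddress.ip_network(network)
    addr = ipaddress.ip_address(address)
    if addr not in net:
        raise ValueError(f"{address} is not part of {network}")
    if addr == net.network_address:
        return "network"      # all host bits are 0
    if addr == net.broadcast_address:
        return "broadcast"    # all host bits are 1
    return "host"
```

With network 172.16.0.0/16, the address 172.16.255.255 is reported as the broadcast address and 172.16.0.0 as the network itself, while 172.16.0.255 is an ordinary host address because not all 16 of its host bits are set to 1.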

Network addresses with a first byte value greater than 223 cannot be assigned to a physical network because those addresses are reserved for special use. There are two other network addresses that are used only for special purposes. Network 0.0.0.0 designates the default route and network 127.0.0.0 is the loopback address. The default route is used to simplify the routing information that IP must handle, as explained in the section "The Routing Table" later in this chapter. The loopback address simplifies network applications by allowing the local host to be addressed in the same manner as a remote host. These special network addresses play an important part when configuring a host, but these addresses are not assigned to devices on real networks. Despite these few exceptions, most addresses are assigned to physical devices and are used by IP to deliver data to those devices.

The Internet Protocol moves data between hosts in the form of datagrams. Each datagram is delivered to the address contained in the Destination Address (word 5) of the datagram’s header. The Destination Address is a standard 32-bit IP address, which contains sufficient information to uniquely identify a network and a specific host on that network.

Address Structure

An IP address contains a network part and a host part, but the format of these parts is not the same in every IP address. The number of address bits used to identify the network and the number used to identify the host vary according to the prefix length of the address. The prefix length is determined by the address bit mask.

An address bit mask works in this way: if a bit in the mask is on, that equivalent bit in the address is interpreted as a network bit; if a bit in the mask is off, the bit belongs to the host part of the address. For example, if address 172.22.12.4 is given the network mask 255.255.255.0, which has 24 bits on and 8 bits off, the first 24 bits are the network number and the last 8 bits are the host address. Combining the address and the mask tells us that this is the address of host 4 on network 172.22.12.
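Combining an address with its mask is a bitwise AND. The sketch below shows the operation on the example from the text; split_address is a hypothetical helper name chosen for this illustration.

```python
import ipaddress


def split_address(address: str, mask: str):
    """Return (network number, host number) for an address and its bit mask."""
    addr_bits = int(ipaddress.IPv4Address(address))
    mask_bits = int(ipaddress.IPv4Address(mask))
    # Bits that are "on" in the mask select the network part.
    network = addr_bits & mask_bits
    # Bits that are "off" in the mask select the host part.
    host = addr_bits & ~mask_bits & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(network)), host
```

Calling split_address("172.22.12.4", "255.255.255.0") yields the network 172.22.12.0 and the host number 4, matching the interpretation given above.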

Specifying both the address and the mask in dotted decimal notation is cumbersome when writing out addresses. A shorthand notation is available for writing an address with its associated address mask. Instead of writing network 172.31.26.32 with a mask of 255.255.255.224, we can write 172.31.26.32/27. The format of this notation is address/prefix-length, where prefix-length is the number of bits in the network portion of the address. Without this notation, the address 172.31.26.32 could easily be interpreted as a host address.
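Converting between a dotted decimal mask and a prefix length amounts to counting the leading one bits. The following is a rough sketch of both directions; the function names are assumptions made for this example.

```python
import ipaddress


def mask_to_prefix(mask: str) -> int:
    """Count the network bits in a dotted decimal address mask."""
    value = int(ipaddress.IPv4Address(mask))
    prefix = bin(value).count("1")
    # A usable mask is a run of ones followed only by zeros.
    expected = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    if value != expected:
        raise ValueError(f"{mask} is not a contiguous address mask")
    return prefix


def prefix_to_mask(prefix: int) -> str:
    """Build the dotted decimal mask that has `prefix` leading one bits."""
    if not 0 <= prefix <= 32:
        raise ValueError("an IPv4 prefix length is between 0 and 32")
    value = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value))
```

The mask 255.255.255.224 has 27 bits on, which is why the example above can be written as 172.31.26.32/27.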

Organizations can obtain public IP addresses by purchasing a block of addresses from their Internet service provider (ISP). In this case, the ISP normally assigns a single organization a continuous block of addresses that is appropriate for the needs of the organization. For example, a moderately large business might purchase 192.168.16.0/20 while a small business might buy 192.168.32.0/24. Because the prefix shows the length of the network portion of the address, the number of host address bits that are available to an organization (the host portion of the address) is determined by subtracting the prefix from the total number of 32 bits in an address. Thus, a prefix of 20 leaves 12 bits that are available to be locally assigned to network devices such as servers. This is called a “12-bit block” of addresses. A prefix of 24 creates an “8-bit block.” Of the two sample address blocks, the first is a 12-bit block that encompasses 4,096 addresses from 192.168.16.0 to 192.168.31.255, and the second is an 8-bit block that includes the 256 addresses from 192.168.32.0 to 192.168.32.255.
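The arithmetic behind these block sizes is 2 raised to the number of host bits, where the host bits are 32 minus the prefix. A short sketch using Python's ipaddress module follows; block_summary is a name invented for this illustration.

```python
import ipaddress


def block_summary(block: str):
    """Return (host bits, address count, first address, last address)."""
    net = ipaddress.ip_network(block)
    # Host bits are what remains of the 32 bits after the prefix.
    host_bits = net.max_prefixlen - net.prefixlen
    return host_bits, net.num_addresses, str(net[0]), str(net[-1])
```

For 192.168.16.0/20 this gives 12 host bits and 4,096 addresses running from 192.168.16.0 to 192.168.31.255; for 192.168.32.0/24 it gives 8 host bits and 256 addresses.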

Each of these address blocks appears to the outside world to be a single “network” address. Thus, external routers have one route to the block 192.168.16.0/20 and one route to the block 192.168.32.0/24, regardless of the size of the address block. Internally, however, the organization may have several separate physical networks within the address block. The flexibility of address masks means that service providers can assign arbitrary length blocks of addresses to their customers, and the customers can subdivide those address blocks using different length masks.

Subnets

To locally modify the structure of an IP address, use host address bits as additional network address bits. Essentially, the dividing line between network address bits and host address bits can be moved to create additional networks, by reducing the maximum number of hosts that can belong to each network. These newly designated network bits define an address block within the larger address block, which is called a subnet.

Organizations usually decide to subnet in order to overcome topological or organizational problems. Subnetting allows decentralized management of host addressing. With the standard addressing scheme, a central administrator is responsible for managing host addresses for the entire network. By subnetting, the administrator can delegate address assignment to smaller organizations within the overall organization—which may be a political expedient rather than a technical requirement. If you don’t want to deal with the data processing department, assign them their own subnet and let them manage it themselves.

Subnetting can also be used to overcome hardware differences and distance limitations. IP routers can link dissimilar physical networks together, but only if each physical network has its own unique network address. Subnetting divides a single address block into many subnet addresses, so that each physical network can have its own unique address.

A subnet is defined by changing the bit mask of the IP address. A subnet mask functions in the same way as a normal address mask: an “on” bit is interpreted as a network bit; an “off” bit belongs to the host part of the address. The difference is that a subnet mask is only used locally. Outside the organization, the address is still interpreted using the address mask that the outside world knows for the whole block.

Assume you have a small real estate business that has been assigned the address block 192.168.32.0/24. The bit mask associated with that address block is 255.255.255.0 and the block contains 256 addresses. Further, assume that your business has 10 offices each with a half-dozen computers and you want to allocate some addresses to each office and have some for future expansion. You can subdivide the 256-address block with a subnet mask that extends the network portion of the address by a few additional bits.

To subdivide 192.168.32.0/24 into 16 subnets, use the mask 255.255.255.240, i.e., 192.168.32.0/28. The first three bytes contain the original network address block; the fourth byte is divided between the subnet address and the address of the host on that subnet. Applying this mask defines the four high-order bits^[*] of the fourth byte as the subnet part of the address, and the remaining four bits—the last four bits of the fourth byte—as the host portion of the address. This creates 16 subnets, each containing 14 host addresses, which is better suited to the network topology of your small real estate business. Table 2-1 shows the subnets and host addresses produced by applying this subnet mask to network address 192.168.32.0/24.

Table 2-1. Effect of a subnet mask

Network number	Host address range	Broadcast address
192.168.32.0	192.168.32.1-192.168.32.14	192.168.32.15
192.168.32.16	192.168.32.17-192.168.32.30	192.168.32.31
192.168.32.32	192.168.32.33-192.168.32.46	192.168.32.47
192.168.32.48	192.168.32.49-192.168.32.62	192.168.32.63
192.168.32.64	192.168.32.65-192.168.32.78	192.168.32.79
192.168.32.80	192.168.32.81-192.168.32.94	192.168.32.95
192.168.32.96	192.168.32.97-192.168.32.110	192.168.32.111
192.168.32.112	192.168.32.113-192.168.32.126	192.168.32.127
192.168.32.128	192.168.32.129-192.168.32.142	192.168.32.143
192.168.32.144	192.168.32.145-192.168.32.158	192.168.32.159
192.168.32.160	192.168.32.161-192.168.32.174	192.168.32.175
192.168.32.176	192.168.32.177-192.168.32.190	192.168.32.191
192.168.32.192	192.168.32.193-192.168.32.206	192.168.32.207
192.168.32.208	192.168.32.209-192.168.32.222	192.168.32.223
192.168.32.224	192.168.32.225-192.168.32.238	192.168.32.239
192.168.32.240	192.168.32.241-192.168.32.254	192.168.32.255

In Table 2-1, the first row describes a subnet with a subnet number that is all zeros (the first four bits of the fourth byte are all set to 0). The last row in the table describes a subnet with a subnet number that is all ones (the first four bits of the fourth byte are all set to 1). Originally, the RFCs implied that you should not use subnet numbers of all zeros or all ones. However, RFC 1812, Requirements for IP Version 4 Routers, makes it clear that subnets of all zeros and all ones are legal and should be supported by all routers.

You don’t have to manually calculate a table like Table 2-1 to know what subnets and host addresses are produced by a subnet mask. The calculations have already been done for you. RFC 1878, Variable Length Subnet Table For IPv4, lists all possible subnet masks and the valid addresses they produce.
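If a table lookup is not convenient, the same values can be computed. The sketch below uses the subnets method of Python's ipaddress module to produce rows in the layout of Table 2-1; subnet_table is a hypothetical helper, and it assumes subnets large enough to contain at least two host addresses.

```python
import ipaddress


def subnet_table(block: str, new_prefix: int):
    """List (network, first host, last host, broadcast) for each subnet."""
    rows = []
    for subnet in ipaddress.ip_network(block).subnets(new_prefix=new_prefix):
        rows.append((
            str(subnet.network_address),
            str(subnet.network_address + 1),    # first assignable host
            str(subnet.broadcast_address - 1),  # last assignable host
            str(subnet.broadcast_address),
        ))
    return rows
```

Calling subnet_table("192.168.32.0/24", 28) returns 16 rows, starting with 192.168.32.0 (hosts 192.168.32.1 through 192.168.32.14, broadcast 192.168.32.15) and ending with 192.168.32.240.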

RFC 1878 describes all 32 prefix values. But little documentation is needed because the prefix is easy to understand and remember. Writing 10.104.0.19 as 10.104.0.19/8 shows that this address has 8 bits for the network number and therefore 24 bits for the host number. Unfortunately, things are not always this neat. Sometimes the address is not given an explicit address mask and you need to know how to determine the natural mask that an address is assigned by default.

The Natural Mask

Originally, the IP address space was divided into a few fixed-length structures called address classes. The three main address classes were class A, class B, and class C. IP software determined the class, and therefore the structure, of an address by examining its first few bits. Address classes are no longer used. However, the same rules that used to determine the address class are now used to create the default address mask, which is called the natural mask. These rules are as follows:

If the first bit of an IP address is 0, the default mask is 8 bits long (prefix 8). This is the same as the old class A network address format. The first 8 bits identify the network, and the last 24 bits identify the host.
If the first 2 bits of the address are 1 0, the default mask is 16 bits long (prefix 16), which is the same structure as the old class B network address format. The first 16 bits identify the network, and the last 16 bits identify the host.
If the first 3 bits of the address are 1 1 0, the default mask is 24 bits long (prefix 24). This mask is the same as the old class C network address format. In this format, the first 24 bits are the network address, and the last 8 bits identify the host.
If the first four bits of the address are 1 1 1 0, it is a multicast address. These addresses were sometimes called class D addresses, but they don’t really refer to specific networks. Multicast addresses are used to address groups of computers all at one time. Multicast addresses identify a group of computers that share a common application, such as a videoconference, as opposed to a group of computers that share a common network. All bits in a multicast address are significant for routing, so the default mask is 32 bits long (prefix 32). Figure 2-1 shows examples of class A, class B and class C addresses. A sample multicast address is 224.0.0.9.

Figure 2-1. Default IP address formats

When an IP address is written in dotted decimal format, it is sometimes easier to think of the address as four 8-bit bytes instead of as a 32-bit value. We can look at the address as composed of full bytes of network address and full bytes of host address when using the natural mask, because the three default masks all create prefix lengths that are multiples of eight. A simple way to determine the default mask is to look at the first byte of the address. If the value of the first byte is:

Less than 128: The default address mask is 8 bits long; the first byte is the network number, and the next three bytes are the host address.
128 to 191: The default address mask is 16 bits long; the first two bytes identify the network, and the last two bytes identify the host.
192 to 223: The default address mask is 24 bits; the first three bytes are the network address, and the last byte is the host number.
224 to 239: The entire address identifies a specific multicast group; therefore the default mask is 32 bits.
Greater than 239: The address is reserved. We can ignore reserved addresses.
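Both ways of finding the natural mask, by leading bits and by the value of the first byte, can be written down directly. This is an illustrative sketch only; the function names are assumptions, and None is used here to stand for a reserved address.

```python
import ipaddress
from typing import Optional


def natural_prefix_from_bits(address: str) -> Optional[int]:
    """Apply the leading-bit rules to find the default prefix length."""
    value = int(ipaddress.IPv4Address(address))
    if value >> 31 == 0b0:
        return 8     # old class A format
    if value >> 30 == 0b10:
        return 16    # old class B format
    if value >> 29 == 0b110:
        return 24    # old class C format
    if value >> 28 == 0b1110:
        return 32    # multicast: every bit is significant
    return None      # reserved


def natural_prefix_from_first_byte(address: str) -> Optional[int]:
    """Apply the first-byte rules; the result matches the bit-based rules."""
    first = int(address.split(".")[0])
    if first < 128:
        return 8
    if first <= 191:
        return 16
    if first <= 223:
        return 24
    if first <= 239:
        return 32
    return None
```

Both functions return 8 for 10.104.0.19, 16 for 172.16.12.1, 24 for 192.168.16.1, and 32 for the multicast address 224.0.0.9.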

Figure 2-1 illustrates the two techniques for determining the default address structure. The first address is 10.104.0.19. The first bit of this address is 0; therefore, the first 8 bits define the network, and the last 24 bits define the host. Explained in a byte-oriented manner, the first byte is less than 128, so the address is interpreted as host 104.0.19 on network 10. One byte specifies the network and three bytes specify the host.

The second address in Figure 2-1 is 172.16.12.1. The two high-order bits are 1 0, meaning that 16 bits define the network and 16 bits define the host. Viewed in a byte-oriented way, the first byte falls between 128 and 191, so the address refers to host 12.1 on network 172.16. Two bytes identify the network and two identify the host.

Finally, in the address 192.168.16.1, the three high-order bits are 1 1 0, indicating that 24 bits represent the network and 8 bits represent the host. The first byte of this address is in the range from 192 to 223, so this is the address of host 1 on network 192.168.16—three network bytes and one host byte.

Evaluating addresses according to the old class rules discussed above limits the length of network numbers to 8, 16, or 24 bits—1, 2, or 3 bytes. The IP address, however, is not really byte-oriented. It is 32 contiguous bits. The address bit mask provides a flexible way to define the network and host portions of an address. IP uses the network portion of the address to route the datagram between networks. The full address, including the host information, is used to identify an individual host. Because of the dual role of IP addresses, the flexibility of address masks not only makes more addresses available for use, it also has a positive impact on routing.

CIDR Blocks and Route Aggregation

The IP address, which provides universal addressing across all of the networks of the Internet, is one of the great strengths of the TCP/IP protocol suite. However, the original class structure of the IP address had weaknesses. The TCP/IP designers did not envision the enormous scale of today’s network. When TCP/IP was being designed, networking was limited to large organizations that could afford substantial computer systems. The idea of a powerful PC system on every desktop did not exist. At that time, a 32-bit address seemed so large that it was divided into classes to reduce the processing load on routers, even though dividing the address into classes sharply reduced the number of host addresses actually available for use. For example, assigning a single class B address instead of six class C addresses to a large network reduced the load on the router because the router needed to keep only one route for that entire organization. However, an organization that was assigned the class B address probably did not have 65,536 computers, so most of the host addresses available to that organization were never used.

The class-structured address design was critically strained by the rapid growth of the Internet. At one point, it appeared that all class B addresses might be rapidly exhausted. The rapid depletion of the class B addresses showed that three primary address classes were not enough: class A was much too large and class C was much too small. Even a class B address was too large for many networks but was used because it was better than the alternatives.

The obvious solution to the class B address crisis was to force organizations to use multiple class C addresses. There were millions of these addresses available and they were in no immediate danger of depletion. As is often the case, the obvious solution is not as simple as it may seem. In the core of the Internet, each class C address might require its own entry within the routing table. Assigning thousands or millions of class C addresses could cause the routing table to grow so rapidly for major network providers that their routers would soon be overwhelmed. The solution not only required the new way of looking at addresses that address masks provide but also required a new way of assigning addresses.

Originally, network addresses were assigned in more or less sequential order as they were requested. This worked fine when the network was small and centralized. However, it did not take network topology into account. Thus, only random chance would determine if the same intermediate routers would be used to reach network 195.4.12.0 and network 195.4.13.0, which makes it difficult to reduce the size of the routing table. Addresses can only be aggregated if they are contiguous numbers and are reachable through the same route. For example, if addresses are contiguous for one service provider, a single route can be created for that aggregation because that service provider will have a limited number of connections to the Internet. But if one network address is in France and the next contiguous address is in Australia, creating a consolidated route for these addresses is not possible.

Today, large, contiguous blocks of addresses are assigned to large network service providers in a manner that better reflects the topology of the network. The service providers then allocate chunks of these address blocks to the organizations to which they provide network services. Because the assignment of addresses reflects the topology of the network, it permits route aggregation. Under this scheme, we know that network 195.4.12.0 and network 195.4.13.0 are reachable through the same intermediate routers. In fact, both of these addresses are in the range of the addresses assigned to Europe (194.0.0.0 to 195.255.255.255).

Assigning addresses that reflect the topology of the network enables route aggregation but does not implement it. As long as network 195.4.12.0 and network 195.4.13.0 are interpreted as separate class C addresses, they require separate entries in the routing table. For this reason, address masks are included in routing table entries to ensure that destination addresses are interpreted correctly.

The use of an address mask instead of the old address classes to determine the destination network is called Classless Inter-Domain Routing (CIDR).^[*] Supporting CIDR required modifications to the routers and routing protocols. The protocols need to distribute, along with the destination addresses, address masks that define how the addresses are interpreted. Routers and hosts need to know how to interpret these addresses as “classless” addresses and how to apply the bit mask that accompanies the address. All new operating systems and routing protocols support address masks.
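Route aggregation can be illustrated with the collapse_addresses function from Python's ipaddress module, which merges contiguous blocks when they fit under a single shorter prefix. This is a sketch of the idea, not of how any particular router builds its table; the aggregate helper is a name chosen for this example.

```python
import ipaddress


def aggregate(blocks):
    """Merge contiguous address blocks into the fewest possible routes."""
    networks = [ipaddress.ip_network(block) for block in blocks]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]
```

The two networks from the text, 195.4.12.0/24 and 195.4.13.0/24, collapse into the single entry 195.4.12.0/23. By contrast, 195.4.13.0/24 and 195.4.14.0/24 are numerically adjacent but do not share a 23-bit prefix, so they remain two separate entries.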

CIDR was intended as an interim solution, though it has proved much more durable than its designers imagined. CIDR has provided address and routing relief for many years and is capable of providing it for many more years to come. Another innovation that has slowed the depletion of IP addresses is the standardization of private network numbers.

Private Network Numbers

Every interface on a TCP/IP network must have a unique IP address. If a host is directly connected to the Internet, its IP address must be unique within the entire Internet. If a host is connected to a private network, such as an enterprise network, its IP address only needs to be unique within that private network. RFC 1918, Address Allocation for Private Internets, lists network numbers that are reserved for private use.^[†] The private network numbers are:

Network 10.0.0.0 (10/8 prefix) is a 24-bit block of addresses.
Networks 172.16.0.0 to 172.31.0.0 (172.16/12 prefix) are a 20-bit block of addresses.
Networks 192.168.0.0 to 192.168.255.0 (192.168/16 prefix) are a 16-bit block of addresses.
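A membership test against these three blocks takes only a few lines. The sketch below checks the RFC 1918 ranges explicitly rather than relying on a library's broader notion of "private"; the names RFC1918_BLOCKS and is_rfc1918 are invented for this illustration.

```python
import ipaddress

# The three private blocks listed above, written in prefix notation.
RFC1918_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),       # 24-bit block
    ipaddress.ip_network("172.16.0.0/12"),    # 20-bit block
    ipaddress.ip_network("192.168.0.0/16"),   # 16-bit block
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address falls inside one of the private blocks."""
    addr = ipaddress.ip_address(address)
    return any(addr in block for block in RFC1918_BLOCKS)
```

For example, 172.31.255.255 is the last address in the 20-bit block and is private, while 172.32.0.0 lies just outside it.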

The disadvantage to using a network address from RFC 1918 is that you may have to change your address in the future if you directly connect your full network to the Internet and wish to make all of the systems on the network accessible from the Internet. There are some advantages to choosing a private network address:

It’s easy. You do not have to apply for an official address or get anyone’s approval.
It’s friendly. You save address space for those who need to connect to the Internet.
It’s free. RFC 1918 addresses cost nothing—public addresses cost money.

If you do choose an address from RFC 1918, the hosts on your network can still have access to systems on the Internet. But it will take some effort. You’ll need network address translation (NAT) or a proxy server. NAT is available as a separate piece of hardware or as a piece of software in some routers and firewalls. It works by converting the source address of datagrams leaving your network from your private address to your official address. Address translation has several advantages:

It conserves IP addresses. Most network connections are between systems on the same enterprise network. Only a small percentage of systems need to connect to the Internet at any one time. Therefore, far fewer official IP addresses are needed than the total number of systems on an enterprise network. NAT makes it possible to use a large address space from RFC 1918 for configuring the enterprise network while using only a small official address space for Internet connections.
It has some security advantages because it reduces address spoofing, a security attack in which a remote system pretends to be a local system. The addresses in RFC 1918 cannot be routed over the Internet. Therefore, even if a datagram is routed off of your network toward the remote system, the fact that the datagram contains an RFC 1918 destination address means that the routers in the Internet will discard the datagram.
It eliminates the need to renumber your hosts when you connect to the Internet.

Network address translation also has disadvantages:

NAT may add cost for new hardware or optional software. However, these costs tend to be very low.
Address translation adds overhead to the processing of every datagram. When the address is changed, the checksum must be recalculated. Furthermore, some upper-layer protocols carry a copy of the IP address that also must be converted.
Routers never modify the addresses in a datagram header, but NAT does. This might introduce some instability. Additionally, protocols and applications that embed addresses in their data may not function correctly with NAT.
NAT may impact end-to-end encryption and authentication. Authentication and encryption schemes that include the IP address within the calculation are affected because the NAT box changes the IP addresses. (See the description of IPSec in Chapter 1 for an example of a protocol that includes the IP address in authentication and encryption.)
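The per-datagram overhead mentioned above can be made concrete with a simplified sketch of source address translation. It assumes a 20-byte IPv4 header with no options, computes the header checksum as a ones-complement sum of 16-bit words (the method described in RFC 1071), and ignores the upper-layer checksums that a real NAT device must also adjust. The function names and the sample addresses are hypothetical.

```python
import ipaddress
import struct

# version/IHL, TOS, total length, ID, flags/fragment offset,
# TTL, protocol, header checksum, source address, destination address
HEADER_FORMAT = "!BBHHHBBH4s4s"


def header_checksum(header: bytes) -> int:
    """Ones-complement of the ones-complement sum of the 16-bit words."""
    words = struct.unpack(f"!{len(header) // 2}H", header)
    total = sum(words)
    while total >> 16:
        # Fold any carry out of the top 16 bits back into the sum.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_header(src: str, dst: str) -> bytes:
    """Build a minimal option-free header with a valid checksum."""
    fields = [0x45, 0, 20, 0, 0, 64, 17, 0,
              ipaddress.IPv4Address(src).packed,
              ipaddress.IPv4Address(dst).packed]
    fields[7] = header_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


def translate_source(header: bytes, official_src: str) -> bytes:
    """Replace the source address and recalculate the header checksum."""
    fields = list(struct.unpack(HEADER_FORMAT, header))
    fields[8] = ipaddress.IPv4Address(official_src).packed
    fields[7] = 0  # the checksum field is zero while the sum is computed
    fields[7] = header_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)
```

A header whose checksum is correct produces 0 when header_checksum is run over it again, which gives a simple way to confirm that the translated header is still valid after the address has been changed.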

Windows Server 2003 handles the NAT security problem by supporting NAT-Traversal (NAT-T). NAT-T resolves the potential IPSec encryption and authentication security problems caused by NAT. A system that supports NAT-T indicates its NAT-T capability during the IPSec connection negotiation. It also sends two messages, one containing a hash of the destination IP address and port, and the other containing a hash of the source IP address and port. The receiving system can then compare the hashed addresses to the ones it sees in the IP header. If the header addresses have been changed by an intervening NAT box, the fact is easily detected. When an intervening NAT box is detected, the IPSec end-systems continue by encapsulating the IPSec messages inside of UDP packets. The NAT box does its address translation on the UDP header leaving the IPSec messages unmolested. The data gets through, and the authentication and encryption are preserved.

Combining NAT with a private network address gives every host on the private network access to the outside world, but it does not allow outside users access into your network. For that level of direct access, you need to obtain an official IP address as described in Chapter 4.

CIDR and private network numbers have extended the usefulness of IPv4 addresses. However, the long-term solution for address depletion is to replace the current addressing scheme with a new one. In the TCP/IP protocol suite, addressing is defined by the IP protocol. Therefore, to define a new address structure, the Internet Engineering Task Force (IETF) created a new version of IP called IPv6.

IPv6

IPv6 provides an enormous 128-bit address to solve the address depletion problem. A 128-bit address can uniquely identify 3.4 × 10³⁸ devices. Of course, not all 128 bits are used to specify a device address. Like the IPv4 address, the IPv6 address has a structure that defines the network and the device on the network. Figure 2-2 is an example of the basic IPv6 address structure currently being assigned by the Internet Assigned Numbers Authority (IANA) as shown in RFC 3587, IPv6 Global Unicast Address Format.

Figure 2-2. Sample IPv6 unicast address format

The first three bits of the address shown in Figure 2-2 indicate the address type. A variable-length binary prefix determines the IPv6 address type. RFC 3513, Internet Protocol Version 6 (IPv6) Addressing Architecture, assigns the prefixes shown in Table 2-2.

Table 2-2. IPv6 address types

Assignment	Variable-length binary prefix
Special use	0000 0000
NSAP allocation	0000 001
IANA allocation	001
Link-local unicast addresses	1111 1110 10
Site-local unicast addresses	1111 1110 11
Multicast addresses	1111 1111
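The prefixes in Table 2-2 can be matched against the leading bits of an address. The following sketch copies the table into a list and compares binary strings; the names ADDRESS_TYPES and address_type are assumptions for this example, and the label "Unassigned" for addresses that match no row is likewise an assumption, since the table lists only some of the prefixes.

```python
import ipaddress

# Rows of Table 2-2: (assignment, variable-length binary prefix)
ADDRESS_TYPES = [
    ("Special use", "0000 0000"),
    ("NSAP allocation", "0000 001"),
    ("IANA allocation", "001"),
    ("Link-local unicast addresses", "1111 1110 10"),
    ("Site-local unicast addresses", "1111 1110 11"),
    ("Multicast addresses", "1111 1111"),
]


def address_type(address: str) -> str:
    """Find the Table 2-2 assignment whose prefix starts the address."""
    # Write the 128-bit address as a string of 0s and 1s.
    bits = format(int(ipaddress.IPv6Address(address)), "0128b")
    for assignment, prefix in ADDRESS_TYPES:
        if bits.startswith(prefix.replace(" ", "")):
            return assignment
    return "Unassigned"
```

For instance, any address that begins with the hexadecimal digits fe80 starts with the bits 1111 1110 10 and is therefore a link-local unicast address.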

The currently defined special-use addresses include the following:

Unspecified address: This is an address where all 128 bits are 0. The unspecified address is used to explicitly indicate that an address has not been assigned. For example, a client might use an unassigned address as the source address before being assigned an address by a configuration server.
Loopback address: This is an address where the first 127 bits are set to zeros followed by one bit set to 1. It serves the same purpose as the IPv4 loopback address.
IPv4-compatible IPv6 address format: This address contains 96 bits of zeros followed by a 32-bit IPv4 address. This type of address is used to tunnel IPv6 packets over IPv4 networks.
IPv4-mapped IPv6 address format: This address has 80 bits of zeros, 16 bits of ones, and then a 32-bit IPv4 address. This is used to represent an IPv4 address as an IPv6 address.
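These four special-use formats can be built directly from their bit descriptions. The sketch below uses Python's ipaddress module; the helper names are hypothetical, and 172.16.12.1 is reused from earlier in the chapter purely as a sample IPv4 address.

```python
import ipaddress

# All 128 bits are 0.
UNSPECIFIED = ipaddress.IPv6Address(0)

# 127 zero bits followed by a single 1 bit.
LOOPBACK = ipaddress.IPv6Address(1)


def ipv4_compatible(ipv4: str) -> ipaddress.IPv6Address:
    """96 bits of zeros followed by the 32-bit IPv4 address."""
    return ipaddress.IPv6Address(int(ipaddress.IPv4Address(ipv4)))


def ipv4_mapped(ipv4: str) -> ipaddress.IPv6Address:
    """80 bits of zeros, 16 bits of ones, then the 32-bit IPv4 address."""
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ipaddress.IPv4Address(ipv4)))
```

The module recognizes the results: UNSPECIFIED.is_unspecified and LOOPBACK.is_loopback are both true, and the ipv4_mapped attribute of a mapped address returns the embedded IPv4 address.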

Network Service Access Point (NSAP) addresses are used at the NSAPs that connect the global Internet. IANA addresses are the public IPv6 addresses an ISP would obtain for its network, in the same way an ISP uses public IPv4 addresses. (Chapter 4 provides more information about official address registries and how public addresses are obtained.) The addresses used to move packets across a global IPv6 Internet are of this type.

Link-local addresses and site-local addresses are analogous to IPv4 private network numbers. Site-local addresses are for private use within an enterprise. These addresses are not to be routed across a global Internet. Link-local addresses have an even smaller scope. Link-local addresses are private addresses limited to a single physical link. They cannot be routed even within the enterprise.

Finally, IPv6 also provides multicast addresses. These are used in exactly the same way as IPv4 multicast addresses. However, IPv6 has a related address called an anycast address. Like a multicast address, an anycast address identifies members of a group, but the anycast address references only one member of the group—the member that is closest to the source system. Despite the logical relationship of anycast addresses to multicast addresses, these two addresses are not syntactically related. Anycast addresses do not come from the multicast address space. Instead, anycast addresses are taken from the unicast address space.

The first three bits in the address format shown in Figure 2-2 indicate that this is a global unicast address from the address space currently assigned by IANA. This three-bit field is followed by the Global Routing Prefix. The Global Routing Prefix is the portion of the IPv6 address that is synonymous with the network portion of an IPv4 address. It is the portion of the address that is assigned to the enterprise by the address registry, and it is the portion evaluated by intermediate routers to move packets to the enterprise site.

The Subnet ID is a 16-bit field used to create subnets within the larger network. It is used just like subnets in IPv4, but unlike IPv4, it does not require taking bits away from the host-specific portion of the address. The IPv6 global unicast address structure shown in Figure 2-2 sets aside 16 bits specifically for subnetting.

The 64-bit Interface Identifier is the part of the IPv6 address analogous to the host portion of an IPv4 address. The Interface Identifier is assigned in a number of ways:

By using the MAC address of the interface
By DHCP
By PPP
By using a randomized value

DHCP and PPP are address assignment techniques used in IPv4. The pseudo-random value is a technique used to emulate the device-independent addressing used in IPv4. Using the MAC address is perhaps the most interesting of the address assignment methods because the availability of this type of address assignment means that the network interface can be self-configured. Here’s how. Ethernet interfaces, and many other types of network interfaces, have a unique physical layer address, called a Media Access Control (MAC) address, encoded in the device hardware. The IPv6 software can retrieve the address from the hardware and use it to create the Interface Identifier. The Interface Identifier uses a modified EUI-64 format. If the device uses a MAC address that complies with the EUI-64 format, the IPv6 software can use that address as an Interface Identifier with very little modification. In the far more common case where the device uses an IEEE 802.3 Ethernet-style 48-bit MAC address, the IPv6 software simply extends the MAC address to 64 bits by inserting the hexadecimal value FFFE between the company identifier and the vendor-supplied identifier of the 802.3 MAC address to create an EUI-64 compliant MAC address. It then modifies this address to create an Interface Identifier. In either case, the conversion is easily done in the IPv6 software without any external configuration servers or any special configuration input from the network administrator.
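The conversion from a 48-bit MAC address can be sketched in a few lines of Python. This is an illustration, not the code any particular IPv6 stack uses. The function name mac_to_interface_id is hypothetical, and the "modification" step is assumed to be the one defined for the modified EUI-64 format in the IPv6 addressing architecture: inverting the universal/local bit (the 0x02 bit) of the first octet. The MAC address is the one shown for the Ethernet adapter in the route print example later in this chapter.

```python
def mac_to_interface_id(mac: str) -> str:
    """Build a modified EUI-64 Interface Identifier from a 48-bit MAC address."""
    octets = bytes(int(part, 16) for part in mac.replace("-", ":").split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Insert FFFE between the 3-byte company identifier and the
    # 3-byte vendor-supplied identifier.
    eui64 = bytearray(octets[:3] + b"\xff\xfe" + octets[3:])
    # Assumed modification step: invert the universal/local bit.
    eui64[0] ^= 0x02
    # Format the 64 bits as four 16-bit hexadecimal groups.
    return ":".join(
        f"{(eui64[i] << 8) | eui64[i + 1]:04x}" for i in range(0, 8, 2)
    )

print(mac_to_interface_id("00-50-ba-3f-c2-5e"))  # 0250:baff:fe3f:c25e
```

Combined with a 64-bit prefix, this identifier forms a complete 128-bit address without any input from a configuration server.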

IPv6 addresses are written using a colon-hexadecimal syntax. The addresses are written as eight 16-bit values separated by colons. Leading zeros within a 16-bit value do not have to be written out, and a long string of zero values can be indicated by the use of a double-colon. An example will make this notation clear.

The multicast address used to address all routers can be written out as:

    FF01:0000:0000:0000:0000:0000:0000:0002

or the leading zeros could be dropped and the address could be written as:

    FF01:0:0:0:0:0:0:2

or the double colon syntax could be used in place of the contiguous run of zeros:

    FF01::2

The prefix length notation defined by CIDR can also be used with IPv6 addresses to identify a route, subnet, or address range. For example, FF00::/8 would match every address whose first eight bits are all ones (that is, every address that begins with FF), which is every multicast address.
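The three written forms of the all-routers address, and a prefix match against the multicast range, can be reproduced with Python's standard ipaddress module. This is only a sketch of the notation rules; note that the module prints hexadecimal digits in lowercase.

```python
import ipaddress

addr = ipaddress.IPv6Address("FF01:0000:0000:0000:0000:0000:0000:0002")

print(addr.exploded)    # ff01:0000:0000:0000:0000:0000:0000:0002
print(addr.compressed)  # ff01::2

# The multicast range: the first eight bits are all ones.
multicast = ipaddress.IPv6Network("ff00::/8")
print(addr in multicast)   # True
print(addr.is_multicast)   # True
```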

As you might imagine, the large address means that the IPv6 header is substantially larger than an IPv4 header. However, it is less complex and more easily processed. Figure 2-3 shows the IPv6 header format.

Figure 2-3. The IPv6 header format

The Version field specifies the version of IP. In an IPv4 header, this field contains the value 4 and in an IPv6 header, it contains 6.

Traffic class and Flow label are used to implement quality of service. Traffic class provides for differentiated service by identifying the type of data being carried in the datagram payload. For example, voice traffic might be given different handling than email traffic. The Flow label allows the source to request special handling for a sequence of packets, which is called a flow. This might be used to maintain sequence and timing within a flow of real-time data. Built-in support for Quality of Service (QoS) is one of the advantages of IPv6.

Payload length specifies the length of the packet that follows the IPv6 header. Next header identifies the type of header that follows the IPv6 header. (This is similar to the Protocol field in an IPv4 header.) The Hop limit field is similar to the Time-to-Live field of an IPv4 header. It is decremented by each router that handles the packet and is used to ensure that the packet is not caught in a routing loop. Last are the Source address and the Destination address.

Compare this header to the IPv4 header shown in Figure 1-5. This IPv6 header has fewer fields to process and the header is always a fixed length.
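Because the header is a fixed 40 bytes, it can be unpacked in a single step. The sketch below uses Python's struct module to split a header into the fields just described. The function name parse_ipv6_header and the sample header values are made up for illustration.

```python
import ipaddress
import struct

def parse_ipv6_header(data: bytes) -> dict:
    """Split the fixed 40-byte IPv6 header into its fields."""
    if len(data) < 40:
        raise ValueError("the fixed IPv6 header is 40 bytes long")
    first_word, payload_len, next_header, hop_limit, src, dst = struct.unpack(
        "!IHBB16s16s", data[:40]
    )
    return {
        "version": first_word >> 28,                   # top 4 bits
        "traffic_class": (first_word >> 20) & 0xFF,    # next 8 bits
        "flow_label": first_word & 0xFFFFF,            # low 20 bits
        "payload_length": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "source": ipaddress.IPv6Address(src),
        "destination": ipaddress.IPv6Address(dst),
    }

# Hypothetical header: version 6, 20-byte payload, next header 6 (TCP),
# hop limit 64, sent from the loopback address to itself.
sample = struct.pack(
    "!IHBB16s16s", 6 << 28, 20, 6, 64,
    ipaddress.IPv6Address("::1").packed,
    ipaddress.IPv6Address("::1").packed,
)
header = parse_ipv6_header(sample)
print(header["version"], header["next_header"], header["hop_limit"])  # 6 6 64
```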

The address structure and the header format are not the only things that have changed with IPv6. A new ICMP, called ICMPv6, was also created. And beyond that, network protocols in the layers above the Internet Layer also had to change. Changing something as fundamental as IP causes changes throughout the protocol stack. Switching from IPv4 to IPv6 is a major change that affects all layers of the network software.

The lack of demand for IPv6

IPv6 is an improvement on the IP protocol based on 20 years of operational experience. The original motivation for the new protocol was the threat of address depletion, which IPv6 solves with a very large 128-bit address space. The large address space also makes it possible to use a hierarchical address structure to reduce the burden on routers while still maintaining more than enough addresses for future network growth. But large addresses are only one of the benefits of the new protocol. Other benefits of IPv6 are:

Improved security built into the protocol
Simplified, fixed-length, word-aligned headers to speed header processing and reduce overhead
Improved techniques for handling header options
Improved quality of service (QoS) support
Support for automatic configuration

IPv6 has several good features, but it is still not widely used. In part, this is because enhancements to IPv4, improvements in hardware performance, and changes in the way that networks are configured have lessened the demand for the new features of IPv6.

A critical shortage of addresses has not yet materialized for three reasons:

CIDR makes the assignment of addresses more flexible, which in turn makes more addresses available and permits aggregation to reduce the burden on routers.
Private addresses and NAT have greatly reduced the demand for official addresses. Many organizations prefer to use private addresses for all systems on their internal networks because private addresses reduce the administrative burden and improve security.
Permanent, fixed address assignment is less common than dynamic address assignment. The majority of systems use dynamic addresses temporarily assigned by a configuration protocol such as DHCP.

The creation of the IPSec standards for IPv4 lessened the need for the security enhancements of IPv6. In fact, many of the security tools and features available for IPv4 systems are not being fully utilized, indicating that the demand for tools to secure the link may have been overestimated.

IPv6 eliminates hop-by-hop fragmentation, has a more efficient header design, and offers enhanced option processing. These things make it more efficient to process IPv6 packets than to handle IPv4 packets. However, for the vast majority of systems, this increased efficiency is unneeded because processing IP datagrams is a very minor task. Most systems exist at the edge of the network and handle relatively few communications packets. Processor speed and memory have increased enormously while hardware prices have fallen. Most managers would rather buy more hardware using the proven IPv4 protocol than undertake implementing the new IPv6 protocol just to save a few machine cycles. Only those systems located near the core of the network would truly benefit from this efficiency, and although important, those systems are relatively few in number.

All of these things have worked together to lessen the demand for IPv6. The lack of demand has limited the number of organizations that have adopted IPv6 as their primary communications protocol, and a large user community is the one thing that a protocol needs to be truly successful. We use communications protocols to communicate with other people. If there are not enough people using the protocol, we don’t feel the need to use it. IPv6 is still in the early-adopter phase. Most organizations do not use IPv6 at all, and many of those that do, use it only for experimental purposes. Between organizations, most IPv6 communications are encapsulated inside IPv4 datagrams and sent over the Internet inside IPv4 tunnels. It will be some time yet before it is the primary protocol of operational networks.

If you run an operational network, you should not be overly concerned with IPv6. The current generation of TCP/IP (IPv4), with the enhancements that CIDR and other extensions provide, should be more than adequate for your current network needs. On your network and on the Internet, you will most likely use IPv4 and 32-bit IP addresses. IPv4 is the version of IP on which this book focuses.

Internet Routing Architecture

Chapter 1 described the evolution of the Internet architecture over the years. Along with these architectural changes have come changes in the way that routing information is disseminated within the network.

In the original Internet structure, there was a hierarchy of gateways. This hierarchy reflected the fact that the Internet was built upon the existing ARPAnet. When the Internet was created, the ARPAnet was the backbone of the network: a central delivery medium to carry long-distance traffic. This central system was called the core, and the centrally managed gateways that interconnected it were called the core gateways.

In that hierarchical structure, routing information for all of the networks on the Internet was passed into the core gateways. The core gateways processed the information and then exchanged it among themselves using the Gateway to Gateway Protocol (GGP). The processed routing information was then passed back out to the external gateways. The core gateways maintained accurate routing information for the entire Internet.

Using the hierarchical core router model to distribute routing information has a major weakness: every route must be processed by the core. This places a tremendous processing burden on the core, and as the Internet grew larger, the burden increased. In network-speak, we say that this routing model does not “scale well.” For this reason, a new model emerged.

Even in the days of a single Internet core, groups of independent networks called autonomous systems (AS) existed outside of the core. The term “autonomous system” has a formal meaning in TCP/IP routing. An autonomous system is not merely an independent network. It is a collection of networks and gateways with its own internal mechanism for collecting routing information and passing it to other independent network systems. The routing information passed to the other network systems is called reachability information. Reachability information simply says which networks can be reached through that autonomous system. In the days of a single Internet core, autonomous systems passed reachability information into the core for processing. The Exterior Gateway Protocol (EGP) was the protocol used to pass reachability information between autonomous systems and into the core.

The new routing model is based on co-equal collections of autonomous systems called routing domains. Routing domains exchange routing information with other domains using Border Gateway Protocol (BGP). Each routing domain processes the information it receives from other domains. Unlike the hierarchical model, this model does not depend on a single core system to choose the “best” routes. Each routing domain does this processing for itself; therefore, this model is more expandable.

The problem with this model is this: how are “best” routes determined in a global network if there is no central routing authority, like the core, that is trusted to determine the “best” routes? In the days of the NSFNET, the policy-routing database (PRDB) was used to determine whether the reachability information advertised by an autonomous system was valid. But now, even the NSFNET does not play a central role.

To fill this void, NSF created the Routing Arbiter (RA) servers when it created the Network Access Points (NAPs) that provide interconnection points for the various service provider networks. A routing arbiter is located at each NAP. The server provides access to the Routing Arbiter Database (RADB), which replaced the PRDB. ISPs can query servers to validate the reachability information advertised by an autonomous system.

The RADB is only part of the Internet Routing Registry (IRR). As befits a distributed routing architecture, there are multiple organizations that validate and register routing information. The Europeans were the pioneers in this. The Réseaux IP Européens (RIPE) Network Coordination Centre (NCC) provides the routing registry for European IP networks. Big network carriers provide registries for their customers. All of the registries share a common format based on the RIPE-181 standard.

Many ISPs do not use the route servers. Instead, they depend on formal and informal bilateral agreements. In essence, two ISPs get together and decide what reachability information each will accept from the other. They create, in effect, private routing policies. Small ISPs have criticized the routing policies of the tier-one providers claiming that they limit competition. In response, tier-one providers have made the policies public to clarify the basis for the current architecture.

Creating an effective routing architecture continues to be a major challenge for the Internet, and the routing architecture will certainly evolve over time. No matter how it is derived, eventually the routing information winds up in your local gateway, where it is used by IP to make routing decisions.

The Routing Table

Gateways route data between networks, but all network devices, hosts as well as gateways, must make routing decisions. For most hosts, the routing decisions are simple:

If the destination host is on the local network, the data is delivered to the destination host.
If the destination host is on a remote network, the data is forwarded to a local gateway.

IP routing decisions are simply table look-ups. Packets are routed toward their destination as directed by the routing table.^[*] The routing table maps destinations to the router and network interface that IP must use to reach that destination. Examining the routing table on a Windows Server 2003 system shows this.

Use the route command with the print option to display the routing table. Here is a simple routing table from a small system:

    C:\>route print
     
    IPv4 Route Table
    ===========================================================================
    Interface List
    0x1 ........................... MS TCP Loopback interface
    0x10003 ...00 50 ba 3f c2 5e ...... D-Link DFE-530TX+ PCI Adapter
    ===========================================================================
    ===========================================================================
    Active Routes:
    Network Destination        Netmask         Gateway       Interface  Metric
              0.0.0.0          0.0.0.0     172.16.12.1     172.16.12.20     30
            127.0.0.0        255.0.0.0       127.0.0.1        127.0.0.1      1
          172.16.12.0    255.255.255.0    172.16.12.20     172.16.12.20     30
         172.16.12.20  255.255.255.255       127.0.0.1        127.0.0.1     30
        172.16.12.255  255.255.255.255    172.16.12.20     172.16.12.20     30
            224.0.0.0        240.0.0.0    172.16.12.20     172.16.12.20     30
      255.255.255.255  255.255.255.255    172.16.12.20     172.16.12.20      1
    Default Gateway:       172.16.12.1
    ===========================================================================
    Persistent Routes:
      None

The route print command displays the routing table in three sections:

Interface List: Lists the network interfaces used by TCP/IP. In the example, only the loopback interface and a single Ethernet interface are used.
Active Routes: Contains the bulk of the routing table. Active routes are routes that can be updated based on changing network conditions.
Persistent Routes: Lists static routes that have been manually defined by the system administrator and marked as persistent. These routes are not updated to reflect the current status of the network. Persistent routes are not usually required. However, Chapter 4 shows how manually defined routes are created and the effect they have on the routing table.

The routes listed in the Active Routes section are displayed with the following fields:

Network Destination: The value against which the destination IP address is matched.
Netmask: The address mask used to match an IP address to the value shown in the Network Destination field.
Gateway: The router used to reach the specified destination.
Interface: The address of the network interface used by the route.
Metric: The “cost” of the route. The metric is used to sort duplicate routes if any appear in the table. Beyond this, a dynamic routing protocol is required to make any use of the metric.

Each entry in the routing table starts with a destination value. The destination value is the key against which the IP address is matched to determine if this is the correct route to use to reach the IP address. The destination value is usually called the “destination network,” although it does not need to be a network address. The destination value can be a host address; it can be a multicast address; it can be an address block that covers an aggregation of many networks; it can be a special value for the default route or loopback address. In all cases, however, the Network Destination field contains the value against which the destination address from the IP packet is matched to determine if IP should use this route.

The Netmask field is the bit mask IP applies to the destination address from the packet to see if the address matches the destination value in the table. If a bit is “on” in the bit mask, the corresponding bit in the destination address is significant for matching the address. Thus, the address 172.16.12.183 would match the third entry in the sample table because ANDing the address with 255.255.255.0 yields 172.16.12.0.^[*]
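The mask comparison is a bitwise AND followed by an equality test. The following minimal Python sketch applies it to entries from the sample table; the function name matches is hypothetical, and 192.168.10.5 is an arbitrary address that is not in the table.

```python
import ipaddress

def matches(destination: str, network: str, netmask: str) -> bool:
    """Return True if destination ANDed with netmask equals the table value."""
    dest = int(ipaddress.IPv4Address(destination))
    mask = int(ipaddress.IPv4Address(netmask))
    return (dest & mask) == int(ipaddress.IPv4Address(network))

# The local network route: 172.16.12.183 AND 255.255.255.0 is 172.16.12.0.
print(matches("172.16.12.183", "172.16.12.0", "255.255.255.0"))  # True
# An address on another network does not match the local network route...
print(matches("192.168.10.5", "172.16.12.0", "255.255.255.0"))   # False
# ...but every address matches the default route, whose mask is all zeros.
print(matches("192.168.10.5", "0.0.0.0", "0.0.0.0"))             # True
```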

When an address matches an entry in the table, the Gateway field tells IP how to reach the specified destination. If the Gateway field contains the IP address of a router, the router is used. If the Gateway field contains the address of one of the system’s network interfaces, the destination network is a directly connected network and the “gateway” is one of the computer’s network interfaces.

The Interface field displays the address of the network interface used for each route. In the example, it is either the Ethernet interface that was assigned the address 172.16.12.20 or the loopback interface, which is always given the address 127.0.0.1. The destination, mask, gateway, and interface define the route.

The remaining field displays supporting information about the route. The Metric field displays a numeric cost associated with the route. The Metric value is only used when a routing protocol is run on the system. For the Windows server administrator, the heart of the routing table is the route, which is composed of the destination, the mask, the gateway, and the interface.

The first route displayed in the Active Routes section of this routing table is the default route, and the gateway specified in this entry is the default gateway. The default route uses one of the reserved network numbers mentioned earlier: 0.0.0.0. The default gateway is used whenever there is no specific route in the table for a destination network address. For example, this routing table has no entry for network 192.168.10.0. If IP receives a datagram addressed to this network, it will send the datagram to the default gateway 172.16.12.1. (The default gateway is also identified with the Default Gateway tag at the end of the Active Routes section.)

The second route displayed is the loopback route for the local host. This is the loopback address, mentioned earlier as a reserved network number, which is used to simplify software and reduce network load. Because every system uses the loopback route to send datagrams to itself, an entry for the loopback interface is in every host’s routing table. The loopback network is 127.0.0.0. The host address 127.0.0.1 associated with the loopback interface is often assigned the hostname localhost.

The third route displayed is the route to the local network (172.16.12.0). The gateway to this network is the Ethernet interface of the Windows system. The last three routes also use the Ethernet interface as their gateway. These three routes are:

A route for the network broadcast address 172.16.12.255
A route for multicast addresses
A route for the limited broadcast address 255.255.255.255

Finally, the fourth route is a route to the local host. This Windows system was assigned the address 172.16.12.20. A datagram sent to this address goes through the loopback interface because the gateway for the fourth route is 127.0.0.1. Without this route, datagrams from the local host addressed to 172.16.12.20 would be sent out over the Ethernet.

All of the gateways that appear in the routing table are on networks directly connected to the local system. In the sample shown above, this means that regardless of the destination address, the gateway addresses all begin with 172.16.12, which is the address of the local Ethernet, or 127.0.0, which is the address of the loopback network. These are the only networks to which this sample host is directly attached, and therefore the only networks to which it can directly deliver data. The gateways that this host uses to reach the rest of the Internet must be on its subnet.
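To see how the whole table is consulted, the sketch below loads the active routes from the sample route print output and selects a route for several destinations. One assumption goes beyond the text: when several entries match, the most specific one (the entry with the longest mask) is chosen, with the lower metric breaking ties. That is the usual rule for IP forwarding, but the function lookup is an illustration, not the Windows implementation.

```python
import ipaddress

# (destination, netmask, gateway, interface, metric) from the sample table.
ROUTES = [
    ("0.0.0.0", "0.0.0.0", "172.16.12.1", "172.16.12.20", 30),
    ("127.0.0.0", "255.0.0.0", "127.0.0.1", "127.0.0.1", 1),
    ("172.16.12.0", "255.255.255.0", "172.16.12.20", "172.16.12.20", 30),
    ("172.16.12.20", "255.255.255.255", "127.0.0.1", "127.0.0.1", 30),
    ("172.16.12.255", "255.255.255.255", "172.16.12.20", "172.16.12.20", 30),
    ("224.0.0.0", "240.0.0.0", "172.16.12.20", "172.16.12.20", 30),
    ("255.255.255.255", "255.255.255.255", "172.16.12.20", "172.16.12.20", 1),
]

def lookup(destination: str):
    """Return (gateway, interface) for the best matching route."""
    dest = int(ipaddress.IPv4Address(destination))
    candidates = []
    for network, netmask, gateway, interface, metric in ROUTES:
        mask = int(ipaddress.IPv4Address(netmask))
        if (dest & mask) == int(ipaddress.IPv4Address(network)):
            # Assumed preference: longest mask first, then lowest metric.
            candidates.append((mask, -metric, gateway, interface))
    if not candidates:
        return None
    _, _, gateway, interface = max(candidates)
    return gateway, interface

print(lookup("192.168.10.5"))   # no specific route: the default gateway
print(lookup("172.16.12.183"))  # local network: the Ethernet interface
print(lookup("172.16.12.20"))   # the local host: the loopback interface
```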

In Figure 2-4, the IP layer of two hosts and a gateway on our imaginary network is replaced by a small piece of a routing table, showing destination networks and the gateways used to reach those destinations. Assume that the address mask used for network 172.16.0.0 is 255.255.255.0. When the source host (172.16.12.2) sends data to the destination host (172.16.1.2), it applies the address mask to determine that it should look for the destination network address 172.16.1.0 in the routing table. The routing table in the source host shows that data bound for 172.16.1.0 is sent to gateway 172.16.12.3. The source host forwards the packet to the gateway. The gateway does the same steps and looks up the destination address in its routing table. Gateway 172.16.12.3 then makes direct delivery through its 172.16.1.5 interface. Examining the routing tables in Figure 2-4 shows that all systems list only gateways on networks to which they are directly connected. This is illustrated by the fact that 172.16.12.1 is the default gateway for both 172.16.12.2 and 172.16.12.3, but because 172.16.1.2 cannot reach network 172.16.12.0 directly, it has a different default route.

Figure 2-4. Table-based routing

A routing table does not contain end-to-end routes. A route points only to the next gateway, called the next hop, along the path to the destination network.^[*] The host relies on the local gateway to deliver the data, and the gateway relies on other gateways. As a datagram moves from one gateway to another, it should eventually reach one that is directly connected to its destination network. It is this last gateway that finally delivers the data to the destination host.
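Next-hop forwarding can be sketched as a loop in which each system consults only its own table. The tables below are reconstructed from the description of Figure 2-4 in the text and are partly assumed: the entries marked None (direct delivery) and the helper names network_of and trace are illustrative, and no table is modeled for the default gateway 172.16.12.1.

```python
import ipaddress

MASK = "255.255.255.0"  # address mask the text assumes for network 172.16.0.0

# Per-system tables: destination network -> next hop.
# None means the network is directly connected (direct delivery).
TABLES = {
    "172.16.12.2": {"172.16.12.0": None,
                    "172.16.1.0": "172.16.12.3",
                    "default": "172.16.12.1"},
    "172.16.12.3": {"172.16.12.0": None,
                    "172.16.1.0": None,
                    "default": "172.16.12.1"},
}

def network_of(address: str) -> str:
    """Apply the address mask to get the destination network."""
    masked = int(ipaddress.IPv4Address(address)) & int(ipaddress.IPv4Address(MASK))
    return str(ipaddress.IPv4Address(masked))

def trace(source: str, destination: str, max_hops: int = 8) -> list:
    """Follow next hops from source until direct delivery or an unknown table."""
    path = [source]
    current = source
    for _ in range(max_hops):
        table = TABLES.get(current)
        if table is None:
            break  # no table modeled for this gateway; the trace stops here
        net = network_of(destination)
        next_hop = table[net] if net in table else table["default"]
        if next_hop is None:
            path.append(destination)  # final, direct delivery
            break
        path.append(next_hop)
        current = next_hop
    return path

print(trace("172.16.12.2", "172.16.1.2"))
# ['172.16.12.2', '172.16.12.3', '172.16.1.2']
```

No single table in the sketch holds the complete path; it emerges only from the sequence of individual next-hop decisions.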

IP uses the network portion of the address to route the datagram between networks. The full address, including the host information, is used to make final delivery when the datagram reaches the destination network.

Address Resolution

The IP address and the routing table direct a datagram to a specific physical network, but when data travels across a network, it must obey the physical layer protocols used by that network. The physical networks that underlie the TCP/IP network do not understand IP addressing. Physical networks have their own addressing schemes. One task of the network access protocols is to map IP addresses to physical network addresses.

A good example of this network access layer function is the translation of IP addresses to Ethernet addresses. The protocol that performs this function is Address Resolution Protocol (ARP), which is defined in RFC 826.

The ARP software maintains a table of translations between IP addresses and Ethernet addresses. This table is built dynamically. When ARP receives a request to translate an IP address, it checks for the address in its table. If the address is found, it returns the Ethernet address to the requesting software. If the address is not found in the table, ARP broadcasts a packet to every host on the Ethernet. The packet contains the IP address for which an Ethernet address is sought. If a receiving host identifies the IP address as its own, it responds by sending its Ethernet address back to the requesting host. The response is then cached in the ARP table.
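The lookup, broadcast, and cache sequence can be modeled in a few lines. The following Python sketch is a simulation only: the class ArpCache is hypothetical, and a plain dictionary stands in for the hosts on the Ethernet that would answer a real broadcast. The addresses are taken from the sample ARP table shown later in this section.

```python
class ArpCache:
    """Toy model of a dynamically built ARP table."""

    def __init__(self, hosts_on_wire: dict):
        # Stand-in for the Ethernet: IP address -> Ethernet address
        # of each host that would answer a broadcast for its own IP.
        self.hosts_on_wire = hosts_on_wire
        self.table = {}
        self.broadcasts = 0

    def resolve(self, ip: str):
        # 1. Check the table first.
        if ip in self.table:
            return self.table[ip]
        # 2. Not found: broadcast a request to every host.
        self.broadcasts += 1
        mac = self.hosts_on_wire.get(ip)  # only the owner of the IP replies
        # 3. Cache the response for later requests.
        if mac is not None:
            self.table[ip] = mac
        return mac

arp = ArpCache({"192.168.0.2": "00-e0-4c-9b-99-19",
                "192.168.0.3": "00-00-c0-9a-72-ca"})
print(arp.resolve("192.168.0.2"))  # answered by broadcast, then cached
print(arp.resolve("192.168.0.2"))  # answered from the table
print(arp.broadcasts)              # 1
```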

The arp command displays or modifies the contents of the ARP table. To display the entire ARP table, use the arp -a command. Display individual entries by specifying the individual host after the -a argument on the arp command line. For example, to check the ARP table entry for IP address 192.168.0.2 enter:

    C:\>arp -a 192.168.0.2
     
    Interface: 192.168.0.20 --- 0x10003
      Internet Address      Physical Address      Type
      192.168.0.2           00-e0-4c-9b-99-19     dynamic

Check all entries in the table by using the -a option with no host address. arp -a produces the following output:

    C:\>arp -a
     
    Interface: 192.168.0.20 --- 0x10003
      Internet Address      Physical Address      Type
      192.168.0.2           00-e0-4c-9b-99-19     dynamic
      192.168.0.3           00-00-c0-9a-72-ca     dynamic
      192.168.0.12          00-10-a4-8b-8b-97     static

This table tells you that when this host forwards datagrams addressed to 192.168.0.2, it puts those datagrams into Ethernet frames and sends them to Ethernet address 00-e0-4c-9b-99-19.

Two of the entries in the sample table were added dynamically as a result of ARP queries by the local host. These entries are of the type dynamic. The other entry is a static entry added manually by the Windows administrator. We know this because it is of the type static.
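If you need to examine the table from a script, the arp -a output is easy to pick apart. The sketch below parses the sample output shown above and separates dynamic from static entries. The function parse_arp is hypothetical, and the parsing assumes the three-column layout of this sample; other Windows versions may format the output differently.

```python
ARP_OUTPUT = """\
Interface: 192.168.0.20 --- 0x10003
  Internet Address      Physical Address      Type
  192.168.0.2           00-e0-4c-9b-99-19     dynamic
  192.168.0.3           00-00-c0-9a-72-ca     dynamic
  192.168.0.12          00-10-a4-8b-8b-97     static
"""

def parse_arp(text: str) -> list:
    """Return (ip, mac, type) tuples for each entry line."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Entry lines have exactly three fields ending in the entry type;
        # the interface and column-heading lines do not.
        if len(fields) == 3 and fields[2] in ("dynamic", "static"):
            entries.append(tuple(fields))
    return entries

entries = parse_arp(ARP_OUTPUT)
static = [ip for ip, mac, kind in entries if kind == "static"]
print(len(entries), static)  # 3 ['192.168.0.12']
```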

ARP tables normally don’t require any static entries because they are built automatically by the ARP protocol, which is very stable. However, if things go wrong, the ARP table can be manually adjusted, as indicated by the static entry in the sample table. See Chapter 14 for an example of when a static ARP table entry might be useful.

Protocols, Ports, and Sockets

Once data has been routed through the network and delivered to a specific host, it must be delivered to the correct user or process. As the data moves up or down the TCP/IP layers, a mechanism is needed to deliver it to the correct protocols in each layer. The system must be able to combine data from many applications into a few transport protocols, and then from the transport protocols into the Internet Protocol. Combining many sources of data into a single data stream is called multiplexing.

Data arriving from the network must be demultiplexed: divided for delivery to multiple processes. To accomplish this task, IP uses protocol numbers to identify transport protocols, and the transport protocols use port numbers to identify applications.

Some protocol and port numbers are reserved to identify well-known services. Well-known services are standard network protocols, such as FTP and Telnet, which are commonly used throughout the network. The IANA assigns protocol numbers and port numbers to well-known services. Officially assigned numbers are documented at the web site http://www.iana.org. Windows Server 2003 systems document protocol and port numbers in two simple text files.

Protocol Numbers

The protocol number is a single byte in the third word of the datagram header. The value identifies the protocol in the layer above IP to which the data should be passed.

On a Windows system, the protocol numbers are documented in the protocol file.^[1] This file is a simple table containing the protocol name and the protocol number associated with that name. The format of the table is a single entry per line, consisting of the official protocol name, separated by whitespace from the protocol number. The protocol number is separated by whitespace from the alias for the protocol name. Comments in the table begin with a #. An example of a protocol file is shown below:

    C:\>type %SystemRoot%\system32\drivers\etc\protocol
    # Copyright (c) 1993-1999 Microsoft Corp.
    #
    # This file contains the Internet protocols as defined by RFC 1700
    # (Assigned Numbers).
    #
    # Format:
    #
    # <protocol name>  <assigned number>  [aliases...]   [#<comment>]
     
    ip       0     IP       # Internet protocol
    icmp     1     ICMP     # Internet control message protocol
    ggp      3     GGP      # Gateway-gateway protocol
    tcp      6     TCP      # Transmission control protocol
    egp      8     EGP      # Exterior gateway protocol
    pup      12    PUP      # PARC universal packet protocol
    udp      17    UDP      # User datagram protocol
    hmp      20    HMP      # Host monitoring protocol
    xns-idp  22    XNS-IDP  # Xerox NS IDP
    rdp      27    RDP      # "reliable datagram" protocol
    rvd      66    RVD      # MIT remote virtual disk

The listing above is the contents of the protocol file from a sample Windows Server 2003 system. This list of numbers is by no means complete. If you refer to the Protocol Numbers section of the IANA web site, you’ll see many more protocol numbers. However, even the limited list shown here contains some protocols that this system doesn’t use, but the additional entries do no harm. The protocol table is only used to map protocol numbers to names for programs that reference protocols by name or for programs that wish to display names for protocol numbers. The protocol numbers are included in the TCP/IP software through header files.

What exactly do the numbers in this table mean? When a datagram arrives and its destination address matches the local IP address, the IP layer knows that the datagram has to be delivered to one of the transport protocols above it. To decide which protocol should receive the datagram, IP looks at the datagram’s protocol number. Using this table, you can see that if the datagram’s protocol number is 6, IP delivers the datagram to TCP. If the protocol number is 17, IP delivers the datagram to UDP. TCP and UDP are the two transport layer services we are concerned with, but all of the protocols listed in the table use IP datagram delivery service directly. Some, such as ICMP, EGP, and GGP, have already been mentioned. Others haven’t, but you don’t need to be concerned with the minor protocols in order to configure and manage a TCP/IP network.
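The mapping from protocol number to protocol name can be sketched by parsing a few lines in the protocol file format. This illustrates the table lookup only; as noted above, the IP software itself takes the numbers from header files rather than from this file. The function parse_protocols is hypothetical, and the embedded text is a subset of the sample file.

```python
PROTOCOL_FILE = """\
# <protocol name>  <assigned number>  [aliases...]   [#<comment>]

ip       0     IP       # Internet protocol
icmp     1     ICMP     # Internet control message protocol
tcp      6     TCP      # Transmission control protocol
udp      17    UDP      # User datagram protocol
"""

def parse_protocols(text: str) -> dict:
    """Map each protocol number to its official protocol name."""
    by_number = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        name, number, *aliases = line.split()
        by_number[int(number)] = name
    return by_number

protocols = parse_protocols(PROTOCOL_FILE)
# A datagram whose protocol number is 6 goes to TCP; 17 goes to UDP.
print(protocols[6], protocols[17])  # tcp udp
```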

Port Numbers

After IP passes incoming data to the transport protocol, the transport protocol passes the data to the correct application process. Application processes (also called network services) are identified by port numbers, which are 16-bit values. The source port number, which identifies the process that sent the data, and the destination port number, which identifies the process that is to receive the data, are contained in the first header word of each TCP segment and UDP packet.
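Because both port numbers are 16-bit values packed into the first 32-bit header word, they can be pulled out with a single unpack. The Python sketch below shows one way to do that; `ports_from_header` is a name made up for this example, and the header bytes are fabricated rather than captured from a network.

```python
import struct

def ports_from_header(transport_header):
    """Return (source_port, destination_port) from the first 32-bit header word."""
    # "!HH" means network byte order, two unsigned 16-bit integers.
    return struct.unpack("!HH", transport_header[:4])

# A made-up first header word: source port 3044, destination port 23.
first_word = struct.pack("!HH", 3044, 23)
print(ports_from_header(first_word))   # (3044, 23)
```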

Port numbers below 1024 are reserved for well-known services (like FTP and Telnet) and are assigned by the IANA. Well-known port numbers (those below 1024) are considered “privileged ports,” which should not be bound to a user process. Ports numbered from 1024 to 49151 are “registered ports.” IANA tries to maintain a registry of services that use these ports, but it does not officially assign port numbers in this range. The port numbers from 49152 to 65535 are the “private ports.” Private port numbers are available for any use.
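The three ranges can be summarized in a few lines of Python. This is only a sketch of the boundaries given in the text; the function name `port_range` is an assumption of this example.

```python
def port_range(port):
    """Classify a port number using the ranges described in the text."""
    if not 0 <= port <= 65535:
        raise ValueError("port numbers are 16-bit values")
    if port < 1024:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "private"

for port in (23, 1024, 49151, 49152):
    print(port, port_range(port))
```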

Port numbers are not unique between transport layer protocols; the numbers are only unique within a specific transport protocol. In other words, TCP and UDP can, and do, both assign the same port numbers. It is the combination of protocol and port numbers that uniquely identifies the specific process to which the data should be delivered.

On Windows Server 2003 systems, port numbers are listed in the services file in the %SystemRoot%\system32\drivers\etc directory. There are many more network applications than there are transport layer protocols, as the size of the services table shows. A partial listing of the Windows services file follows:

    # Copyright (c) 1993-1999 Microsoft Corp.
    #
    # This file contains port numbers for well-known services defined by IANA
    #
    # Format:
    #
    # <service name>  <port number>/<protocol>  [aliases...]   [#<comment>]
    #
     
    echo                7/tcp
    echo                7/udp
    discard             9/tcp    sink null
    discard             9/udp    sink null
    systat             11/tcp    users                  #Active users
    systat             11/udp    users                  #Active users
    daytime            13/tcp
    daytime            13/udp
    qotd               17/tcp    quote                  #Quote of the day
    qotd               17/udp    quote                  #Quote of the day
    chargen            19/tcp    ttytst source          #Character generator
    chargen            19/udp    ttytst source          #Character generator
    ftp-data           20/tcp                           #FTP, data
    ftp                21/tcp                           #FTP. control
    telnet             23/tcp
    smtp               25/tcp    mail            #Simple Mail Transfer Protocol
    time               37/tcp    timserver
    time               37/udp    timserver
    rlp                39/udp    resource        #Resource Location Protocol
    nameserver         42/tcp    name                   #Hostname Server
    nameserver         42/udp    name                   #Hostname Server
    nicname            43/tcp    whois
    domain             53/tcp                           #Domain Name Server
    domain             53/udp                           #Domain Name Server

The format of this file is very similar to the protocol file. Each single-line entry starts with the official name of the service separated by whitespace from the port number/protocol pairing associated with that service. The port numbers are paired with transport protocol names because different transport protocols may use the same port number. An optional list of aliases for the official service name may be provided after the port number/protocol pair.
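To make the file format concrete, here is a small Python sketch that parses entries of this form into a table keyed by the port number and protocol pair. The name `parse_services` and the shape of the returned dictionary are choices made for this example, not part of any Windows API.

```python
# Illustrative only: a few entries in the format of the services file.
SAMPLE = """\
echo                7/tcp
echo                7/udp
discard             9/tcp    sink null
smtp               25/tcp    mail            #Simple Mail Transfer Protocol
domain             53/udp                           #Domain Name Server
"""

def parse_services(text):
    """Map (port, protocol) -> (official service name, list of aliases)."""
    services = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop the trailing comment
        if not line:
            continue
        fields = line.split()
        name, pair, aliases = fields[0], fields[1], fields[2:]
        port, protocol = pair.split("/")
        services[(int(port), protocol)] = (name, aliases)
    return services

services = parse_services(SAMPLE)
print(services[(9, "tcp")])    # ('discard', ['sink', 'null'])
print(services[(7, "udp")])    # ('echo', [])
```

Keying the table on the pair, rather than on the port number alone, reflects the point that TCP port 7 and UDP port 7 are separate entries.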

This file, combined with the protocol file, provides all of the information necessary to deliver data to the correct application. A datagram arrives at its destination based on the destination address in the fifth word of the datagram header. Using the protocol number in the third word of the datagram header, IP delivers the data from the datagram to the proper transport layer protocol. The first word of the data delivered to the transport protocol contains the destination port number that tells the transport protocol to pass the data up to a specific application. Figure 2-5 shows this delivery process.

Figure 2-5. Protocol and port numbers
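The three lookups in this delivery process (destination address, protocol number, destination port) can be sketched in Python against a hand-built datagram. Everything here is fabricated for illustration: `deliver` is a hypothetical helper, the header checksum is left at zero, and the port lookup only makes sense when the protocol is TCP or UDP.

```python
import socket
import struct

def deliver(datagram):
    """Return (destination_address, protocol_number, destination_port)."""
    header_len = (datagram[0] & 0x0F) * 4              # header length in bytes
    protocol = datagram[9]                             # third word of the header
    destination = socket.inet_ntoa(datagram[16:20])    # fifth word of the header
    payload = datagram[header_len:]                    # data handed to the transport
    _source_port, destination_port = struct.unpack("!HH", payload[:4])
    return destination, protocol, destination_port

# A made-up 20-byte IPv4 header (protocol 6 = TCP) followed by the
# first word of a transport header (source port 3382, destination port 23).
ip_header = struct.pack(
    "!BBHHHBBH4s4s",
    0x45, 0, 24,          # version 4 with 5-word header, type of service, total length
    0, 0,                 # identification, flags and fragment offset
    64, 6, 0,             # time to live, protocol number, checksum (not computed)
    socket.inet_aton("172.16.12.2"),
    socket.inet_aton("192.168.16.2"),
)
first_transport_word = struct.pack("!HH", 3382, 23)

print(deliver(ip_header + first_transport_word))
# ('192.168.16.2', 6, 23)
```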

Despite its size, the services file does not contain the port number of every important network service. You won’t find the port number of every Remote Procedure Call (RPC) service in the services file. Sun Microsystems developed a different technique for reserving ports for RPC services that doesn’t involve getting a well-known port number assignment from IANA. RPC services generally use registered port numbers, which do not need to be officially assigned. When an RPC service starts, it registers its port number with the portmapper, which is a program that keeps track of the port numbers being used by RPC services. When a client wants to use an RPC service, it queries the portmapper running on the server to discover the port assigned to the service. The client can find portmapper because it is assigned well-known port 111. Portmapper makes it possible to install widely used services without formally obtaining a well-known port. Windows Server 2003 fully supports the portmapper and RPC services.
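The register-then-query pattern can be modeled with a small in-memory table. The Python sketch below is a toy, not the portmapper wire protocol: the class name `ToyPortmapper` is invented, and keying the table on an RPC program number, version, and protocol is an assumption drawn from how portmappers commonly work rather than from the text above. The program number and port in the usage lines are example values only.

```python
class ToyPortmapper:
    """In-memory stand-in for the table a portmapper keeps (illustration only)."""

    def __init__(self):
        self._ports = {}

    def register(self, program, version, protocol, port):
        """What an RPC service does at startup: record the port it is using."""
        self._ports[(program, version, protocol)] = port

    def lookup(self, program, version, protocol):
        """What a client asks for; 0 here means nothing is registered."""
        return self._ports.get((program, version, protocol), 0)

portmapper = ToyPortmapper()
portmapper.register(100003, 3, "tcp", 2049)      # example values only
print(portmapper.lookup(100003, 3, "tcp"))       # 2049
print(portmapper.lookup(100003, 3, "udp"))       # 0
```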

Sockets

Well-known ports are standardized port numbers that enable remote computers to know which port to connect to for a particular network service. This simplifies the connection process because both the sender and receiver know in advance that data bound for a specific process will use a specific port. For example, all systems that offer Telnet do so on port 23.

Equally important is a second type of port number called a dynamically allocated port. As the name implies, dynamically allocated ports are not preassigned. They are assigned to processes when needed. The system ensures that it does not assign the same port number to two processes and also that the numbers assigned are above the range of well-known port numbers, i.e., 1024 or higher.

Dynamically allocated ports provide the flexibility needed to support multiple users. If a Telnet user were assigned port number 23 for both the source and destination ports, what port numbers would be assigned to the second concurrent Telnet user? To uniquely identify every connection, the source port is assigned a dynamically allocated port number, and the well-known port number is used for the destination port.

In the Telnet example, the first user is given a random source port number and a destination port number of 23 (Telnet). The second user is given a different random source port number and the same destination port. It is the pair of port numbers, source and destination, that uniquely identifies each network connection. The destination host knows the source port because it is provided in both the TCP segment header and the UDP packet header. Both hosts know the destination port because it is a well-known port.

Figure 2-6 shows the exchange of port numbers during the TCP handshake. The source host randomly generates a source port, in this example 3044. It sends out a segment with a source port of 3044 and a destination port of 23. The destination host receives the segment, and responds back using 23 as its source port and 3044 as its destination port.

The combination of an IP address and a port number is called a socket. A socket uniquely identifies a single network process within the entire Internet. Sometimes the terms “socket” and “port number” are used interchangeably. In fact, well-known services are frequently referred to as “well-known sockets.” In the context of this discussion, a “socket” is the combination of an IP address and a port number. A pair of sockets, one socket for the receiving host and one for the sending host, define the connection for connection-oriented protocols such as TCP.

Figure 2-6. Passing port numbers

Let’s build on the example of dynamically assigned ports and well-known ports. Assume a user on host 172.16.12.2 uses Telnet to connect to host 192.168.16.2. Host 172.16.12.2 is the source host. The user is dynamically assigned a unique port number—3382. The connection is made to the Telnet service on the remote host that is, according to the standard, assigned well-known port 23. The socket for the source side of the connection is 172.16.12.2:3382 (IP address 172.16.12.2 plus port number 3382). For the destination side of the connection, the socket is 192.168.16.2:23 (address 192.168.16.2 plus port 23). The port of the destination socket is known by both systems because it is a well-known port. The port of the source socket is known by both systems, because the source host informed the destination host of the source socket when the connection request was made. The socket pair is therefore known by both the source and destination computers. The combination of the two sockets uniquely identifies this connection; no other connection in the Internet has this socket pair.
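The socket pair can be observed directly from a program. The Python sketch below opens one TCP connection over the loopback interface and prints each end's view of it. It makes two assumptions worth stating: the loopback address 127.0.0.1 stands in for the two hosts in the example, and the listening port is chosen by the system (by binding to port 0) instead of being well-known port 23, which would normally require elevated privileges. The function name `socket_pair_demo` is invented for this example.

```python
import socket

def socket_pair_demo():
    """Open one loopback TCP connection and return both ends' views of it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the system pick a free port
    listener.listen(1)
    server_socket = listener.getsockname()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(server_socket)        # the system assigns the source port
    accepted, _ = listener.accept()

    # Each view is (local socket, remote socket).
    client_view = (client.getsockname(), client.getpeername())
    server_view = (accepted.getsockname(), accepted.getpeername())

    for s in (client, accepted, listener):
        s.close()
    return client_view, server_view

client_view, server_view = socket_pair_demo()
print("client sees:", client_view)
print("server sees:", server_view)
```

The two views contain the same two sockets in opposite order, which is what it means for the socket pair to be known to both ends.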

Use the netstat command to see the active sockets on your Windows Server 2003 system. The -a command-line argument directs the netstat command to show the active sockets and the -n argument shows the sockets as numeric IP addresses and ports. Here is an example:

    D:\>netstat -na
     
    Active Connections
     
     Proto  Local Address        Foreign Address       State
     TCP    0.0.0.0:135          0.0.0.0:0             LISTENING
     TCP    0.0.0.0:445          0.0.0.0:0             LISTENING
     TCP    0.0.0.0:1025         0.0.0.0:0             LISTENING
     TCP    0.0.0.0:1026         0.0.0.0:0             LISTENING
     TCP    0.0.0.0:1029         0.0.0.0:0             LISTENING
     TCP    192.168.0.20:135     192.168.0.12:32802    ESTABLISHED
     TCP    192.168.0.20:139     0.0.0.0:0             LISTENING
     UDP    0.0.0.0:445          *:*
     UDP    0.0.0.0:500          *:*
     UDP    0.0.0.0:1027         *:*
     UDP    0.0.0.0:4500         *:*
     UDP    127.0.0.1:123        *:*
     UDP    192.168.0.20:67      *:*
     UDP    192.168.0.20:68      *:*
     UDP    192.168.0.20:123     *:*
     UDP    192.168.0.20:137     *:*
     UDP    192.168.0.20:138     *:*
     UDP    192.168.0.20:2535    *:*

This sample server has active TCP sockets and UDP sockets, as shown by the values in the Proto field of the netstat output. The Local Address column shows the sockets on which the server is actively listening for inbound traffic or on which it is actively communicating with a remote host. When the IP address in the Local Address field is 0.0.0.0, it means the server is listening on every address assigned to the local system’s network interfaces. Note that even if the IP address of the Local Address is 0.0.0.0, a specific port number is always used. The port number maps to the specific application protocol that services the socket. When a specific IP address is displayed in the Local Address, it means the local system will only accept traffic for the socket on the network interface that is assigned that specific address.
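Reading a Local Address entry amounts to splitting it at the last colon and checking for the 0.0.0.0 wildcard. A minimal Python sketch, with `describe_local_address` as a made-up helper name and IPv4 entries assumed:

```python
def describe_local_address(field):
    """Split a netstat Local Address field into (port, listening scope)."""
    address, _, port = field.rpartition(":")
    if address == "0.0.0.0":
        scope = "all local addresses"
    else:
        scope = "only " + address
    return int(port), scope

print(describe_local_address("0.0.0.0:135"))        # (135, 'all local addresses')
print(describe_local_address("192.168.0.20:139"))   # (139, 'only 192.168.0.20')
```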

In the Foreign Address column, 0.0.0.0:0 means that input from any port or any address is accepted, while a specific address or port means that only traffic originating at that specific host or port will be accepted. Notice that UDP uses *:* instead of 0.0.0.0:0 in this column when UDP will accept input from any port on any host. The different format is used to clearly indicate that UDP is not a connection-oriented protocol and that no connection to the remote address will be made. For the same reason, UDP does not maintain any connection state and therefore the State column is unused for the UDP section of the output. TCP does maintain state. The output shows that most of the sockets are listening for connections and that one connection is currently established. Notice that the connection is between port 135 on the local host and port 32802 on a remote host. Further, notice that 135 is still listening for connections. The well-known port, 135 in this case, is free to listen for more connections even though a connection already exists that uses that well-known port, because it is the pair of sockets that defines a connection, not the well-known port.
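The same behavior can be reproduced in a few lines of Python: one listening port accepts two connections, and each connection is distinguished by the remote half of its socket pair. This is a loopback sketch under stated assumptions; the listening port is picked by the system rather than being port 135, and `two_clients_one_port` is a name invented for the example.

```python
import socket

def two_clients_one_port():
    """Accept two loopback connections on one listening port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the system pick a free port
    listener.listen(2)
    server_socket = listener.getsockname()

    open_sockets, pairs = [listener], []
    for _ in range(2):
        client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        client.connect(server_socket)
        accepted, remote = listener.accept()   # listener keeps listening afterwards
        open_sockets += [client, accepted]
        pairs.append((accepted.getsockname(), remote))

    for s in open_sockets:
        s.close()
    return server_socket, pairs

server_socket, pairs = two_clients_one_port()
for local, remote in pairs:
    print("local", local, "remote", remote)
```

Both connections report the same local socket; only the remote socket differs, which is enough to keep them apart.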

This netstat example illustrates how sockets are used on your system and how you can view them in action. There is much more about the netstat command in Chapter 14.

Summary

This chapter showed how data moves through the global Internet from one specific process on the source computer to a single cooperating process on the other side of the world. TCP/IP uses globally unique addresses to identify any computer on the Internet. It uses protocol numbers and port numbers to uniquely identify a single process running on that computer.

Routing directs the datagrams destined for a remote process through the maze of the global network. Routing uses part of the IP address to identify the destination network. Every system maintains a routing table that describes how to reach remote networks. The routing table usually contains a default route that is used if the table does not contain a specific route to the remote network. A route only identifies the next computer along the path to the destination. TCP/IP uses hop-by-hop routing to move datagrams one step closer to the destination until the datagram finally reaches the destination network.

At the destination network, final delivery is made by using the full IP address (including the host part) and converting that address to a physical layer address. An example of the type of protocol used to convert IP addresses to physical layer addresses is Address Resolution Protocol (ARP). It converts IP addresses to Ethernet addresses for final delivery.

The first two chapters described the structure of the TCP/IP protocol stack and the way in which it moves data across a network. In the next chapter, we move up the protocol stack to look at the type of services the network provides to simplify configuration and use.

^[*]Addresses are occasionally written in other formats, e.g., as hexadecimal numbers. Whatever the notation, the structure and meaning of the address are the same.

^[†]This is only partially true. Multicasting is not supported by every router. Sometimes it is necessary to tunnel through routers and networks by encapsulating the multicast packet inside a unicast packet.

^[*]The high-order bits are the leftmost bits, i.e., those bits on the left hand side of the number when it is written in binary format.

^[*]CIDR is pronounced “cider.”

^[†]The addresses used in this book are treated as if they were public network addresses, but they are really private network numbers.

^[*]This table is also called the forwarding table.

^[*]ANDing refers to one way that binary values are manipulated. It means that if a bit is “on” in the first value AND in the second value, the resulting value also has a bit “on” in that location.

^[*]Some routing protocols, such as Open Shortest Path First (OSPF) and BGP, obtain end-to-end routing information. Nevertheless, the packet is still passed to the next-hop router.

^[1]This file and some other TCP/IP configuration files are found in the %SystemRoot%\system32\drivers\etc directory. %SystemRoot% is an environment variable that contains the name of the top-level directory where the operating system files are stored.
