Chapter 14. Troubleshooting network issues

This chapter covers

Using TCP/IP networking to manage network problems
Troubleshooting networks and network interfaces
Managing DHCP connectivity
Configuring DNS for address translation
Troubleshooting inbound network connectivity

When I was a lad, getting new software for a PC meant either writing it yourself or driving to a store and purchasing a box containing a program stored on one or more 5.25” floppy disks. As often as not, remote collaboration required a dot matrix printer and the post office. Streaming videos? Don’t make me laugh. I can’t remember if my first PC even had a modem. If it did, I certainly never used it.

These days, network connectivity is as integral to computing as keyboards and strong coffee. And the way things are going with voice interfaces, like Amazon’s Alexa, it might not be wise to invest too heavily in keyboard manufacturing. (Coffee prospects still look good, though.) The bottom line is that you and the users you support would be pretty helpless without fast and reliable network access.

To deliver fast and reliable access, you’ll need to know how to use network tools and protocols to establish connectivity between your network interfaces and the outside world. And you’ll also need to know how to identify and connect network adapters to your computers so the tools and protocols will have something to work with. We’ll get to that.

But if you’re going to confront the vexing and unpredictable disruptions that can plague your network communication, you’ll first need a solid working knowledge of the basics of the internet protocol suite, often known as the Transmission Control Protocol (TCP) and the Internet Protocol (IP), or TCP/IP for short. Technically, TCP/IP isn’t a Linux topic at all, as the protocols are used universally by all networked devices no matter what OS they’re running. Because the work you’re going to do in this chapter won’t make much sense without taking TCP/IP into account, that’s where we’ll begin. Feel free to skip this section if you’re already comfortable with the material.

14.1. Understanding TCP/IP addressing

A network’s most basic unit is the humble Internet Protocol (IP) address, at least one of which must be assigned to every connected device. Each address must be unique throughout the entire network; otherwise message routing would descend into chaos.

For decades, the standard address format followed the IPv4 protocol: each address is made up of four 8-bit octets for a total of 32 bits. (Don’t worry if you don’t understand how to count in binary.) Each octet must be a number between 0 and 255. Here’s a typical (fake) example:

154.39.230.205

The maximum theoretical number of addresses that can be drawn from the IPv4 pool is over 4 billion (256^4). Once upon a time, that seemed like a lot. But as the internet grew far beyond anyone’s expectations, there clearly weren’t going to be enough unique addresses in the IPv4 pool for all the countless devices seeking to connect.
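To make the 32-bit arithmetic a little more concrete, here’s a minimal sketch of a shell function that converts a dotted address into the single number it represents. The name ip_to_int is my own invention (it’s not a standard tool), and the function assumes four plain decimal octets with no leading zeros.

```shell
# Hypothetical helper: convert a dotted-quad IPv4 address to its 32-bit value.
ip_to_int() {
    # Split the address on dots into four positional parameters.
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    # Each octet supplies 8 bits; shift it into place and combine.
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}
```

Running ip_to_int 154.39.230.205 prints 2586306253, and the largest possible address, 255.255.255.255, comes out as 4294967295 (that’s 256^4 minus 1).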

Four billion possible addresses sounds like a big number until you consider that there are currently more than 1 billion Android smartphones in use; that’s in addition to all the millions of servers, routers, PCs, and laptops, not to mention Apple phones. There’s a good chance your car, refrigerator, and home-security cameras also have their own network-accessible addresses, so something obviously had to give.

Two solutions to the impending collapse of the internet addressing system (and the end of life as we know it) were proposed: IPv6 (an entirely new addressing protocol) and Network Address Translation (NAT). IPv6 provides a much larger pool of addresses, but because it’s still not all that widely deployed, I’ll focus on NAT.

14.1.1. What’s NAT addressing?

The organizing principle behind NAT is brilliant: rather than assign a unique, network-readable address to every one of your devices, why not have all of them share the single public address that’s used by your router? But how will traffic flow to and from your local devices? Through the use of private addresses. And if you want to divide network resources into multiple subgroups, how can everything be effectively managed? Through network segmentation. Clear as mud? Let’s look at how NAT addressing works, to gain a little perspective.

14.1.2. Working with NAT addressing

When a browser on one of the laptops connected to your home WiFi visits a site, it does so using the public IP address that’s been assigned to the DSL modem/router provided by your internet service provider (ISP). Any other devices connecting through the same WiFi network use that same address for all their browsing activity (see figure 14.1).

Figure 14.1. A typical NAT configuration, showing how multiple local devices, each with its own private address, can all be represented by a single public IP address

In most cases, the router uses the Dynamic Host Configuration Protocol (DHCP) to assign unique private (NAT) addresses to each local device, but they’re unique only in the local environment. That way, all local devices can enjoy full, reliable communication with their local peers. This works just as well for large enterprises, many of which use tens of thousands of NAT IP addresses, all behind a single public IP.

Three IPv4 address ranges are set aside (by the RFC 1918 standard) exclusively for private addressing:

10.0.0.0 to 10.255.255.255
172.16.0.0 to 172.31.255.255
192.168.0.0 to 192.168.255.255
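If you’d like a quick way to tell whether a given address falls inside one of those three ranges, something like the following sketch might do. The function name is_private is hypothetical, and the code only looks at the first two octets, which happens to be enough for these particular ranges.

```shell
# Hypothetical helper: succeed (exit 0) if the address is in a private range.
is_private() {
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    case "$1.$2" in
        10.*) return 0 ;;                                 # 10.0.0.0 to 10.255.255.255
        172.1[6-9]|172.2[0-9]|172.3[01]) return 0 ;;      # 172.16.0.0 to 172.31.255.255
        192.168) return 0 ;;                              # 192.168.0.0 to 192.168.255.255
    esac
    return 1
}
```

You could use it as is_private 192.168.1.4 && echo private. An address like 154.39.230.205 would fail the test.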

Local network managers are free to use any and all of those addresses (there are more than 17 million of them) any way they like. But addresses are usually organized into smaller network (or subnet) blocks whose host network is identified by the octets to the left of the address. This leaves octets to the right of the address available for assigning to individual devices.

For example, you might choose to create a subnet on 192.168.1, which would mean all the addresses in this subnet would start with 192.168.1 (the network portion of the address) and end with a unique, single-octet device address between 2 and 254. One PC or laptop on that subnet might therefore get the address 192.168.1.4, and another could get 192.168.1.48.

Note

Following networking conventions, DHCP servers generally don’t assign the numbers 0, 1, and 255 to network devices.

Continuing with that example, you might subsequently want to add a parallel, but separate, network subnet using 192.168.2. In this case, not only are 192.168.1.4 and 192.168.2.4 two separate addresses, available to be assigned to two distinct devices, but because they’re on separate networks, the two might not even have access to each other (see figure 14.2).

Figure 14.2. Devices attached to two separate NAT subnets in the 192.168.x network range

Subnet notation

Because it’s critically important to make sure systems know what kind of subnet a network address is on, we need a standard notation that can accurately communicate which octets are part of the network and which are available for devices. There are two commonly used standards: Classless Inter-Domain Routing (CIDR) notation and netmask. Using CIDR, the first network in the previous example would be represented as 192.168.1.0/24. The /24 tells you that the first three octets (8×3=24) make up the network portion, leaving only the fourth octet for device addresses. The second subnet, in CIDR, would be described as 192.168.2.0/24.

These same two networks could also be described through a netmask of 255.255.255.0. That means all 8 bits of each of the first three octets are used by the network, but none of the fourth.

You don’t have to break up the address blocks exactly this way. If you knew you weren’t likely to ever require many network subnets in your domain, but you anticipated the need to connect more than 255 devices, you could choose to designate only the first two octets (192.168) as network addresses, leaving everything between 192.168.0.0 and 192.168.255.255 for devices. In CIDR notation, this would be represented as 192.168.0.0/16 and have a netmask of 255.255.0.0.

Nor do your network portions need to use complete (8-bit) octets. Part of the range available in a particular octet can be dedicated to addresses used for entire networks, with the remainder left for devices (or hosts, as they’re more commonly called). This way, you could set aside all the bits of the subnet’s first two octets (192 and 168), plus the first four bits of the third octet, as the network portion. This could be represented as 192.168.0.0/20 or with the netmask 255.255.240.0, and would cover every address from 192.168.0.0 through 192.168.15.255 (which includes, for instance, 192.168.14.x).

Where did I get these notation numbers? Most experienced admins use their binary counting skills to work it out for themselves. But for a chapter on general network troubleshooting, that’s a bit out of scope and unnecessary for the normal work you’re likely to encounter. Nevertheless, there are many online subnet calculators that will do the calculation for you.
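If you’re curious, the arithmetic behind those calculators isn’t too scary. Here’s a rough sketch, using two function names I made up (prefix_to_netmask and host_count), that turns a CIDR prefix length into a netmask and estimates how many host addresses the subnet leaves you. It assumes a prefix between 0 and 30 and subtracts two addresses for the network and broadcast addresses.

```shell
# Hypothetical helper: convert a CIDR prefix length (like 24) to a dotted netmask.
prefix_to_netmask() {
    prefix=$1
    mask=""
    for i in 1 2 3 4; do
        if [ "$prefix" -ge 8 ]; then
            # All 8 bits of this octet belong to the network.
            octet=255
            prefix=$((prefix - 8))
        else
            # Only the leftmost $prefix bits of this octet are network bits.
            octet=$(( 256 - (1 << (8 - prefix)) ))
            prefix=0
        fi
        mask="${mask}${mask:+.}${octet}"
    done
    echo "$mask"
}

# Hypothetical helper: usable host addresses for a prefix (network and broadcast excluded).
host_count() {
    echo $(( (1 << (32 - $1)) - 2 ))
}
```

For the examples in this section, prefix_to_netmask 20 prints 255.255.240.0, and host_count 24 prints 254.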

Why would you want to divide your network into subnets? A common scenario involves groups of company assets that need to be accessible to some teams (developers, perhaps), but not others. Keeping them logically separated into their own subnets can be an efficient way to do that.

14.2. Establishing network connectivity

Everyone shows up for work bright and early one Monday morning. They exchange brief but cheerful greetings with each other, sit down at their laptops and workstations all ready for a productive week’s work, and discover the internet can’t be reached. With the possible exception of the cheerful and productive parts, you should expect that this will happen to you one day soon (if it hasn’t already). The source of a network outage could be any of the following:

A hardware or operating system failure on a local machine
Disruption to your physical cabling, routing, or wireless connections
A problem with the local routing software configuration
A breakdown at the ISP level
An entire chunk of the internet itself going down

Your first job will be to narrow the focus of your search by ruling out what’s not relevant. You do that by following a protocol that starts off closest to home, confirming that the fault doesn’t lie within your own local systems, and gradually expanding outward. Figure 14.3 illustrates the process flow.

Figure 14.3. A flow chart illustrating the sequence you might follow when troubleshooting an outbound connectivity problem

Let’s see how all that might work. You’ll begin with fixing problems local computers might have accessing external resources, and then address problems external clients or users might have accessing resources on your servers.

14.3. Troubleshooting outbound connectivity

It’s possible that your computer was never assigned its own IP address, without which it’s impossible to exist as a member in good standing of a network. Run ip to display your network interface devices, and then confirm you’ve got an active external-facing device and that there’s a valid IP associated with it. In the following, the eth0 interface is using 10.0.3.57:

$ ip addr                                                               1
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
        state UNKNOWN group default qlen 1                              2
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo                                      3
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
7: eth0@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue
        state UP group default qlen 1000                                4
    link/ether 00:16:3e:29:8e:87 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.0.3.57/24 brd 10.0.3.255 scope global eth0                  5
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe29:8e87/64 scope link
       valid_lft forever preferred_lft forever

1 The addr argument for ip can also be shortened to a.
2 The loopback (lo) interface through which local (localhost) resources are accessed
3 Note how the IP used by the loopback device is 127.0.0.1. This follows standard networking conventions.
4 The interface is listed as UP.
5 The computer’s current IP address (in this case, a private NAT address) is displayed as the value of inet.

If there’s no IP address listed on the inet line, or there’s no network interface listed altogether, then that’s where you’ll focus your attention.
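When there are a lot of interfaces to wade through, it can help to boil the ip addr output down to just the interface names and their IPv4 addresses. The following is a sketch of one way to do that with awk. The function name list_ipv4 is hypothetical, and it expects output formatted like the listing above to arrive on standard input.

```shell
# Hypothetical helper: read `ip addr` output on stdin and print
# "interface address/prefix" for each IPv4 (inet) address found.
list_ipv4() {
    awk '
        # Lines like "7: eth0@if8: <...>" start a new interface block.
        /^[0-9]+: / { iface = $2; sub(/:$/, "", iface); sub(/@.*/, "", iface) }
        # "inet" lines carry IPv4 addresses; "inet6" lines are skipped.
        $1 == "inet" { print iface, $2 }
    '
}
```

You’d run it as ip addr | list_ipv4. An interface that’s missing from the results has no IPv4 address assigned.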

14.3.1. Tracking down the status of your network

First, confirm that you’ve got a physical network adapter (also called a network interface card, or NIC) installed on your computer and that Linux sees it. You can list all the PCI-based hardware currently installed using lspci. In the following output, lspci found a PCI Express Gigabit Ethernet Controller:

$ lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD]
    Family 15h (Models 10h-1fh) Processor Root Complex
[...]
01:00.0 Ethernet controller:
    Realtek Semiconductor Co., Ltd. RTL8111/8168/8411
    PCI Express Gigabit Ethernet Controller (rev 06)        1

1 The term ethernet controller refers to a hardware network interface device.

If lspci returns no NICs, you should consider the possibility that you’ve had some kind of hardware failure.

Note

The Peripheral Component Interconnect (PCI) is a hardware standard used to allow peripheral devices to connect to the microprocessors on computer motherboards through the PCI bus. Various newer standards, like PCI Express (PCIe), also exist, each using its own unique form factor to physically connect to a motherboard.

Besides lspci, you can also use the lshw tool to display the networking hardware your system knows about. By itself, lshw returns a complete hardware profile, but lshw -class network will show you only the subset of that profile that relates to networking. Try it.

A positive result from lspci won’t, by itself, get you too far, because it doesn’t tell you how the device can be accessed from the command line. But it does give you some important information. Take, say, the word Ethernet from the lspci output and use it with grep to search the output of dmesg. As you might remember from chapter 11, dmesg is a record of kernel-related events involving devices. After some trial and error, I discovered that this particular search will work best by including the two dmesg lines immediately following the line containing my search string (using -A 2):

$ dmesg | grep -A 2 Ethernet
[    1.095265] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[    1.095840] r8169 0000:01:00.0 eth0<1>: RTL8168evl/8111evl
        at 0xffffc90000cfa000, 74:d4:35:5d:4c:a5, XID 0c900800 IRQ 36
[    1.095842] r8169 0000:01:00.0
        eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]   1

1 Device designation eth0 is shown as associated with the Gigabit Ethernet device.

Success! You can see that the device was given the eth0 designation. Hold on. Not so fast. Even though eth0 was originally given to the device, because Linux now uses predictable interface names (refer back to chapter 10), it might not be the designation the interface is actually using. Just to be safe, you’ll want to search dmesg once again to see if eth0 shows up anywhere else:

$ dmesg | grep eth0
[    1.095840] r8169 0000:01:00.0
        eth0: RTL8168evl/8111evl at 0xffffc90000cfa000, 74:d4:35:5d:4c:a5,
        XID 0c900800 IRQ 36
[    1.095842] r8169 0000:01:00.0
        eth0: jumbo features [frames: 9200 bytes, tx checksumming: ko]
[    1.129735] r8169 0000:01:00.0 enp1s0:
        renamed from eth0                         1

1 The eth0 designation was dropped and replaced with enp1s0.

Aha. It seems that at some point in the boot process the device was renamed enp1s0. OK. You’ve got a properly configured network interface, but still no IP address and still no network connectivity. What’s next? dhclient, but first some background.

14.3.2. Assigning IP addresses

Network devices can get their IP addresses in these ways:

Someone manually sets a static address that (hopefully) falls within the address range of the local network.
A DHCP server automatically gives the device an unused address.

As is usually the case, each approach has its trade-offs. DHCP servers do their work automatically and invisibly, and guarantee that two managed devices are never trying to use the same address. But, on the other hand, those addresses are dynamic, meaning the addresses they’re using one day might not be the ones they get the next. With that in mind, if you’ve been successfully using, say, 192.168.1.34 to SSH into a remote server, be prepared to accommodate unexpected changes.

Conversely, setting the IPs manually ensures that those addresses are permanently associated with their devices. But there’s always the chance that you may cause addressing conflicts—with unpredictable results. As a rule, unless you have a specific need for a static address—perhaps you need to reliably access a resource remotely using its address—I’d go with DHCP.

Defining a network route

Before looking for an address, you’ll need to make sure Linux knows how to find the network in the first place. If Linux can already see its way through to a working network, then ip route will show you your computer’s routing table, including the local network and the IP address of the device that you’ll use as a gateway router:

$ ip route
default via 192.168.1.1                            1
    dev enp0s3 proto static metric 100
192.168.1.0/24 dev enp0s3 proto kernel scope
    link src 192.168.1.22 metric 100               2

1 Address of the gateway router through which the local computer will access the wider network
2 The local NAT network (192.168.1.x) and its netmask (/24)
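If you find yourself checking routing tables often, you might want to pull out just the gateway address. Here’s a minimal sketch that does that. The function name default_gateway is hypothetical, and it reads ip route output from standard input.

```shell
# Hypothetical helper: read `ip route` output on stdin and print the
# address of the default gateway. Exits non-zero if no default route exists.
default_gateway() {
    awk '
        $1 == "default" && $2 == "via" { print $3; found = 1; exit }
        END { exit !found }
    '
}
```

Running ip route | default_gateway against the table shown above would print 192.168.1.1. No output (and a non-zero exit status) suggests that no default route has been defined.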

If a working route isn’t listed, then you’ll need to create one, but you’ll have to figure out the subnet range of your local network first. If there are other computers using the same network, check out their IP addresses. If, say, one of those computers is using 192.168.1.34, then the odds are that the router’s address will be 192.168.1.1. Similarly, if the IP of that connected computer is 10.0.0.45, then the router’s address would be 10.0.0.1. You get the picture. Based on that, here’s the ip command to create a new default route to your gateway:

# ip route add default via 192.168.1.1 dev eth0

Note

The ip commands discussed in this chapter are relatively new and are meant to replace now-deprecated command sets like ifconfig, route, and ifupdown. You’ll still see plenty of how-to guides focusing on those old commands, and, for now at least, they’ll still work, but you should get used to using ip.

Requesting a dynamic address

The best way to request a DHCP address is to use dhclient to search for a DHCP server on your network and then request a dynamic address. Here’s how that might look, assuming your external network interface is called enp0s3:

# dhclient enp0s3
Listening on LPF/enp0s3/08:00:27:9c:1d:67
Sending on   LPF/enp0s3/08:00:27:9c:1d:67
Sending on   Socket/fallback
DHCPDISCOVER on enp0s3 to 255.255.255.255
   port 67 interval 3 (xid=0xf8aa3055)
DHCPREQUEST of 192.168.1.23 on enp0s3 to 255.255.255.255
   port 67 (xid=0x5530aaf8)
DHCPOFFER of 192.168.1.23 from 192.168.1.1             1
DHCPACK of 192.168.1.23 from 192.168.1.1
RTNETLINK answers: File exists
bound to 192.168.1.23 -- renewal in 34443 seconds.     2

1 The address of the DHCP server in this case is 192.168.1.1.
2 The new address is successfully leased for a set time; renewal will be automatic.

Configuring a static address

You can temporarily give an interface a static IP from the command line using ip, but that will only survive until the next system boot. Bearing that in mind, here’s how it’s done:

# ip addr add 192.168.1.10/24 dev eth0

That’s great for quick and dirty one-off configurations, perhaps trying to get connectivity on a stricken system while troubleshooting. But the odds are that you’ll normally prefer to make your edits permanent. On Ubuntu machines, that’ll require some editing of the /etc/network/interfaces file. The file may already contain a section defining your interface as DHCP rather than static.

Listing 14.1. A section in the /etc/network/interfaces file

auto enp0s3
iface enp0s3 inet dhcp

You’ll edit that section, changing dhcp to static, entering the IP address you want it to have, the netmask (in x.x.x.x format), and the IP address of the network gateway (router) that the computer will use. Here’s an example:

auto enp0s3
iface enp0s3 inet static
    address 192.168.1.10
    netmask 255.255.255.0
    gateway 192.168.1.1

On CentOS, each interface will have its own configuration file in the /etc/sysconfig/network-scripts/ directory. A typical interface set for DHCP addressing will look as shown in the next listing.

Listing 14.2. Configurations in /etc/sysconfig/network-scripts/ifcfg-enp0s3

TYPE="Ethernet"
BOOTPROTO="dhcp"            1
DEFROUTE="yes"
PEERDNS="yes"
PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_PEERDNS="yes"
IPV6_PEERROUTES="yes"
IPV6_FAILURE_FATAL="no"
NAME="enp0s3"
UUID="007dbb43-7335-4571-b193-b057c980f8d0"
DEVICE="enp0s3"
ONBOOT="yes"

1 Tells Linux to request a dynamic IP for the interface

The next listing shows how that file might look once you’ve edited it to allow static addressing.

Listing 14.3. The static version of a CentOS interface configuration file

BOOTPROTO=none            1
NETMASK=255.255.255.0
IPADDR=10.0.2.10          2
USERCTL=no
DEFROUTE="yes"
PEERDNS="yes"
PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_PEERDNS="yes"
IPV6_PEERROUTES="yes"
IPV6_FAILURE_FATAL="no"
NAME="enp0s3"
UUID="007dbb43-7335-4571-b193-b057c980f8d0"
DEVICE="enp0s3"
ONBOOT="yes"

1 DHCP addressing won’t be used.
2 Sets the static IP address you want to use

If you want your settings to take effect immediately, you’ll need to restart networking. Most of the time, networking on modern systems is managed by the systemd service, NetworkManager. On Ubuntu at least, however, starting or stopping interfaces that are defined in the /etc/network/interfaces file is handled by the networking service instead. Therefore, if you want to apply the newly edited settings in the interfaces file, you’ll run systemctl restart networking rather than systemctl restart NetworkManager. Alternatively, you could use ip to bring just one interface up (or down):

# ip link set dev enp0s3 up

It can’t hurt to know about some of the places on your system that NetworkManager hides its working files. There’s a configuration file called NetworkManager.conf in the /etc/NetworkManager/ directory, configuration files for each of the network connections your computer has made historically in /etc/NetworkManager/system-connections/, and data detailing your computer’s historical DHCP connections in /var/lib/NetworkManager/. Why not take a quick look through each of those resources?

14.3.3. Configuring DNS service

If you’ve got a valid network route and an IP address, but the connectivity problem hasn’t gone away, then you’ll have to cast your net a bit wider. Think for a moment about exactly what it is that you’re not able to do.

Is your web browser unable to load pages? (I don’t know, perhaps like bootstrap-it.com, if you’re looking for a great example.) It could be that you haven’t got connectivity. It could also mean that there isn’t any DNS translation happening.

What’s DNS?

It may not look it, but the World Wide Web is really all about numbers. There’s no place called manning.com or wikipedia.org. Rather, they’re 35.166.24.88 and 208.80.154.224, respectively. The software that does all the work connecting us to the websites we know and love recognizes only numeric IP addresses.

The tool that translates back and forth between text-loving humans and our more digitally oriented machines is called the domain name system (DNS). Domain is a word often used to describe a distinct group of networked resources, in particular, resources identified by a unique human-readable name. As shown in figure 14.4, when you enter a text address in your browser, the services of a DNS server will be sought.

Figure 14.4. DNS address query for stuff.com and the reply containing a (fictional) IP address

How does DNS work?

The first stop is usually a local index of names and their associated IP addresses, stored in a file that’s automatically created by the OS on your computer. If that local index has no answer for this particular translation question, the request is forwarded to a designated DNS server that maintains a much more complete index and can return the address of the site you’re after. Well-known public DNS servers include those provided by Google, which uses the deliciously simple 8.8.8.8 and 8.8.4.4 addresses, and OpenDNS.

Fixing DNS

Until something breaks, you normally won’t spend a lot of time thinking about DNS servers. But I’m afraid something might just have broken. You can confirm the problem using the ping tool. If pinging a normal domain name (like manning.com) doesn’t work, but pinging an IP address does, then you’ve found your trouble. Here’s how that might look:

$ ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=60 time=10.3 ms
64 bytes from 8.8.8.8: icmp_seq=2 ttl=60 time=10.2 ms
64 bytes from 8.8.8.8: icmp_seq=3 ttl=60 time=9.33 ms
^C                                                                1
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 9.339/10.002/10.378/0.470 ms

1 This symbol tells you that the Ctrl-c key combination was used to interrupt the ping operation.
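The two-ping test can be wrapped up in a small script if you expect to repeat it. What follows is only a sketch: the function names classify_fault and check_dns are my own, 8.8.8.8 is used as the known-good IP address simply because it appears in the example above, and the -c and -W options assume the Linux version of ping.

```shell
# Hypothetical helper: interpret two exit statuses (0 means success),
# the first from pinging an IP address, the second from pinging a name.
classify_fault() {
    ip_status=$1
    name_status=$2
    if [ "$ip_status" -ne 0 ]; then
        echo "no connectivity"      # Even a raw IP address is unreachable.
    elif [ "$name_status" -ne 0 ]; then
        echo "dns failure"          # IP works, name does not: suspect DNS.
    else
        echo "ok"
    fi
}

# Hypothetical wrapper: ping a known IP, then the domain name passed as $1.
check_dns() {
    ping -c 3 -W 2 8.8.8.8 >/dev/null 2>&1; ip_status=$?
    ping -c 3 -W 2 "$1" >/dev/null 2>&1; name_status=$?
    classify_fault "$ip_status" "$name_status"
}
```

Calling check_dns manning.com would print dns failure in the scenario just described, where the address responds but the name doesn’t.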

The fix? That depends. A lot of the time, individual computers will inherit the DNS settings from the router through which they connect to the wider network. Which, I guess, means that you’ll be spending the next couple of minutes searching through drawers to recover your router’s login password (hint: the default password is often printed on the router case itself).

Once you do manage to log in, usually from a connected PC that’s running a browser pointed to the router’s IP address, work through the GUI menus of the router’s OS to make sure the DNS settings are valid. You can also configure DNS settings on your local computer that will override what’s on the router. On a CentOS machine, add references to a couple of public DNS servers to your interface’s /etc/sysconfig/network-scripts/ifcfg-enp0s3 file. This example uses the two IPs used by Google’s DNS servers:

DNS1=8.8.8.8
DNS2=8.8.4.4

And on Ubuntu, add a dns-nameservers line, listing the servers separated by spaces, to the appropriate interface section in the /etc/network/interfaces file:

dns-nameservers 8.8.8.8 8.8.4.4

14.3.4. Plumbing

Yep. It’s roll-up-your-sleeves-and-pull-out-the-drain-snake time. If you’ve got a working interface, a route, an IP address, and DNS service, and you still don’t have full connectivity, then there’s got to be something out there blocking the flow.

The Linux equivalent of a drain snake, especially the fancy kind that comes with a video camera at the end, is Traceroute—which, as advertised, traces the route a packet takes across the network on its way to its target. If there’s anything blocking traffic anywhere down the line, Traceroute will at least show you where the clog is. Even if you’re in no position to investigate further, the information could be particularly valuable as your ISP tries to get things going again.

This example shows a successful end-to-end trip between my home workstation and google.com (represented by 172.217.0.238). If anything had gone wrong, the hops displayed would have stopped before reaching the goal. Lines of output containing nothing but asterisks (*) might sometimes represent packets failing to make it back. A complete failure will usually be accompanied by error messages:

$ traceroute google.com
traceroute to google.com (172.217.0.238), 30 hops max, 60 byte packets
 1  ControlPanel.Home (192.168.1.1)
        21.173 ms  21.733 ms  23.081 ms                              1
 2  dsl-173-206-64-1.tor.primus.ca (173.206.64.1)
        25.550 ms  27.360 ms  27.865 ms                              2
 3  10.201.117.22 (10.201.117.22)  31.185 ms  32.027 ms  32.749 ms
 4  74.125.48.46 (74.125.48.46)  26.546 ms  28.613 ms  28.947 ms
 5  108.170.250.241 (108.170.250.241)  29.820 ms  30.235 ms  33.190 ms
 6  108.170.226.217 (108.170.226.217)
        33.905 ms 108.170.226.219 (108.170.226.219)  10.716 ms  11.156 ms
 7  yyz10s03-in-f14.1e100.net (172.217.0.238)  12.364 ms *  6.315 ms

1 The first hop is my local router; the ~20 ms hop times displayed are a bit slow, but acceptable.
2 My internet service provider

Still nothing? Sounds like a good time to put in a phone call to your ISP.

Coming up next: what happens when people within your local network can access everything the big, bad internet has to offer, but your remote workers, clients, and visitors can’t get in to consume the services you offer.

14.4. Troubleshooting inbound connectivity

Whether it’s your company’s website, an API supporting your app, or an internal documentation wiki, there are parts of your infrastructure that you’ll want to be available 24/7. Those kinds of inbound connectivity can be as important to your business or organization as the outgoing stuff we’ve just discussed.

If your remote clients can’t connect to your services or if their connections are too slow, your business will suffer. Therefore, you’ll want to regularly confirm that your application is healthy and listening for incoming requests, that those requests have the access they need, and that there’s enough bandwidth to handle all the traffic. netstat and netcat can help with that.

14.4.1. Internal connection scanning: netstat

Running netstat on a server displays a wide range of network and interface statistics. What would interest you most when faced with a screaming horde of angry web clients, however, is a list of the services that are listening for network requests.

netstat -l will show you all the sockets that are currently listening for connections. If it’s a website you’re running, then you can narrow down the results by filtering for http. In this case, both ports 80 (http) and 443 (https) appear to be active:

$ netstat -l | grep http
tcp6 0  0 [::]:http   [::]:*  LISTEN            1
tcp6 0  0 [::]:https   [::]:*  LISTEN

1 The protocol is shown as tcp6, suggesting that this is exclusively an IPv6 service. In fact, it covers both IPv6 and IPv4.

What exactly is a network socket?

To be honest, I’m not 100% sure how to describe it. What it would mean to a C programmer might feel strange to a simple system administrator, like your humble servant. Nevertheless, I’ll risk oversimplification and say that a service endpoint is defined by the server’s IP address and the port (192.168.1.23:80, for instance). That combination identifies the network socket. A connection is created during a session involving two endpoints/sockets (a client and a server).

netstat -i will list your network interfaces. On the surface, that wouldn’t seem like such a big deal; after all, ip addr will do that too, right? Ah, yes. But netstat will also show you how many data packets have been received (RX) and transmitted (TX). OK indicates error-free transfers; ERR, damaged packets; and DRP, packets that were dropped. These statistics can be helpful when you’re not sure an interface is active:

$ netstat -i
Kernel Interface table
Iface             MTU Met   RX-OK RX-ERR RX-DRP RX-OVR  TX-OK TX-ERR Flg
enp1s0           1500   0       0      0      0      0      0      0 BMU
lo              65536   0   16062      0      0      0  16062      0 LRU
wlx9cefd5fe6a19  1500   0 1001876      0      0      0 623247      0 BMRU

That example appears to show a healthy and busy wireless interface (with the unfortunate name wlx9cefd5fe6a19), and an interface called enp1s0 that’s inactive. What’s going on? It’s a PC with an unused ethernet port that gets its internet via WiFi. In the output, lo is the loopback (localhost) interface, which carries the address 127.0.0.1. This is a great way to assess things from within the server, but how do things look from outside?
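Reading those counters by eye gets old quickly on a machine with many interfaces. Here's a rough sketch of how the same judgment could be scripted. summarize_interfaces is a hypothetical helper, and its three labels (idle, errors, ok) are my own simplification rather than anything netstat reports. Columns are located by header name because the exact set of columns varies between netstat versions.

```shell
# Hypothetical helper: summarize `netstat -i` style text from standard input.
# Prints one "INTERFACE STATUS" line per interface.
summarize_interfaces() {
  awk '$1 == "Iface" { for (i = 1; i <= NF; i++) col[$i] = i; next }
       !("RX-OK" in col) { next }            # skip lines before the header
       {
         status = "ok"
         if ($(col["RX-OK"]) == 0 && $(col["TX-OK"]) == 0)
           status = "idle"                   # no packets in either direction
         else if ($(col["RX-ERR"]) + $(col["RX-DRP"]) + $(col["TX-ERR"]) > 0)
           status = "errors"                 # damaged or dropped packets seen
         print $1, status
       }'
}

# The output shown above, reused as sample input.
sample='Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR Flg
enp1s0 1500 0 0 0 0 0 0 0 BMU
lo 65536 0 16062 0 0 0 16062 0 LRU
wlx9cefd5fe6a19 1500 0 1001876 0 0 0 623247 0 BMRU'

printf '%s\n' "$sample" | summarize_interfaces
```

Against a live system, netstat -i | summarize_interfaces would do the same job.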

14.4.2. External connection scanning: netcat

You’ve used cat to stream text files and zcat to stream the contents of compressed archives. Now it’s time to meet another member of the (feline) family: netcat (often invoked as nc). As you might guess from its name, netcat can be used to stream files across networks, or even to serve as a simple two-way chat app.

But right now you’re more interested in the status of your server and, in particular, how a client will see it. nc, when run against a remote address, tells you whether it was able to make a connection. -z puts netcat into scan mode, where it only reports whether a daemon is listening on each port (rather than opening a session and exchanging data), and -v adds verbosity to the output. You’ll need to specify the port or ports you want to scan. Here’s an example:

$ nc -z -v bootstrap-it.com 443 80
Connection to bootstrap-it.com 443 port [tcp/https] succeeded!
Connection to bootstrap-it.com 80 port [tcp/http] succeeded!

If either or both of those services (HTTP and HTTPS) were not available, the scan would fail. That could be because the service isn’t running on the server (perhaps your Apache web server has stopped) or there’s an overly strict firewall rule blocking access. This is how a failed scan would look:

$ nc -z -v bootstrap-it.com 80
nc: connect to bootstrap-it.com port 80 (tcp) failed: Connection timed out
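If you find yourself running that scan regularly, you could wrap it in a small script. The sketch below is one way to do it, under a couple of assumptions: the function names are hypothetical, the -w 3 timeout (three seconds) is my addition and isn't used in the examples above, and your version of netcat is assumed to return a nonzero exit status when a connection fails, as the common implementations do.

```shell
# probe_port HOST PORT: succeed if a TCP connection to HOST:PORT can be opened.
# -w 3 is an assumed timeout so that a filtered port doesn't hang the script.
probe_port() {
  nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# check_ports HOST PORT...: print "PORT open" or "PORT unreachable" for each
# port and return nonzero if any port could not be reached.
check_ports() {
  host=$1
  shift
  failed=0
  for port in "$@"; do
    if probe_port "$host" "$port"; then
      echo "$port open"
    else
      echo "$port unreachable"
      failed=1
    fi
  done
  return "$failed"
}
```

Running check_ports bootstrap-it.com 443 80 would then print one line per port, and the exit status makes the function easy to use from cron jobs or monitoring scripts.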

This is Linux, however, so you can be sure there’s more than one good way to get this job done. Therefore, be aware that nmap can be used to perform a similar scan:

$ nmap -sT -p80 bootstrap-it.com
Nmap scan report for bootstrap-it.com (52.3.203.146)
Host is up (0.036s latency).
PORT   STATE SERVICE
80/tcp open  http
Nmap done: 1 IP address (1 host up) scanned in 0.37 seconds

And this nmap command will scan for any open ports between ports 1 and 1023, an excellent way to quickly audit your system to make sure there’s nothing open that shouldn’t be:

$ nmap -sT -p1-1023 bootstrap-it.com
Nmap scan report for bootstrap-it.com (52.3.203.146)
Host is up (0.038s latency).
Not shown: 1020 filtered ports
PORT    STATE SERVICE
80/tcp  open  http
443/tcp open  https
Nmap done: 1 IP address (1 host up) scanned in 4.69 seconds

Which ports “should be” open? That depends on the software you’re running on the server. As a rule of thumb, if nmap reports any unfamiliar open ports, search online to find out what software uses those ports, and then ask yourself whether it’s reasonable for that software to be running on your server.
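That rule of thumb is easy to automate once you've decided which ports belong. Here's a minimal sketch: unexpected_open_ports is a hypothetical helper that reads nmap's normal text output and prints any open TCP port missing from an allow list you supply. The sample is based on the scan above, with an invented telnet line added so there's something to catch.

```shell
# Hypothetical helper: read nmap's normal text output on standard input and
# print every open TCP port that is not in the space-separated allow list
# passed as the first argument.
unexpected_open_ports() {
  awk -v allowed="$1" '
    BEGIN { n = split(allowed, list, " "); for (i = 1; i <= n; i++) ok[list[i]] = 1 }
    $1 ~ /^[0-9]+\/tcp$/ && $2 == "open" {
      split($1, parts, "/")                  # "80/tcp" -> "80", "tcp"
      if (!(parts[1] in ok)) print parts[1]
    }'
}

# Sample scan output; the 23/tcp line is invented for illustration.
sample='Nmap scan report for bootstrap-it.com (52.3.203.146)
Host is up (0.038s latency).
PORT    STATE SERVICE
23/tcp  open  telnet
80/tcp  open  http
443/tcp open  https'

printf '%s\n' "$sample" | unexpected_open_ports "80 443"
```

With a real scan, you'd pipe nmap straight in: nmap -sT -p1-1023 bootstrap-it.com | unexpected_open_ports "80 443". No output means nothing unexpected was found.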

Summary

In practice, Linux network interfaces and routes are usually defined and managed within the context of NAT networking.
Linux needs to recognize attached hardware peripherals like network interfaces and assign them device labels (like eth0) before they’re usable.
Custom static IP addresses can be assigned to a device both through editing configuration files and from the command line (using ip). Dynamic addresses can be automatically requested from DHCP servers, but you can’t control the addresses you get.
Confirming that appropriate local services are accessible for remote clients involves scanning for open sockets and ports.

Key terms

TCP/IP refers to the Transmission Control Protocol and the Internet Protocol, the conventions that define how network behavior is administered.
Public-facing IP addresses must be globally unique, whereas NAT addresses need to be unique only within their local network. The Dynamic Host Configuration Protocol (DHCP) is commonly used to manage dynamic (nonpermanent) address assignment.
A network route is the address of a gateway router through which a computer gains network access.
The Domain Name System (DNS) provides translations between numeric IP addresses and human-readable domain names, allowing convenient navigation of internet resources.
A network socket is the representation of an IP address and a port through which a network connection can be activated.

Security best practices

It’s good to periodically use a tool like nmap to audit your system for inappropriately open ports.

Command-line review

ip addr lists the active interfaces on a Linux system. You can shorten it to ip a or lengthen it to ip address. It’s your choice.
lspci lists the PCI devices currently connected to your computer.
dmesg | grep -A 2 Ethernet searches the dmesg logs for references to the string Ethernet and displays references along with the subsequent two lines of output.
ip route add default via 192.168.1.1 dev eth0 manually sets a new network route for a computer.
dhclient enp0s3 requests a dynamic (DHCP) IP address for the enp0s3 interface.
ip addr add 192.168.1.10/24 dev eth0 assigns a static IP address to the eth0 interface, which won’t persist past the next system restart.
ip link set dev enp0s3 up starts the enp0s3 interface (useful after editing the configuration).
netstat -l | grep http scans a local machine for a web service listening on port 80 (http) or 443 (https).
nc -z -v bootstrap-it.com 443 80 scans a remote web site for services listening on the ports 443 or 80.

Test yourself

1. Which of the following is a valid NAT IP address?
a. 11.0.0.23
b. 72.10.4.9
c. 192.168.240.98
d. 198.162.240.98

2. How would you describe an IPv4 network subnet using two octets for network addresses, both with CIDR and netmask notation?
a. x.x.x.x/16 or 255.255.0.0
b. x.x.x.x/24 or 255.255.255.0
c. x.x.x.x/16 or 255.0.0.0
d. x.x.x.x/16 or 255.255.240.0

3. Which of the following commands will help you discover the designation given by Linux to a network interface?
a. dmesg
b. lspci
c. lshw -class network
d. dhclient

4. You’re setting up a PC in your office and want it to have reliable network connectivity. Which of the following profiles will work best?
a. Dynamic IP address connected directly to the internet
b. Static IP address that’s part of a NAT network
c. Static IP address connected directly to the internet
d. Dynamic IP address that’s part of a NAT network

5. Which of the following commands is used to request a dynamic IP address?
a. ip route
b. dhclient enp0s3
c. ip client enp0s3
d. ip client localhost

6. Which file would you edit to configure a network interface named enp0s3 on a CentOS machine?
a. /etc/sysconfig/networking/ipcfg-enp0s3
b. /etc/sysconfig/network-scripts/ipcfg-enp0s3
c. /etc/sysconfig/network-scripts/enp0s3
d. /etc/sysconfig/network-scripts/ifcfg-enp0s3

7. What line would you add to a network interface configuration section of the /etc/network/interfaces file on an Ubuntu machine to force the interface to use a Google DNS name server?
a. DNS1=8.8.8.8
b. dns-nameserver 8.8.8.8
c. nameserver 8.8.8.8
d. dns-nameserver1 8.8.8.8

8. Which of the following will scan the well-known TCP ports on a remote server for accessible, listening services?
a. nmap -s -p1-1023 bootstrap-it.com
b. nmap -sU -p80 bootstrap-it.com
c. nmap -sT -p1-1023 bootstrap-it.com
d. nc -z -v bootstrap-it.com

Answer key

1. c

2. a

3. a

4. d

5. b

6. d

7. b

8. c
