6
Location Awareness

He sits in the dim glow of his laptop screen, knowing he is more than half a world away from the system he is really working on. It's late at night and the world outside is blanketed by darkness. He moves carefully on the system because, while it's late at night where he is and dark, it would be light and into the business day on the system he is connected to. Fortunately, he isn't directly connected to the system on the other end. Instead, he has bounced through a couple of intermediate systems. He knows that even if someone were watching, having those additional hops in between will make it harder to track him down.

The time difference is something that he always has to factor in to make sure he isn't being too noisy while the legitimate user of the system is trying to use it. If he is using too much network or too much disk, that may get noticed because it will cause performance problems and the user may well take notice of the changes on the system. As a result, he always has to be aware of where the system he has compromised is. He has a number of ways to know this but the easiest is just checking the time zone setting on the system. This isn't always accurate, however, since some servers use Greenwich Mean Time (GMT) as their time zone to be able to line up log files across an organization into a consistent timeline. It will also only give him a region and not a specific location, though he doesn't need that so much.

Systems can identify where they are located in a number of ways. Some of this information is available from the network, and can be as simple as just a time zone from a DHCP server. However, smartphone applications that became reliant on global positioning systems (GPS) to obtain a location have driven a need for devices to get locations in other ways. While mobile applications can acquire location information, they are not alone in this capability. If you visit particular websites, you may notice that your web browser asks if you want to provide a location to the website. Your computer may use different strategies to acquire the location information and provide it to the server that is requesting it.

As more systems become mobile, whether they were designed to be permanently mobile like smartphones and tablets, or whether they are just sometimes mobile like a laptop, the device needs to be more aware of where it is in time and space. There are a number of reasons for this. One reason is that many applications want to know where you are in order to provide more accurate information. Not all devices have global positioning systems (GPS), however, so to provide the same level of service, there needed to be a means that would allow systems without GPS to know where they are.

Although databases are available that track information related to WiFi networks in order to provide location-based services, other ways exist to get information about where a system may be located. As a starting point, just knowing what public Internet Protocol (IP) address is being used can provide information about the location of the system. You can get this information in different ways with varying levels of accuracy.

Time Zones

When it comes to computers, time is relative. Every computer can be configured to know what time zone it is in. This allows computers around the world to correlate events across multiple systems because their timestamps can place events in a consistent timeline. A time zone is a recognition that the Earth is a sphere that rotates in space, providing us with a way to measure the passage of time. Because it's a rotating sphere, different parts of the globe are at different times of the day. This is because we use the sun's position in the sky to calculate time. When the sun is directly overhead, more or less, we consider this to be noon. Since the sun is more or less directly overhead at different moments (it would be directly overhead for me on the East Coast when it is nowhere near to being overhead in Los Angeles, for example), we use time zones so that time appears to be normalized: noon in each zone is when the sun is essentially overhead there.

The origin or reference time zone is based on the observatory in Greenwich, England. In the 1800s, in light of the importance of the Greenwich Observatory to astronomy and navigation, the prime meridian 0 was established to run through Greenwich. This means that the line of longitude with a degree of 0 is the line of longitude that runs through Greenwich. Every other line of longitude is calculated mathematically based on an origin of that prime meridian.

It's necessary to keep time zones in mind as you are working with any piece of information that has a timestamp. You need to know the time zone the system is in so you can create a coherent understanding of when events happened. Coincidentally, if you are told the time zone, you have a better understanding of where the system is. This is not a guarantee, however, because many systems are configured not to provide that information in their network communications. As an example, Listing 6-1 shows a set of HTTP headers with a timestamp that shows that the time is set to be GMT, or Greenwich Mean Time.
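To make the idea of a coherent timeline concrete, here is a minimal Python sketch that parses an HTTP Date header value and converts a local log timestamp to the same reference so the two can be compared. The header value and the five-hour offset are illustrative assumptions, not values taken from Listing 6-1, and the function names are my own.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def parse_http_date(value: str) -> datetime:
    """Parse an HTTP Date header value into a timezone-aware datetime."""
    return parsedate_to_datetime(value)


def to_utc(local: datetime, utc_offset_hours: float) -> datetime:
    """Attach a fixed UTC offset to a naive local timestamp and convert it to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=tz).astimezone(timezone.utc)


# Illustrative header value (assumption), expressed in GMT.
server_time = parse_http_date("Tue, 15 Nov 1994 08:12:31 GMT")

# A log entry from a host assumed to be five hours behind GMT.
log_time = to_utc(datetime(1994, 11, 15, 3, 12, 31), -5)

# Once both are expressed relative to GMT they can be compared directly.
print(server_time.isoformat(), log_time.isoformat(), server_time == log_time)
```

A fixed offset is used here to keep the sketch simple. A real investigation would more likely use named zones so that daylight saving time changes are handled for you.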

Operating systems handle time zones in different ways. In a Linux system, for example, there may be a file in the /etc directory that points to a file providing specific details about the time zone. You can see in Listing 6-2 that the /etc/localtime file points to a different file altogether, indicating that this system is on the East Coast. Not all Unix-like operating systems will use links to point to the zone file. Some will use a copy of the zone file to stand for the /etc/localtime file. While the time zone suggests it's in New York, New York is just one of the cities that has been designated to indicate what time zone the system is in. The properties of the location “New York” convey to the system that it is in the East Coast time zone and that it observes daylight saving time. Although you can set the time zone using the graphical user interface components, ultimately what is happening is the time zone is set using the /etc/localtime file.
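As a rough sketch of how you might read that setting programmatically, the Python below resolves the link and pulls the zone name out of the target path. It assumes the target contains a zoneinfo/ directory component, as in the common /usr/share/zoneinfo layout. The function names are hypothetical, and on systems where /etc/localtime is a copy rather than a link the sketch simply reports nothing.

```python
import os
from typing import Optional

ZONEINFO_MARKER = "zoneinfo/"  # assumed directory component in the link target


def zone_from_target(target: str) -> Optional[str]:
    """Return the zone name from a path such as
    /usr/share/zoneinfo/America/New_York, or None if it does not look like one."""
    idx = target.rfind(ZONEINFO_MARKER)
    if idx == -1:
        return None
    return target[idx + len(ZONEINFO_MARKER):] or None


def configured_zone(path: str = "/etc/localtime") -> Optional[str]:
    """Resolve the link at path; returns None when the file is a copy, not a link."""
    if not os.path.islink(path):
        return None
    return zone_from_target(os.path.realpath(path))


print(zone_from_target("/usr/share/zoneinfo/America/New_York"))
print(configured_zone())  # depends on the machine this runs on
```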

The process is different on a Windows system, but just as with Linux, everything related to time is relative to where you are in the world in relation to Greenwich Mean Time. In Figure 6.1, you can see a partial list of the time zones that are available to be configured in Windows. According to documentation at Microsoft's Developer's Network, 75 possible time zones can be configured on a Windows system.

image

Figure 6.1: Windows time zones.

Unlike Linux systems where configuration files are typically stored in plaintext files in the /etc directory, Windows systems store their configuration in the registry. As you can see in Figure 6.2, the time zone setting on a Windows system is stored by name in the registry. The key holding this information is HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\TimeZoneInformation.

image

Figure 6.2: Windows registry time zone settings.

Time zones are useful to know about and they can provide some general direction about where systems are located. The challenge, though, is that time zones are not always that reliable. Any user can set any time zone on their system. Additionally, when laptops or other mobile devices move around, the time zone typically remains unchanged, unless the user dislikes the clock on her computer being wrong for the duration of her stay in a different location. Some protocols will include the time zone that has been configured on the server sending the information. However, since this is configured, it may not provide an accurate physical location. What you have is whatever location has been configured on the server.

Using whois

The Internet registries can also provide a large amount of location-related information about IP addresses. When blocks of IP addresses are allocated, the information about the new owner is registered with one of the regional Internet registries. The same is true with domain names and other identifying information related to the Internet. Using this information can also help to provide location information, though, as suggested below, it may not be sufficient. One of the challenges with using the Internet registries is that IP address blocks are generally registered to a company and while the company's business information, including address and phone number potentially, may be available in the registry, there is no guarantee that the IP address you have identified is located at the address provided. Large companies commonly have a headquarters and a number of other locations. The IP address would probably be registered to the headquarters, and the physical address of the corporate headquarters is what you will be able to identify.

The IP address could very well be located in a satellite office somewhere because once IP addresses are allocated to a company, no one bothers to check to see where the addresses are being used. That is entirely up to the discretion of the company the addresses have been registered to.

It's also possible that the information provided within the Internet registry is the service provider that was originally provided with the IP address block. Service providers may hand out blocks for the use of their customers without actually assigning ownership of the block to the company that is using it. This may also provide you some location information, however. In the case of smaller service providers, which may be more likely to engage in this practice because of the limited number of address blocks they have been able to get, their customers are likely to be local to them. If you were a small business in Vermont, for instance, it would be highly unlikely for you to make use of a small service provider in Colorado. This means that if you find that the address you have is registered to a service provider in Colorado in, say, Durango, you have a good idea that the customer who is using the IP address is likely also located in or near Durango.

While none of this may be all that useful if you are thinking about just getting to an end goal, a lot of information can be obtained from a lookup at an Internet registry. Fortunately, you can perform these lookups in a number of ways. One way is to just use the whois command. You can see the use of a command-line version of whois in Listing 6-3.

Using whois, we can see that the owner of the IP address 4.2.2.1—a common DNS caching server that can be used by anyone on the Internet—is Level 3 Communications. Level 3 is located in Broomfield, CO, though from personal knowledge I can tell you that 4.2.2.1 specifically is not located there. However, in smaller organizations that are not Internet service providers, this information may be useful.
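If you want to pull the location-related fields out of a whois reply in a script, something like the following Python sketch may help. It assumes the reply uses the "Key: value" layout that ARIN returns, and it assumes whois.arin.net accepts a bare IP address as a query on TCP port 43. The sample text is hand-written for illustration and is not real registry output; only the organization, city, and state come from the lookup described above.

```python
import socket
from typing import Dict

LOCATION_FIELDS = ("OrgName", "Address", "City", "StateProv", "PostalCode", "Country")


def whois_query(query: str, server: str = "whois.arin.net", timeout: float = 10.0) -> str:
    """Send a query to a whois server on TCP port 43 and return the raw reply."""
    with socket.create_connection((server, 43), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def location_fields(reply: str) -> Dict[str, str]:
    """Keep the first occurrence of each location-related 'Key: value' line."""
    found: Dict[str, str] = {}
    for line in reply.splitlines():
        if line.startswith(("#", "%")) or ":" not in line:
            continue  # skip comments and free text
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key in LOCATION_FIELDS and key not in found and value:
            found[key] = value
    return found


# Hand-written sample in the assumed layout, for illustration only.
SAMPLE = """\
# sample reply
Comment:        illustration only
OrgName:        Level 3 Communications
City:           Broomfield
StateProv:      CO
Country:        US
"""

print(location_fields(SAMPLE))
# Against a live server: location_fields(whois_query("4.2.2.1"))
```

Keep in mind that whatever this returns is the registrant's address, which is subject to all of the caveats discussed in this section.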

If you are not comfortable with command-line utilities or you don't have a Unix-based system to run whois from, you can accomplish the same thing in other ways. For example, whois utilities are available for Windows. You can find them in places where you can get access to free utility software. Additionally, a number of websites will provide you the ability to do whois lookups. These sites work identically to the whois utility that was shown earlier. An example of one of these sites is shown in Figure 6.3. This particular site is at www.whois.com, though a number of other websites will also work. Some of these sites that offer whois lookups only work with domain names, so you have to be sure that you have a site that will do IP address lookups in addition to domain names.

image

Figure 6.3: whois lookup.

This whois lookup is for a different IP address than the lookup done earlier. It's done from the same block of IP addresses, though. You may have noticed from the whois lookup that Level 3 owns the entire 4.x.x.x block of addresses, so anything else in that block of addresses will show Level 3 as one of the owners. Since IP addresses are handed out in a hierarchical fashion, you may see a chain of owners, depending on whether the block has been re-assigned or just temporarily assigned.

As noted before, you can also look up domain names using the same techniques. Domain names are less specific than IP addresses, though you can still obtain the same location information. As you see in Figure 6.3, physical addresses are provided. However, companies that own domain names may have multiple locations so this is just one piece of information. It may be necessary to locate additional pieces of information to be clearer about the location.

Related to using whois, you can also use DNS to obtain location information. At the moment of this writing, the public IP address of my cable modem is 73.219.13.135. Using DNS tools, I can obtain the hostname of that IP address. Though I could do this lookup in multiple ways, I am using the host utility provided in the Linux distribution I am using in Listing 6-4 to obtain the hostname.

Using the hostname, I can determine that the IP address is located in Vermont, which is correct. I can tell that by the portion of the hostname that says vt.comcast.net. This is a subdomain that Comcast uses to house IP addresses and other DNS resources for customers in Vermont. Not all organizations use their DNS hostnames to indicate where those hostnames are located, but generally Internet service providers do because it makes troubleshooting quite a bit easier. These hostnames can also be identified in more of a bulk fashion, so even if the hostname with the location isn't your target, it's possible to get a collection of hostnames that can point at a particular geographic region.

Traceroute

Traceroute is a diagnostic tool used by technical professionals looking to identify a problem with network routing. Traceroute works by making use of the time to live (TTL) IP header field. Normally, IP packets are sent with a default TTL value set by the operating system. Every time a packet passes through a routing device, the time to live field is decremented. Once the TTL reaches 0, the device that decremented the field to 0 returns an ICMP error message to the source of the original message indicating that time to live has been exceeded in transit. Traceroute makes use of this capability. Traceroute will send a message out to a destination with increasing TTL values. The first packet being sent has a TTL of 1. When the very first router (the default gateway on your network) receives the message, it decrements the TTL to 0 and responds with the ICMP error message. Once the sending system receives the message, it has the IP address of the first router.
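The mechanism is easier to see in a small simulation than in a real implementation, which would need raw sockets and elevated privileges. The Python sketch below models a path as a list of routers ending in the destination and sends simulated packets with increasing TTL values. The addresses are placeholders from private and documentation ranges, and the function names are my own.

```python
from typing import List, Tuple


def forward(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one packet sent with the given TTL along path.

    path lists the routers in order with the destination last.
    Returns (responder, kind), where kind is "time-exceeded" or "reached".
    """
    for router in path[:-1]:
        ttl -= 1  # every router decrements the TTL
        if ttl == 0:
            # This router dropped the packet and reports back to the sender.
            return router, "time-exceeded"
    return path[-1], "reached"


def trace(path: List[str], max_hops: int = 30) -> List[Tuple[int, str]]:
    """Send packets with TTL 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, kind = forward(path, ttl)
        hops.append((ttl, responder))
        if kind == "reached":
            break
    return hops


# Placeholder path: default gateway, two transit routers, then the destination.
PATH = ["172.16.0.1", "192.0.2.1", "198.51.100.1", "203.0.113.10"]

for ttl, responder in trace(PATH):
    print(ttl, responder)
```

Each TTL value reveals exactly one more device on the path, which is why the output of traceroute reads as an ordered list of hops.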

The sender only has the IP address, though, which means that the system running traceroute has to do a DNS lookup to get the hostname that is associated with the IP address. This is a reverse lookup and requires that whoever owns the IP address has the pointer (PTR) record configured in the DNS server. A reverse mapping provided by a PTR record is good to have, but it is not required. PTR records are a convenience, so they are less likely to be configured than forward records. Generally, however, service providers will keep their DNS records up to date. Because they are the ones who will commonly include location information in the hostnames, they are the ones we are going to be most concerned with.
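A reverse lookup is an ordinary DNS query for a specially constructed name. As a small sketch, Python's standard library can show both the name that gets queried and the result of the lookup. The function names are hypothetical, and the lookup itself needs network access and returns nothing when no PTR record exists.

```python
import ipaddress
import socket
from typing import Optional


def ptr_name(address: str) -> str:
    """Return the DNS name that a PTR lookup for this address would query."""
    return ipaddress.ip_address(address).reverse_pointer


def reverse_lookup(address: str) -> Optional[str]:
    """Ask the resolver for the hostname; None if there is no answer."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
    except OSError:  # covers missing PTR records and resolver failures
        return None
    return hostname


# The octets are reversed and placed under in-addr.arpa for IPv4.
print(ptr_name("73.219.13.135"))
```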

To get an idea where something is located using traceroute information, you simply run a traceroute and look at the output. Once you get the hang of it, reading the location out of the traceroute is fairly easy. The example shown in Listing 6-5 was done from a Mac OS X system and the utility is named traceroute. On a Linux system, it will also be named traceroute. On a Windows system, because traceroute exceeded the 8-character limit of the 8.3 naming convention from the DOS days, the utility is named tracert. Even though the 8.3 naming restrictions no longer exist, the name of the utility has remained tracert, presumably for consistency's sake. Pathping is another Windows utility that can be used to identify a network path.

As noted earlier, the very first thing I get in the output is the IP address of the default gateway on my local network. I am in the habit of using addresses from the 172.16.0.0/12 range on my networks. This is a range of private IP addresses, just as the 192.168.0.0/16 address range is, which is more common on home networks. Because I don't have a DNS server that includes any of my local systems in it, there is no reverse lookup to be had for that address. The first place we get a real hostname is on line 3. You can see the hostname listed as ge-4-19-ur01.wolcott.ct.hartford.comcast.net. This is a port on a network device in Hartford, CT. The ge indicates that this is a gigabit Ethernet (ge) port. The numbers after that could indicate slot and port in a large chassis. The ur01 indicates a router. Service providers will sometimes use short names to indicate the type of router within the network. If you see cr, it is probably a core router, meaning a device in the core or deep inside the network. An ar router would be an access router, where customers may commonly connect.

In general, you will see the type of interface followed by the slot and port numbers, if they exist, in the first part of the hostname. After that, you may well see the location information. In some cases, as in lines 3–5, the name will be pretty straightforward. You are seeing multiple entries on those lines because traceroute sends three messages. If there are multiple paths through the network to get to a particular location, each successive message may hit a different router in the network. That appears to be the case here. It may also indicate some routing distribution or load balancing, depending on where the message is located.

In some cases, the hostnames are even more specific, depending on the provider. The Comcast entries that include ibone indicate there are routers at 111 8th Avenue in Manhattan. This particular building is owned by Google and has a meet-me room where multiple carriers get together and hand off traffic to one another. The traceroute goes through a few hops in that building before departing to a number of IP addresses that don't have reverse lookups associated with them. Because of that, we don't really know where they are located. However, the traceroute terminates at the hostname lga15s44-in-f4.1e100.net.

The domain name 1e100.net is a domain name that Google uses to identify servers within its network. This particular hostname enables us to identify another way of looking at locations within service provider hostnames. For a long time, it was fairly common for service providers to identify locations within their network by the code for the airport in the city where the devices are located. If you see a three-letter indicator in a hostname, it may well be an airport code. LGA is LaGuardia Airport, located on Long Island. LGA provides services to Manhattan, so the servers that we have terminated at are located in New York. This doesn't necessarily provide us with a building address, though. Google owns only a small number of buildings in New York City and of those buildings, not all of them house data centers. There is a good chance that the building we have terminated at is also located in 111 8th Avenue.
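Reading these hostnames by eye works for a handful of hops, but the same conventions can be applied in a script. The Python sketch below is only an illustration: the lookup tables hold just the abbreviations mentioned in this section, the function name is hypothetical, and real providers use many naming schemes that it will not recognize.

```python
import re
from typing import Dict

# Tiny illustrative tables built from the examples in the text (assumptions).
INTERFACE_TYPES = {"ge": "gigabit Ethernet"}
ROUTER_ROLES = {"ur": "router", "cr": "core router", "ar": "access router"}
AIRPORT_CODES = {"lga": "New York (LaGuardia)"}


def hostname_hints(hostname: str) -> Dict[str, str]:
    """Guess interface type, router role, and location hints from a hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    first = labels[0]
    tokens = first.split("-")
    hints: Dict[str, str] = {}

    # Interface type usually leads the first label, for example ge-4-19-...
    if tokens[0] in INTERFACE_TYPES:
        hints["interface"] = INTERFACE_TYPES[tokens[0]]

    # Router role looks like two letters followed by digits, for example ur01.
    for token in tokens:
        match = re.fullmatch(r"([a-z]{2})\d+", token)
        if match and match.group(1) in ROUTER_ROLES:
            hints["role"] = ROUTER_ROLES[match.group(1)]

    # Three letters followed by a digit at the start may be an airport code.
    match = re.match(r"([a-z]{3})\d", first)
    if match and match.group(1) in AIRPORT_CODES:
        hints["airport"] = AIRPORT_CODES[match.group(1)]

    # Labels between the device name and the registered domain may be places.
    if len(labels) > 3:
        hints["location_labels"] = ".".join(labels[1:-2])
    return hints


print(hostname_hints("ge-4-19-ur01.wolcott.ct.hartford.comcast.net"))
print(hostname_hints("lga15s44-in-f4.1e100.net"))
```

Treat anything this produces as a hint to be confirmed by other means, since a three-letter prefix is not always an airport code.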

Traceroute can provide a lot of details that are not only useful for network engineers, but also can provide some location information for investigators, once you learn how to read the output. While IP addresses do map to hostnames, you can get locations from IP addresses in other ways. There is nothing about an IP address that inherently provides a location, but with a little help, we can get fairly specific about where the IP address is located, if you know how to read the hostname that is associated with the IP address.

Geolocation

When Voice over IP (VoIP) services became commonplace, there was a challenge. Federal regulations require telecommunications providers to be able to support enhanced 911 (E-911) services to phone subscribers. Anyone dialing 911 should be able to be located by the phone network. In a traditional phone network, this is easy because the phones are hard-wired to the central office and each subscriber line has an address associated with it. If a call comes from a particular phone number using a wired line, it's guaranteed that the call has come from a specific physical address because hard-wired lines can't be moved. When the caller dials 911, the central office knows which public safety answering point (PSAP) to route the call to.

VoIP, though, uses interface devices that convert traditional phones and the signals they use to IP. These devices can be taken anywhere. As long as they can get an IP address and can communicate with the servers within the VoIP provider network, there is nothing to prevent the service from being used. That, however, causes problems for the service providers because they are required to be able to hand off location information for their subscriber. As noted earlier, there is nothing inherent about an IP address that can provide physical addresses, and while it is possible to read hostnames and network paths to get some location out of them, the hostname and network path don't have nearly the specificity required by E-911.

At a minimum, the service provider needs to be able to know which PSAP to route the call to. There are a number of ways to do this, including just hard-coding the subscriber into a database associated with a particular PSAP. VoIP services are not the only ones where location information from IP addresses is important or at least very useful. As a result, there are databases that will keep track of that information, as well as web interfaces that can perform lookups from IP addresses. In fact, some of these websites will tell you where you are based on your IP address.

As it turns out, a number of geolocation providers and some of the websites that you can do lookups from will provide information from the different databases. Just to demonstrate some of the challenges associated with looking up geographic location from an IP address, you will sometimes get different locations. To highlight that point, Figure 6.4 shows location information related to an IP address belonging to Google. This is information from three different databases, though the site in question, www.iplocation.net, provides results from many other databases. While two of them appear to show the same location, when you look at the latitude and longitude, they are quite different. The two showing the same city will map to very different locations.
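When two databases name the same city but report different coordinates, it helps to put a number on the disagreement. A simple way is the great-circle distance between the two points, sketched below in Python using the haversine formula. The Earth radius is a mean value, so the result is an approximation, and the coordinates in the example are placeholders rather than values from Figure 6.4.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, so distances are approximate


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


# Placeholder answers from two hypothetical databases for one IP address.
database_a = (37.40, -122.08)
database_b = (37.42, -122.06)

print(round(haversine_km(*database_a, *database_b), 2), "km apart")
```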

image

Figure 6.4: Geolocation lookup.

The third location is not only in a different city and state but most of the way across the United States. A fourth database shows New York and the fifth shows Mountain View again. As a result, you have a start on a location from the IP address but it is by no means definitive. In some cases, all the lookup service is doing is running a whois, getting the owner of the IP address, and providing the city for that owner. As previously discussed, that's not always that useful.

One of the databases, at db-ip.com, is not only more accurate but also supports lookups of IPv6 addresses. Some of the backbone providers are using IPv6 to communicate back and forth. In Figure 6.5, you can see a lookup of my external IPv6 address. While the address belongs to Comcast, db-ip.com isn't just providing the location of the IP address according to whois because that would be based on Comcast's address, which is not in Vermont. However, while we are very close to a real location, the database maps this address to a town that is nearby rather than the town I am actually located in.

image

Figure 6.5: db-ip.com lookup.

The company MaxMind maintains several databases related to location information and mapping network information. These databases can be integrated with Wireshark to save the effort of performing multiple lookups using a web interface. You can download lite versions of the databases from MaxMind and then tell Wireshark where the databases are using the preferences settings. There is a configuration setting for the locations of GeoIP databases. MaxMind provides databases for both IPv4 and IPv6 as well as information about the autonomous system (AS) number used by service providers for routing purposes, the city where the address is located, and the address in longitude and latitude form.

Once you have a packet capture, you can look at the Endpoints dialog box in the Statistics menu. This collection of information will give you IP addresses that were found in your packet capture, and if there are entries in the MaxMind databases, they will display the information. You can see an example of this in Figure 6.6.

image

Figure 6.6: GeoIP lookup using Wireshark.

Wireshark provides fields for the country the IP address appears to be located in as well as the AS number associated with the service provider, which also yields the name of the service provider. Finally, you can also see the city, longitude, and latitude columns that are associated with the IP address. Not all IP addresses will be able to be looked up in the database. This is especially true in the case of private addresses, because a private address can be associated with multiple networks around the world. As a result, any packets captured from a system on my local network will have no entries in the location columns.
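If you are scripting lookups yourself rather than relying on Wireshark, it saves time to set aside the addresses that can never be geolocated before querying any database. A minimal Python sketch using the standard ipaddress module follows. The function name is hypothetical, and the sample list mixes addresses mentioned in this chapter with placeholder private addresses.

```python
import ipaddress
from typing import Iterable, List, Tuple


def partition_addresses(addresses: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split addresses into those worth a geolocation lookup and those to skip."""
    public, skipped = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        # Private, loopback, link-local, and multicast addresses have no
        # single physical location a database could report.
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_multicast:
            skipped.append(text)
        else:
            public.append(text)
    return public, skipped


seen = ["172.16.0.1", "192.168.1.10", "4.2.2.1", "73.219.13.135"]
public, skipped = partition_addresses(seen)
print("look up:", public)
print("skip:", skipped)
```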

Location-Based Services

Laptops and other mobile systems that don't have the capability to use GPS still have a need for location-based services. As web applications get more functionality and have to provide the same or similar services as truly mobile devices like smartphones, semi-mobile devices like laptops, or even immobile systems like desktop computers, there is a need for the application provider to obtain location-based information. The World Wide Web Consortium (W3C) has developed an application programming interface, called the Geolocation API, and a set of specifications that will allow devices that don't have GPS capability to also provide a location.

This interface is commonly provided in web pages using JavaScript. The JavaScript makes calls to the geolocation property of the navigator object to request the location. The answer may simply be based on information about the IP address that is known using techniques referenced earlier. Other ways exist to obtain location information, however, and there is a good chance that this will continue to change over time. When your browser asks if it is okay to provide location information to the website you are visiting, it is probably using this W3C location interface.

WiFi Positioning

One way to get information about where people may be is to get someone to report on those people. This may be a self-check-in where the user provides information about himself in one form or another. However, it may also be that other people are collecting information and sharing it with a public database. This is partly the case when it comes to the WiFi Positioning System (WPS). WPS is an attempt to provide a way to locate systems using the wireless networks they are connected to. Databases are available to locate WiFi networks, and some of these databases are populated by users who collect the information and submit it to the database provider.

One of these database providers is WiGLE, which is a database for wireless hotspots around the world. Using WiGLE, you can view maps of locations and see the different WiFi networks that may be available within a particular geographic area. WiFi networks not only have a Service Set Identifier (SSID) associated with them, which is the network name, but they also have a Basic Service Set Identifier (BSSID). This looks like and often is a MAC address. The wireless access point, as a network device, has a MAC address associated with the network interface. This MAC address may become the BSSID for the wireless network to provide a layer-2 addressable identifier for the network.
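Because a BSSID is written the same way as a MAC address, matching an observed network against a table of known networks mostly comes down to normalizing the notation first. The Python sketch below shows one way to do that. The table, its coordinates, and the function names are all hypothetical, and this is not how WiGLE or any other provider exposes its data.

```python
import re
from typing import Dict, Optional, Tuple

Coordinates = Tuple[float, float]


def normalize_bssid(text: str) -> str:
    """Return a BSSID as lowercase colon-separated octets, or raise ValueError."""
    digits = re.sub(r"[:\-.]", "", text.strip()).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        raise ValueError(f"not a 48-bit hardware address: {text!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def locate(bssid: str, known: Dict[str, Coordinates]) -> Optional[Coordinates]:
    """Look up an observed BSSID in a table keyed by normalized BSSID."""
    return known.get(normalize_bssid(bssid))


# Hypothetical table of previously observed networks with made-up coordinates.
KNOWN_NETWORKS = {
    normalize_bssid("00-1A-2B-3C-4D-5E"): (44.0, -73.0),
}

print(normalize_bssid("001A.2B3C.4D5E"))
print(locate("00:1a:2b:3c:4d:5e", KNOWN_NETWORKS))
```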

WiGLE and other similar databases will not only store SSID information, but also store BSSID information. You can see an example of both BSSIDs and SSIDs in Figure 6.7.

image

Figure 6.7: Geolocation lookup.

The map shown is a part of the website at wigle.net, and is a location nearby where I am writing this. You can search locations and zoom in on the map. You can see SSIDs like Zombies ate My WiFi, as well as BSSIDs, which just appear to be MAC addresses, and they are quite likely to just be MAC addresses. Because urban or suburban areas are likely to have large numbers of WiFi networks in close proximity, all of these WiFi networks just get overlaid on top of one another on the map. It takes zooming in very closely to be able to differentiate one network from another. Of course, by that point, you may have lost some context.

This is one way that systems can obtain information about their position. Locating systems in physical space can be challenging, and using volunteers to provide information about WiFi networks helps with that effort.

Summary

Locating addresses on the Internet is challenging. You can use a number of tools to help narrow the scope of a search, but very little is highly accurate. You can start with something very broad like using time zone information in network transmissions, if a useful time zone has been transmitted. As we have seen, it is not uncommon for servers to simply use Greenwich Mean Time as a time zone because it saves on calculating relative time offsets between servers within the infrastructure. This may be more commonly the case with large providers that would have systems scattered across multiple time zones.

The Internet registries can be used to locate information about addresses and domain names using tools like whois, but even that isn't going to be very accurate. You may be able to get a location from the IP address of small businesses, but with larger businesses or service providers, the best you are going to be able to get is the address of the headquarters. This may be nowhere near where the IP address is actually being used. Fortunately, in the case of service providers, DNS information may be useful in providing a location. However, at best, this may provide a city. Rarely will a hostname provide a specific address. As we look at DNS hostnames, we can make use of the traceroute utility to identify the path that packets take through the network, and since service providers will often use location information in the hostname, we can see a geographic path using traceroute.

Databases can provide more specific information about IP addresses, but even those databases, providing AS numbers, longitude and latitude, countries, and service provider names, are often just using information about the owner of the IP address, which may have nothing at all to do with the user of the IP address. However, you can make use of these databases with programs like Wireshark to perform lookups so you can identify locations from within Wireshark. This is far easier than trying to do manual lookups one address at a time. Wireshark will also map all of your addresses, placing dots on a world map indicating where the addresses you have seen are located.

Browsers can provide location information to web applications to provide services similar to the global positioning system. However, even these are not always highly accurate, as they rely on volunteers to obtain the information and submit it to the database provider. A number of databases will track information about WiFi networks around the world, and this can be useful in providing location information, but it's not a guarantee.
