Chapter 5. Footprinting Tools and Techniques

WHEN THINKING ABOUT HACKING into systems, you might imagine that hackers simply run a few software tools to gain access to the target. Although a multitude of tools are available to do exactly that, effective hacking is a process that takes place in phases. Each phase should be undertaken with the goal of uncovering increasingly useful information about the target that can be used in the eventual break-in.

The first phase of hacking is the footprinting phase, which is specifically designed to gather information about a target passively. If done correctly and patiently, skilled attackers can gain valuable information about their intended target without alerting the victim to the impending attack. The information that can be gathered during this phase is surprising in its scope: network ranges, equipment and technologies in use, financial information, locations, physical assets, and employee names and titles. A typical company generates a wealth of information as a byproduct of its operations, and that information can serve any purpose an attacker has in mind.

In this chapter, the process that hackers use will be introduced along with the techniques used during each step. Understanding these techniques provides valuable insight not just into the mechanics of the process but also into how to thwart such attacks in the real world. Special emphasis will be placed on the first of the phases: footprinting.

The Information-Gathering Process

Although this chapter will place emphasis on the footprinting phase of the hacking and information-gathering process, seven steps are actually used. The steps of the information-gathering process include:

  1. Gathering information

  2. Determining the network range

  3. Identifying active machines

  4. Finding open ports and access points

  5. Detecting operating systems

  6. Fingerprinting services

  7. Mapping the network

Of the seven steps, footprinting covers the first two steps in the process. Note that steps 1 and 2 are both passive in nature; they do not require direct interaction with the victim. This is one of the key characteristics of footprinting: to gather information about a victim without directly interacting and potentially providing advance notice of the attack. The following list shows some of the activities an attacker can perform when footprinting an organization:

  • Examine the company's Web site

  • Identify key employees

  • Analyze open positions and job requests

  • Assess affiliate, parent, or sister companies

  • Find technologies and software used by the organization

  • Determine network address and range

  • Review network range to determine whether the organization is the owner or if the systems are hosted by someone else

  • Look for employee postings, blogs, and other leaked information

  • Review collected data

Under the right conditions, a skilled hacker can gather the information mentioned here and use the results to fine-tune what will be scanned or probed on the victim. Remember that the most effective tools that can be employed during this phase are common sense and detective work: knowing where a company is likely to have made information available and methodically seeking it out. In fact, footprinting may be the easiest part of the hacking process because most organizations generate massive amounts of information that is made available online. Before a skilled hacker fires up an active tool, such as a port scanner or password cracker, he or she will meticulously carry out the footprinting process to plan and coordinate a more effective attack.

The Information on a Company Web Site

When starting the footprinting phase, do not overlook some of the more obvious sources of information, including the company's Web site. As anyone who has used the Internet can attest, Web sites offer varying amounts of information about an organization because they are published to tell customers about the organization. Although Web sites contain much less sensitive data now than in the past, it is still not uncommon to come across sites that contain e-mail addresses, employee names, branch office locations, and technologies the organization uses. An example of an average Web site and some information you might find is shown in Figure 5-1.

One problem with Web sites that has only recently been addressed is the amount of sensitive information accessible to the public. Sometimes without even realizing it, a company will publish a piece of information that seems insignificant, but to an attacker that same information may be gold. Consider a practice that used to be quite common: the posting of company directories on the company Web site. Such information may not seem like a problem except that it gives an attacker valuable contact information for employees that may be used to impersonate these individuals. Of course, what is valuable is not just what is visible on a Web site; it can also be the source code or HTML used to build the site. A particularly astute attacker can browse through the source code and locate comments or other pieces of information that give insight into an organization.

Figure 5-1. Company management.

The following is an example of HTML code with comments:

<html>
  <head>
    <title>Company Web page</title>
  </head>
  <body>
    <!-- This Web page prompts for the password to login to the database server HAL9000 -->
  </body>
</html>

The comment included here may seem harmless, but it would tell an attacker the name of the server that is being accessed, assisting in targeting an attack.

Note

Site ripping tools such as Blackwidow Pro or Wget can be used to extract a complete copy of the Web site.
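A copy of a site pulled down this way can then be searched offline for comments and other leftovers. The following is a minimal sketch using wget; www.example.com is a placeholder rather than a real target:

wget --mirror --convert-links --page-requisites --wait=2 http://www.example.com
grep -r "<!--" www.example.com/

The first command politely mirrors the site (pausing two seconds between requests); the second searches the downloaded pages for HTML comments like the one shown earlier.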

Over the last decade, companies have gotten the message that posting some information on the company Web site is not advisable. In some cases, organizations have removed information that could reveal details about internal processes, personnel, and other assets.

On the surface, it would seem that once information is removed from a Web site the problem is eliminated, but this is far from true. In the case of a Web site, the state of a Web site at a particular point in time may still exist somewhere out in cyberspace. One of the tools that a security professional can use to gain information about a past version of a Web site is something known as the Wayback Machine. It is a Web application created by the Internet Archive that takes "snapshots" of a Web site at regular intervals and makes them available to anyone who looks. With the Wayback Machine, it is possible to recover information that was posted on a Web site sometime in the past. However, the information may be hopelessly out of date and of limited use. The Wayback Machine is available at http://www.archive.org/. An example of this Web site is shown in Figure 5-2.

When a Web site address is entered into the Wayback Machine, the site returns a list of dates representing when the Web site was archived, with an asterisk next to any date on which a change was made. Although the Internet Archive does not keep exhaustive results on every Web site, the sites it does archive can stretch all the way back to 1996. Currently the Internet Archive has a sizable amount of content cataloged, estimated to be in excess of 150 billion Web pages and related content. Note, however, that not every Web site on the Internet is archived, and those that are may not always go back far enough to reveal any useful information. Another potential drawback is that a site administrator, through use of a file called robots.txt, can block the Internet Archive from making snapshots of the site, denying anyone the use of old information. Figure 5-3 shows an example of how far back Web pages go for a specific company.
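The blocking just mentioned is typically accomplished with a robots.txt entry aimed at the Internet Archive's crawler; a minimal example of such a file is shown here (ia_archiver is the user-agent name the Internet Archive has historically honored):

User-agent: ia_archiver
Disallow: /

Other crawlers are unaffected unless they are listed in the file as well.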

Note

The Internet Archive is intended to be a historical archive of the Internet for the purposes of research and historical interest. Originally started in 1996, the Internet Archive has grown to include the archived versions of more than 150 billion Web pages; the archive has since been enhanced to include text, video, images, and other content.

Figure 5-2. Wayback Machine query.

Figure 5-3. Wayback Machine results.

Of course, the Internet Archive is only one source from which valuable information can be gleaned about an intended target; another valuable source is job postings. The job postings a company places on its corporate Web site or on job boards can give valuable clues about the infrastructure in use. Take note of the skills being requested when examining job postings, paying special attention to the skills section. For example, consider the following posting:

Expertise Required:

  • Advanced knowledge of Microsoft XP, 7, Server 2003; and products such as Microsoft Access, Microsoft SQL Server, Microsoft IISv6, Visual Basic

  • Proficient in Excel, Word, and PowerPoint 2007

  • Relevant Experience/Knowledge Cisco PIX; Checkpoint Firewall helpful but not necessary

  • Virtual Machine (VMWare), SAP S4P, and other data-gathering systems

Although this is only a snippet of a larger job posting, it still provides insight into what the company happens to be using. Think for a moment how an attacker can make use of the information the company provided. For example, the attacker could use it to fine-tune a later attack, researching the listed products in order to:

  • Search for vulnerabilities in the discovered products

  • Scan for application-specific configuration issues

  • Locate product-specific defects

If the attacker can successfully exploit any of these weaknesses, it is a simple matter to access the target's network and do further harm. On the other hand, if the attacker finds that these vulnerabilities are patched, the posting still provides information on other software in use and insight into the environment.

Note

When a company posts a job on a corporate site or a job posting site such as dice.com or monster.com, care should be taken to sanitize the posting. A company that is thinking ahead may choose to be less specific about required skills or to remove information that easily identifies the company in question. Sanitizing means stripping out details that are too sensitive or too revealing.

Another gem of information that can be useful in job postings is job location. Location information, when reviewed in conjunction with the skills requested, can yield insight into potential activities at a site. The appearance of unusual skills at a specific location can be an indicator of activities such as research and development. An attacker could use this information to target locations that are more likely to contain assets of value.

Discovering Financial Information

It is not surprising that an ever-increasing number of attacks are financially motivated. Criminals have discovered that technology can be a very effective way of committing old scams on a new medium. For example, consider Albert Gonzalez, the hacker convicted in the TJ Maxx hacking attack. According to http://www.informationweek.com, Mr. Gonzalez did not pick his targets at random. Targets were footprinted prior to being attacked; the footprinting process was specifically used to determine whether a targeted company made enough money to merit an attack. TJ Maxx is only one of an ever-increasing number of victims of cybercrime, a number that is expected to grow as criminals adopt new methods and technologies.

Figure 5-4. Cisco EDGAR 10-Q.

It is no surprise that the criminal element is attracted to the prospect of monetary gain, and cybercrime is no exception. When a criminal is choosing a company to attack based on whether that company makes enough money, publicly available financial records can prove vital. In the United States, getting information on the financial health of companies is easy because financial records on publicly traded companies are available for review. These records are easily accessible through the Securities and Exchange Commission (SEC) Web site at http://www.sec.gov. On the Web site, it is possible to review the Electronic Data Gathering, Analysis, and Retrieval system (EDGAR) database, which contains all sorts of financial information (some updated daily). All foreign and domestic companies that are publicly traded in the United States are legally required to file registration statements, periodic reports, and other forms electronically through EDGAR, all of which can be browsed by the public. Of particular interest in the EDGAR database are the items known as 10-Qs and 10-Ks. These are quarterly and yearly reports that contain names, addresses, financial data, and information about acquired or divested businesses. For example, a search of the EDGAR database for information about Cisco returns the list of records shown in Figure 5-4.

Closer examination of these records indicates where the company is based, detailed financial information, and the names of the principals, such as the president and members of the board. EDGAR is not the only source of this information, however; other sites provide similar types of information, including the following:

  • Hoovers: http://www.hoovers.com/

  • Dun and Bradstreet: http://www.dnb.com/us/

  • Yahoo Finance: http://finance.yahoo.com/

  • Bloomberg: http://www.bloomberg.com/

Google Hacking

The previous two methods demonstrated how simple, readily available resources can be used in unintended ways to gain information about a target. One more tool that can be used in ways never really intended is Google. Google contains a tremendous amount of information of all types just waiting to be searched and uncovered. In a process known as Google hacking, the goal is to locate useful information by using the techniques the search engine already provides in new ways. If you can construct the proper queries, Google search results can provide a hacker with useful data about a targeted company. Google is only one search engine; other search engines, such as Yahoo and Bing, can be used and abused in the same way.

Why is Google hacking effective? Quite simply, it is because Google indexes vast amounts of information in untold numbers of formats. Google obviously can index Web pages like any search engine, but it can also index images, videos, discussion group postings, and all sorts of file types such as .pdf, .ppt, and more. All the information that Google, or any search engine, gathers is held in large databases that are designed to be searchable; you only need to know how to look.

There are numerous resources about the process of Google hacking, but one of the best is Johnny Long's Google Hacking Database (GHDB) at http://www.hackersforcharity.org/ghdb/. This site offers insight into some of the ways an attacker can easily find exploitable targets and sensitive data by using Google's built-in functionality. An example of what is found at the Web site is seen in Figure 5-5.

The GHDB is a database of queries that identify sensitive data and exposed content. Some of the items an attacker can find using these queries include the following:

  • Advisories and server vulnerabilities

  • Error messages that contain too much information

  • Files containing passwords

  • Sensitive directories

  • Pages containing logon portals

  • Pages containing network or vulnerability data

What makes this possible is the way in which information is indexed by a search engine. Specific commands such as intitle instruct Google to search for a term within the title of a document. Some examples of intitle search strings are shown here:

  • intitle:"index of" .bash_history

  • intitle:"index of" etc/shadow

  • intitle:"index.of" finances.xls

  • intitle:"index of" htpasswd

  • intitle:"Index of" inurl:maillog

The keyword intitle: directs Google to return pages whose titles contain the term that immediately follows the keyword. For example, intitle:"index of" finances.xls returns pages with "index of" in the title that also reference a file named finances.xls.

Figure 5-5. Google Hacking Database.

Once these results are returned, the attacker can browse them, looking for entries that contain sensitive or restricted information that may reveal additional details about the organization.

Another popular search parameter is filetype. This operator restricts a search for a particular term to a specific file type. A few examples of this search string are as follows:

  • filetype:bak inurl:"htaccess|passwd|shadow|htusers"

  • filetype:conf slapd.conf

  • filetype:ctt "msn"

  • filetype:mdb inurl:"account|users|admin|administrators|passwd|password"

  • filetype:xls inurl:"email.xls"

The keyword filetype: instructs Google to return only files that have a specific extension. For example, filetype:doc or filetype:xls will return only Word or Excel documents, respectively.

To better understand the actual mechanics of this type of attack, a closer examination is necessary. The attacker typically needs some knowledge ahead of time, such as information gathered from a job posting about the applications in use. From that, the attacker may determine that a company is hosting a Web server of a particular type and version (for example, Microsoft IIS 6.0). The attacker can then use this knowledge to perform a search to uncover whether the company is actually running the Web server version in question. For example, the attacker may have chosen to attack Cisco and as such will need to locate the Web servers that are running IIS 6.0 to move the attack to the next phase. Using Google to find Web servers running Microsoft IIS 6.0 can be accomplished with a simple query such as intitle:index.of "Microsoft-IIS/6.0 Server at" on the Google search page. The results of this search are shown in Figure 5-6. Notice that more than 2,000 hits were returned.

Figure 5-6. Google Hacking Database search results.

A final search query that can prove invaluable is the Google keyword inurl. The inurl operator is used to search within a site's uniform resource locator (URL). This is very useful if the attacker has some knowledge of the URL strings used by different types of applications and systems. Some common inurl searches include the following:

  • inurl:admin filetype:db

  • inurl:admin inurl:backup intitle:index.of

  • inurl:"auth_user_file.txt"

  • inurl:"/axs/ax-admin.pl" -script

  • inurl:"/cricket/grapher.cgi"

The keyword inurl: commands Google to return pages that include specific words or characters in the URL. For example, the search request inurl:hyrule will produce pages that have the word "hyrule" somewhere in the URL.

These search queries and their variations are very powerful information-gathering mechanisms that can reveal information that is not normally obvious or accessible. A careful understanding of each search term and keyword can allow a potential attacker to uncover information about a target that may otherwise be out of view. The security professional who wants additional insight into how footprinting using Google hacking works should experiment with each operator and note what it reveals. Knowing how these operators are used by attackers can help keep the wrong information out of a Web search of your organization through careful planning and securing of data.
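A practical way to experiment safely is to scope the same operators to your own organization with Google's site: operator. The queries below are illustrative only, with example.com standing in for the domain being audited:

site:example.com intitle:"index of"
site:example.com filetype:xls
site:example.com inurl:admin

Anything these queries return is content a search engine has already indexed and that an attacker can find just as easily.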

Exploring Domain Information Leakage

A reality of developing security for any public organization is the fact that some information is difficult or impossible to hide. A public company that wants to attract customers must walk a fine line because some information by necessity will have to be made public while other information can be kept secret. An example of information that is difficult to conceal is domain information, the information associated with the registration of an Internet domain. Currently many tools are available for obtaining this type of basic information, including these:

  • Whois

  • Nslookup

  • Internet Assigned Numbers Authority (IANA) and Regional Internet Registries (RIRs) to find the range of Internet protocol (IP) addresses

  • Traceroute to determine the location of the network

Each of these tools can provide valuable information pulled from domain registration information.

Manual Registrar Query

The Internet Corporation for Assigned Names and Numbers (ICANN) is the primary body charged with management of IP address space allocation, protocol parameter assignment, and domain name system management. Global domain name management is delegated to the Internet Assigned Numbers Authority (IANA). IANA is responsible for the global coordination of the Domain Name System (DNS) Root, IP addressing, and other Internet protocol resources.

Figure 5-7. Root Zone Database.

Figure 5-8. EDU registration services.

When determining the network range manually, the best resource available is the IANA Root Zone Database page located at http://www.iana.org/domains/root/db/. The Root Zone Database contains the delegation details of top-level domains (TLDs), including generic domains such as .com and country-code TLDs such as .us. As the manager of the DNS root zone, IANA is responsible for coordinating these delegations in accordance with its stated policies and procedures. The Web site can be seen in Figure 5-7.

To fully grasp the process of uncovering a domain name and its associated information, it is best to examine the process step by step. In this example, a search for http://www.smu.edu will be performed. Of course, the target in this scenario has already been set, but in the real process the target would be the entity to be attacked. After the target has been identified (in this case, http://www.smu.edu), move through the list until EDU is located; then click that page. The EDU Web page is shown in Figure 5-8.

At this point, the registration services for the .edu domain are found to be handled by EDUCAUSE (http://www.educause.edu/edudomain). Once the registrant for .edu domains has been identified, it is possible to use the EDUCAUSE whois service (whois.educause.edu) and enter a query for http://www.smu.edu. The results of this query are shown in Figure 5-9.
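The same query can also be issued from a command line by pointing a whois client at the EDUCAUSE whois server; a brief sketch is shown here (output abbreviated):

whois -h whois.educause.edu smu.edu

The response lists the registrant, name servers, and point of contact for the domain, the same details returned by the Web form.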

Because organization and planning are essential skills for security professionals, make note of the information uncovered for later use. While each individual's method of organizing findings will differ, consider a strategy similar to the matrix in Table 5-1.

Table 5-1. Initial whois findings.

DOMAIN NAME           http://www.smu.edu
IP ADDRESS            129.119.64.10
NETWORK RANGE         (not yet determined)
DNS SERVER            129.119.64.10
POINT OF CONTACT      Bruce Meikle

Figure 5-9. SMU query.

Note that in a matter of a few clicks, it was possible to obtain very detailed information about the target, such as the IP address of the Web server, the DNS server IP address, location, and point of contact. In fact, of the information gathered at this point, the only thing noticeably absent is the network range.

Obtaining the network range requires the attacker to visit one or more of the Regional Internet Registries (RIRs), which are responsible for the management, distribution, and registration of public IP addresses within their respective assigned regions. Currently there are five primary RIRs (see Table 5-2).

Because RIRs are important to information gathering, it is worth walking through the process of using an RIR in the context of http://www.smu.edu. When searching for information on the target, location matters; earlier research indicated that the host was located in Dallas, Texas, which falls under ARIN. With this piece of information in hand, a query can be run on the ARIN site to obtain still more information about the domain. The http://www.arin.net site is shown in Figure 5-10.

Table 5-2. Regional Internet registries.

REGIONAL INTERNET REGISTRY    REGION OF CONTROL
ARIN                          North America and parts of the Caribbean
APNIC                         Asia and the Pacific region
RIPE                          Europe, the Middle East, and Central Asia
LACNIC                        Latin America and parts of the Caribbean
AfriNIC                       Africa

Figure 5-10. ARIN site.

Located in the top-right corner of the Web page is a search box labeled "search whois." In this search box, enter the IP address of http://www.smu.edu that was recorded earlier (it is also noted in Table 5-1 for reference). The results of this search are shown in Figure 5-11.

You can see that the network range is 129.119.0.0-129.119.255.255. With this information, the last piece of the network range puzzle is in place, and a clear picture of the addresses on the network is built. Network range data provides a critical piece of information for an attacker because it confirms that addresses between 129.119.0.0 and 129.119.255.255 all belong to http://www.smu.edu (these addresses will be examined in the next step of the process). With this last piece of information included, the table should now resemble what is shown in Table 5-3.

Figure 5-11. ARIN results.

Table 5-3. Final whois findings.

DOMAIN NAME           http://www.smu.edu
IP ADDRESS            129.119.64.10
NETWORK RANGE         129.119.0.0–129.119.255.255
DNS SERVER            129.119.64.10
POINT OF CONTACT      Bruce Meikle
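The same range lookup can be performed from the command line on most UNIX and Linux systems by pointing the whois client directly at ARIN; a brief sketch follows (output abbreviated and subject to change):

whois -h whois.arin.net 129.119.64.10

The response should include a NetRange field matching the 129.119.0.0-129.119.255.255 block recorded in Table 5-3, along with the owning organization and point-of-contact details.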

Automatic Registrar Query

The manual method of obtaining network range information is effective, but it has the drawback of taking a significant amount of time. You can speed up the process by using automated methods to gather the same information, and numerous Web sites are dedicated to providing it in a consolidated view.

Figure 5-12. Domaintools name query.

Some of the more common or popular destinations for searches of this type include the following:

  • http://www.samspade.org

  • http://www.betterwhois.com

  • http://www.allwhois.com

  • http://geektools.com

  • http://www.all-nettools.com

  • http://www.smartwhois.com

  • http://www.dnsstuff.com

  • http://whois.domaintools.com

A point to remember is that no matter what tool the professional prefers, the goal is to obtain registrar information. As an example, Figure 5-12 shows the results of http://whois.domaintools.com when http://www.smu.edu was queried for information.

Underlying all these tools is a utility known as whois, which is designed to query the databases that hold domain registration information. Whois interrogates the Internet domain name administration system and returns the domain ownership, address, location, phone number, and other details about a specified domain name. The accessibility of this tool depends on the operating system in use. For Linux users, the tool is just a command prompt away; Windows users have to locate a Windows-compatible version and download it or use a Web site that provides the service.

Whois

The Whois protocol was designed to query databases to look up and identify the registrant of a domain name. Whois information contains the name, address, and phone number of the administrative, billing, and technical contacts for the domain name. It is primarily used to verify whether a domain name is available or whether it has been registered. The following is an example of the whois information for cisco.com:

Registrant:

  • Cisco Technology, Inc.

  • 170 W. Tasman Drive

  • San Jose, CA 95134

  • US

  • Domain Name: CISCO.COM

Administrative Contact:

  • InfoSec

  • 170 W. Tasman Drive

  • San Jose, CA 95134

  • US

  • 408-527-3842 fax: 408-526-4575

Technical Contact:

  • Network Services

  • 170 W. Tasman Drive

  • San Jose, CA 95134

  • US

  • 408-527-9223 fax: 408-526-7373

Record expires on 15-May-2011.

Record created on 14-May-1987.

Domain servers in listed order:

  • NS1.CISCO.COM    128.107.241.185

  • NS2.CISCO.COM    64.102.255.44

Note

Whois has also been used by law enforcement to gain information useful in prosecuting criminal activity such as trademark infringement.

By looking at this example, it is possible to gain some information about the domain name and the department responsible for managing it, which in this case is the InfoSec team. Additionally, note that the record includes phone numbers and DNS information for the domain, not to mention a physical address that can be looked up using a tool such as Google Earth.
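On a Linux or UNIX system, output like the listing above can usually be retrieved directly from the command line; a minimal sketch:

whois cisco.com

The registrant, contact, and name server details come back as plain text, and redirecting them to a file (whois cisco.com > cisco-whois.txt) makes the kind of note-taking described earlier easier.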

Nslookup

Nslookup is a program used to query Internet domain name servers. Both UNIX and Windows come with an Nslookup client. If Nslookup is given a fully qualified domain name (FQDN), it will look up and show the corresponding IP address; given an IP address, it will attempt to return the corresponding name. Nslookup can be used to do the following:

  • Find additional IP addresses if authoritative DNS is known from Whois

  • List the MX (mail) servers for a specific domain

Extracting Information with NSLOOKUP:

  • nslookup

  • > set type=mx

  • > cisco.com

    • Server: x.x.x.x

    • Address: x.x.x.x#53

  • Non-authoritative answer:

    • cisco.com mail exchanger = 10 smtp3.cisco.com.

    • cisco.com mail exchanger = 10 smtp4.cisco.com.

    • cisco.com mail exchanger = 10 smtp1.cisco.com.

    • cisco.com mail exchanger = 10 smtp2.cisco.com.

  • Authoritative answers can be found from:

    • cisco.com nameserver = ns1.cisco.com.

    • cisco.com nameserver = ns2.cisco.com.

    • cisco.com nameserver = ns3.cisco.com.

    • cisco.com nameserver = ns4.cisco.com.

    • ns1.cisco.com internet address = 216.239.32.10

    • ns2.cisco.com internet address = 216.239.34.10

    • ns3.cisco.com internet address = 216.239.36.10

    • ns4.cisco.com internet address = 216.239.38.10

Looking at these results you can see several pieces of information that would be useful, including the addresses of nameservers and mail exchangers. The nameservers represent the systems used to host DNS while the mail exchangers represent the addresses of servers used to process mail for the domain. The addresses should be recorded for later scanning and vulnerability checking.
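On UNIX and Linux systems, the dig utility can pull the same records with a more compact syntax; a brief sketch of equivalent queries:

dig cisco.com MX +short
dig cisco.com NS +short

The +short option trims the output to just the record data, which makes it easy to copy the mail exchanger and name server addresses into notes for later scanning.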

Internet Assigned Numbers Authority (IANA)

According to http://www.iana.org, "The Internet Assigned Numbers Authority (IANA) is responsible for the global coordination of the DNS root, IP addressing, and other Internet protocol resources." Based on this information, IANA is a good starting point to learn more about domain ownership and to determine registration information. A good place to start is at the Root Zone Database page, which lists all top-level domains, including .com, .edu, .org, and so on. It also shows two-character country codes. Refer to the example shown in Figure 5-7.

For example, for a quick look at information on an .edu domain such as Villanova University, you could start at http://www.iana.org/domains/root/db/edu.html. That page identifies the registrar for the .edu top-level domain as EDUCAUSE (http://www.educause.edu/edudomain) and its whois server as whois.educause.edu. The results of this search can be seen in Figure 5-13.

Figure 5-13. EDU whois search result.

The same type of search can be performed against a .com domain such as http://www.hackthestack.com. The results of this search are shown here:

  • Domain Name: HACKTHESTACK.COM

  • Reseller: DomainsRus

  • Created on: 27 Jun 2006 11:15:37 EST

  • Expires on: 27 Jun 2018 11:15:47 EST

  • Record last updated on: 31 May 2009 07:18:10 EST

  • Status: ACTIVE

  • Owner, Administrative Contact, Technical Contact, Billing Contact:

  • Superior Solutions Inc

  • Network Administrator (ID00055881)

  • PO Box 1722

  • Freeport, TX 77542

  • United States

  • Phone: +979.8765309

  • Email:

  • Domain servers in listed order:

  • NS1.PLANETDOMAIN.COM

  • NS2.PLANETDOMAIN.COM

Notice that these results also include a physical address along with all the other domain information. It would be possible to take the physical address provided and enter it into any of the commonly available mapping tools and gain information on the proximity of this address to the actual company. Now that the domain administrator is known, the next logical step in the process could be to determine a valid network range.

Determining a Network Range

One of the missions of the IANA is to delegate Internet resources to RIRs. The RIRs further delegate resources as needed to customers, who include Internet service providers (ISPs) and end-user organizations. The RIRs are organizations responsible for control of IPv4 and IPv6 addresses within specific regions of the world. The five RIRs are as follows:

  • American Registry for Internet Numbers (ARIN)—North America and parts of the Caribbean

  • RIPE Network Coordination Centre (RIPE NCC)—Europe, the Middle East, and Central Asia

  • Asia-Pacific Network Information Centre (APNIC)—Asia and the Pacific region

  • Latin American and Caribbean Internet Addresses Registry (LACNIC)—Latin America and parts of the Caribbean region

  • African Network Information Centre (AfriNIC)—Africa

Per standards, each RIR must maintain point-of-contact (POC) information and records of IP address assignments. As an example, if the IP address 202.131.95.30, corresponding to http://www.hackthestack.com, is entered, the following response is returned from ARIN:

  • OrgName: Asia Pacific Network Information Centre

  • OrgID: APNIC

  • Address: PO Box 2131

  • City: Milton

  • StateProv: QLD

  • PostalCode: 4064

  • Country: AU

  • ReferralServer: whois://whois.apnic.net

  • NetRange: 202.0.0.0-203.255.255.255

  • CIDR: 202.0.0.0/7

  • NetName: APNIC-CIDR-BLK

  • NetHandle: NET-202-0-0-0-1

Take note of the range of 202.0.0.0 to 203.255.255.255. This is the block of addresses delegated to APNIC, the regional registry responsible for the network hosting the http://www.hackthestack.com Web site; the ReferralServer field indicates that APNIC's whois server holds the more specific assignment.
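Because the ARIN record includes a ReferralServer entry, the more specific assignment can be requested from APNIC directly; a brief sketch of the follow-up query (output will vary):

whois -h whois.apnic.net 202.131.95.30

The APNIC response typically narrows the broad 202.0.0.0/7 delegation down to the particular network block and the organization to which it was assigned.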

Many other Web sites can be used to mine this same type of data. Some of them include the following:

  • http://www.all-nettools.com

  • http://www.smartwhois.com

  • http://www.allwhois.com

  • http://www.dnsstuff.com

  • http://www.samspade.org

The next section shows how a hacker can determine the true location of the domain and IP addresses previously discovered.

Traceroute

Traceroute is a software program used to determine the path a data packet traverses to reach a specific IP address. Traceroute, which is one of the easiest ways to identify the path to a targeted Web site, is available on both UNIX and Windows operating systems. In Windows operating systems, the command is known as tracert; the UNIX spelling, traceroute, will not work from a Windows command prompt. Regardless of the name, the program displays the list of routers on a path to a network destination by using Time to Live (TTL) time-outs and Internet Control Message Protocol (ICMP) error messages.

C:\>tracert www.hackthestack.com
Tracing route to www.hackthestack.com [202.131.95.30]
1 1 ms 1 ms 1 ms 192.168.123.254
2 12 ms 15 ms 11 ms adsl-69-151-223-254.dsl.hstntx.swbell.net
  [69.151.223.254]
3 12 ms 12 ms 12 ms 151.164.244.193
4 11 ms 11 ms 11 ms bb1-g14-0.hstntx.sbcglobal.net [151.164.92.204]
5 48 ms 51 ms 48 ms 151.164.98.61
6 48 ms 48 ms 48 ms gi1--1.wil04.net.reach.com [206.223.123.11]
7 49 ms 50 ms 48 ms i-0-0-0.wil-core02.bi.reach.com [202.84.251.233]
8 196 ms 195 ms 196 ms i-15-0.sydp-core02.bx.reach.com [202.84.140.37]
9 204 ms 202 ms 203 ms unknown.net.reach.com [134.159.131.110]
10 197 ms 197 ms 200 ms ssg550-1-r1-1.network.netregistry.net
   [202.124.240.66]
11 200 ms 227 ms 197 ms forward.planetdomain.com [202.131.95.30]

Analyzing these results, it is possible to get a better look at what traceroute is providing. Traceroute functions by sending out a packet to a destination with the TTL set to 1. When the packet encounters the first router in the path, that router decrements the TTL by 1, in this case setting the value to 0, which results in the packet being discarded and an ICMP "Time Exceeded" message being sent back to the original sender. This response is recorded, and a new packet is sent out with a TTL of 2. This packet makes it through the first router and then stops at the next router in the path. This second router sends an error message back to the originating host, much like the first router did. Traceroute continues this process until a packet finally reaches the target host, or until a host is determined to be unreachable. Along the way, traceroute records the time it took for each packet to travel round trip to each router. It is through this process that a map can be drawn of the path to the final destination.

In the above results, you can see the IP address and name of each hop, along with the time it took to reach each one and return a response, giving a clear picture of the path to the remote host and the time needed to traverse it.

The next-to-last hop before the Web site will often be the organization's edge device, such as a router or firewall. However, you cannot always rely on this information because security-minded organizations tend to limit the ability to perform traceroutes into their networks.
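When a target filters probes or slow reverse DNS lookups drag out the trace, a few common options help; a brief sketch, with www.example.com as a placeholder:

tracert -d -w 1000 www.example.com
traceroute -n -q 1 www.example.com

The first (Windows) form skips reverse DNS lookups and waits one second for each reply; the second (UNIX/Linux) form prints numeric addresses and sends a single probe per hop. Both changes speed up tracing against hosts that limit their responses.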

Tracking an Organization's Employees

You can use the Web to find a wealth of information about a particular organization that can be used to plan a later attack. The techniques so far have gathered information on the financial health of a company, its infrastructure, and other similar details that can be used to build a picture of the target. Of all the information gathered so far, there is one area that has yet to be explored: the human element. Gathering information on human beings has not, until recently, been easy, but with the ever-increasing amount of information people put online themselves, the task has become easier. The growing use of social networking sites such as Facebook, MySpace, and Twitter has served to provide information that can be searched and tracked back to an individual. According to a Harris Interactive survey for CareerBuilder.com, 45 percent of employers questioned are using social networks to screen job candidates (and so are attackers). Information that can be uncovered online includes the following:

  • Posted photographs or information

  • Posted content about drinking or drug usage

  • Posted derogatory information about previous employers, coworkers, or clients

  • Discriminatory comments or fabricated qualifications

These examples give an idea of what the average user of social networking sites puts on the Internet. An attacker wanting to gain a sense of a company can search social networks, find individuals who work for the target, and follow their idle gossip about work. A single employee talking too liberally about goings-on at work can provide another layer of valuable insight that can be used to plan an attack.

Although disgruntled employees definitely are a security threat, there are other, less ominous actions that a person can take that will affect security. A single employee can be a source of information leakage that results in damaging disclosures or other security threats. Consider that it is not uncommon to find an employee posting information on blogs, Facebook, Twitter, or other locations that can be publicly accessed. Other employees have been known to get upset and set up what is known as a "sucks" domain, on which varying degrees of derogatory information are posted. Some of the sites that hackers have been known to review to obtain more information about a target include the following:

  • Blogs

  • Personal pages on a social networking site: Facebook, MySpace, LinkedIn, Plaxo, Twitter, Sucks domains

  • People-tracking sites

Each of these sites can be examined for names, e-mail addresses, addresses, phone numbers, photographs, and so on. As an example, consider the Peoples Dirt site (http://www.peoplesdirt.com), which is shown in Figure 5-14.

This site is designed to allow individuals to make anonymous posts about other individuals or organizations. Any disgruntled person can post libelous or hate-filled messages.

Weblogs, or blogs, are a good source of information about a targeted company if one can be located. Anyone can go to one of the many free blogging sites and set up a blog on which to post unfiltered comments and observations. As such, attackers have found them a valuable source of information. One of the bigger problems for the attacker, however, is finding a blog that contains useful information. A tremendous number of blogs exist, and of those only a small number are ever updated; the rest are simply abandoned by their owners. Wading into the sea of blogs on the Internet is a challenge, but a site such as http://www.blogsearchengine.com allows many blogs to be searched quickly. Additional sites such as http://www.wink.com and http://www.spock.com allow users to search personal pages such as Facebook and MySpace for specific content.

Figure 5-14. Peoples Dirt Web site.

Figure 5-15. Zabasearch.

Sucks domains are domain names that have the word "sucks" in them (for example, http://www.walmartsucks.org and http://www.paypalsucks.com). These are sites on which individuals have posted unflattering content about the targeted company because of a perceived slight or wrong. An interesting note about sucks sites is that although they may seem wrong or downright illegal, the comments posted on them have frequently been protected under free speech laws. Such sites are often taken down, however, partly because the domain name is not actually being used or is simply "parked" (although if the site is active and noncommercial, the courts have sometimes ruled such sites legal).

Note

Even job search sites such as Monster.com and Careerbuilder.com are prime targets for information. If an organization uses online job sites, pay close attention to what type of information is being given away about the company's technology.

Finally, another way of gaining information about an individual is to access sites that gather or aggregate information for easy retrieval. One such site is http://www.zabasearch.com; an example search is shown in Figure 5-15. A similar site is http://www.spokeo.com, which accumulates data from many sources, such as Facebook, public records, and photos, that can be searched to build a picture of an individual.

Figure 5-16. Windows Remote Desktop Web connection.

Exploiting Insecure Applications

Many applications were not built with security in mind. Insecure applications such as Telnet, File Transfer Protocol (FTP), the "r" commands, Post Office Protocol (POP), Hypertext Transfer Protocol (HTTP), and Simple Network Management Protocol (SNMP) operate without encryption. What adds to the problem is that some organizations inadvertently advertise these services on the Web. As an example, a simple search engine query for the terminal service Web access page, TSWEB (a front end for Remote Desktop), returns dozens of hits that appear similar to Figure 5-16. This application is designed to allow users to connect to a work or home computer and access files just as if physically sitting in front of the computer. The problem with this information being locatable online is that an attacker can use it to learn further details about the organization, or in some cases break in more quickly.
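The search described above typically keys on the default page title and path of the Remote Desktop Web Connection component; one commonly cited query of this form is:

intitle:"Remote Desktop Web Connection" inurl:tsweb

Any page returned is a logon portal an attacker can note for a later password-guessing or exploitation attempt, which is exactly why such portals should not be discoverable through a search engine.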

Using Basic Countermeasures

Footprinting can be a very powerful tool in the hands of an attacker who has the knowledge and patience to ferret out the information that is available about any entity online. But although footprinting is a powerful tool, there are some countermeasures that can lessen the impact to varying degrees.

Note

Organizations that are more ambitious should consider attempting to footprint themselves to see firsthand what types of information are currently in the public space and whether such information is potentially damaging.

The following shows some of the defenses that can be used to thwart footprinting:

  • Web site—Any organization should take a long, hard look at the information available on the company Web site and determine whether it might be useful to an attacker. Any potentially sensitive or restricted information should be removed as soon as possible, along with any unnecessary information. Special consideration should be given to information such as e-mail addresses, phone numbers, and employee names; access to such information should be limited to only those who require it. Additionally, references to the applications, programs, and protocols used by the company should be kept nondescript to avoid revealing the nature of the services or the environment.

  • Google hacking—This attack can be thwarted to a high degree by sanitizing publicly available information wherever possible. Sensitive information should not be posted in any location, linked or unlinked, that a search engine can reach, as the public areas of a Web server tend to be indexed.

  • Job listings—When possible, use third-party companies for sensitive jobs so the company is unknown to all but approved applicants. If third-party job sites are used, the job listing should be as generic as possible, and care should be taken not to list specific versions of applications or programs. Consider carefully crafting job postings to reveal less about the IT infrastructure.

  • Domain information—Always ensure that domain registration data is kept as generic as possible, and that specifics such as names, phone numbers, and the like are avoided. If possible, employ any one of the commonly available proxy services to block the access of sensitive domain data. An example of one such service is shown in Figure 5-17.

  • Employee posting—Be especially vigilant about information leaks generated by well-intentioned employees who may post information in technical forums or discussion groups that may be too detailed. More important, be on the lookout for employees who may be disgruntled and who may release sensitive data or information that can be viewed or accessed publicly. It is not uncommon for information leakage to occur around events such as layoffs or mergers.

    Figure 5-17. Domains by proxy.

  • Insecure applications—Make it a point to regularly scan search engines to see whether links to private services are available (Terminal Server, Outlook Web App [OWA], virtual private networks [VPNs], and so on). Telnet and FTP have similar security problems: both transmit passwords in cleartext, and FTP commonly allows anonymous logon. Consider replacing such applications with a more secure alternative, such as SSH, wherever possible or feasible.

    Note

    A good proactive step is for a company to research the options for blocking a search engine's bots from indexing a site. One of the best-known mechanisms for telling search engines how a site may be indexed is the robots.txt file. The robots.txt file can be configured to block the areas a search engine looks at, but it can also be read by a hacker, who can open the file in any commonly available text editor and see exactly which directories the site owner would prefer to keep out of view.

  • Securing DNS—Sanitize DNS registration and contact information to be as generic as possible (for example, a role title such as "Web Services Manager" and the main company phone number, 555-1212). Have two DNS servers—one internal and one external in the demilitarized zone (DMZ). The external DNS should contain only resource records of the DMZ hosts, not the internal hosts. For additional safety, do not allow zone transfers to any IP address.
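    One quick way to verify the zone-transfer restriction from the outside is to request a transfer and confirm that it is refused; a brief sketch, with ns1.example.com and example.com standing in for the external DNS server and domain:

    dig @ns1.example.com example.com axfr

    A properly restricted server answers with a transfer failure rather than dumping every resource record in the zone; if the full zone comes back, an attacker receives a complete map of every host the server knows about.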

CHAPTER SUMMARY

This chapter covered the process of footprinting, or passively obtaining information about a target. In its most basic form, footprinting is simply information gathering performed carefully enough to avoid detection entirely, or for as long as possible, while maintaining a stealthy profile. Ultimately, the goal of footprinting is to gather as much information as possible about the intended victim without giving away the intentions, or even the presence, of the attacker.

If done carefully and methodically, footprinting can reveal large amounts of information about a target. The process, when complete, will yield a better picture of the intended victim. In most situations, a large amount of time will be spent performing this process, with relatively less time spent in the actual hacking phase. Patience is a valuable skill to develop in the information-gathering phase, alongside the techniques for actually obtaining the information. Ideally, information gathered from a well-planned and executed footprinting process will make the hacking process more effective.

Remember, footprinting includes gathering information from a diverse group of sources and locations. Common sources of information used in the footprinting phase include company Web sites, financial reports, Google searches, social networks, and other similar technologies. Attackers can and will review any source of information that helps fill out the picture of the victim.

KEY CONCEPTS AND TERMS

  • Footprinting

  • Google hacking

  • Insecure applications

  • Internet Archive

  • Internet Assigned Numbers Authority (IANA)

  • Nslookup

  • Regional Internet Registries (RIRs)

  • Social networking site

  • Traceroute

  • Whois

CHAPTER 5 ASSESSMENT

  1. What is the best description of footprinting?

    1. Passive information gathering

    2. Active information gathering

    3. Actively mapping an organization's vulnerabilities

    4. Using vulnerability scanners to map an organization

  2. Which of the following is the best example of passive information gathering?

    1. Reviewing job listings posted by the targeted company

    2. Port scanning the targeted company

    3. Calling the company and asking questions about its services

    4. Driving around the targeted company connecting to open wireless connections

  3. Which of the following is not typically a Web resource used to footprint a company?

    1. Company Web site

    2. Job search sites

    3. Internet Archive

    4. Phonebooks

  4. If you were looking for information about a company's financial history you would want to check the _______ database.

  5. Which of the following is the best description of the intitle tag?

    1. Instructs Google to look in the URL of a specific site

    2. Instructs Google to ignore words in the title of a specific document

    3. Instructs Google to search for a term within the title of a document

    4. Instructs Google to search a specific URL

  6. If you need to find a domain that is located in Canada, the best RIR to check first would be _______.

  7. You have been asked to look up a domain that is located in Europe. Which RIR should you examine first?

    1. LACNIC

    2. APNIC

    3. RIPE

    4. ARIN

  8. SNMP uses encryption and is therefore a secure program.

    1. True

    2. False

  9. You need to determine the path to a specific IP address. Which of the following tools is the best to use?

    1. IANA

    2. Nslookup

    3. Whois

    4. Traceroute

  10. During the footprinting process social networking sites can be used to find out about employees and look for technology policies and practices.

    1. True

    2. False
