Again, let's go over some of the core technologies that we'll be using before going into the approaches we'll take or the tools we'll use. In the footprinting phase of reconnaissance, we will be focusing on the technologies of DNS, WHOIS, RWHOIS, and SMTP. Each of these technologies can be leveraged to gather more information on the overall footprint of our target and can help us in building our IP/host name mappings.
2.4.1.1. DNS
The Domain Name System (DNS) can be considered the life and blood of the Internet today. It is much easier for people to remember DNS names than full IP addresses of web sites. DNS, which is used for resolving DNS names into IP addresses and vice versa, can be seen as a database of host information. DNS is widely used by all Internetworking applications, such as web browsers, email, and so on.
DNS has been arranged in a hierarchical naming scheme, known to us as domain names. It functions with a top-down method, where a query begins at the top of the DNS tree and works its way to an endpoint. At the top of this hierarchy (called the “root”) are root servers. Thirteen root servers (logical, not physical) form the top of the DNS tree. The names of these root servers start with the letters A–M, all in the domain root-servers.net.
The next level on the tree is known as the top-level domain (or TLD), which is the label to the right of a domain name delineated by a period. There are two types of TLDs: country code (ccTLDs) and generic (gTLDs). A ccTLD may consist of .uk, .us, .za, or .il, for example. A gTLD may consist of .com, .org, .net, .edu, .mil, and so forth.
Each label to the left of the TLD is then technically a subdomain, until the end is reached and we actually have a full host name description. With that said, the label/subdomain immediately to the left of the TLD is also referred to as the second-level domain. The second-level domain is usually the core of the name, for example, “google,” “syngress,” or “elsevier.” These second-level domains are registered by registrars accredited by the Internet Corporation for Assigned Names and Numbers (ICANN).
ICANN is the decisive authority for domain name assignments, but in 1999 the concept of a Domain Name Registrar was introduced. A registrar is a commercial company, accredited by ICANN to sell domain names. More than 2000 different registrars are accredited and in operation today. Each maintains registration information for the registered domains it manages and makes this information available in the manner and format it chooses.
The decentralization of domain name registration in 1999 has significant implications for the penetration tester. In essence, it means that there is no single location for obtaining information about a given domain, no way of precisely determining where a domain name is registered, and no way of enumerating the domains registered to a single entity. Collectively, this radically reduces the usefulness of the system to the penetration tester. This specifically relates to second-level domain names and not IP address allocations.
Once a domain name has been purchased from a registrar, the owner of the second-level domain can then create as many subdomains as he likes under his domain name. These can be individual hosts or actual subdomains which further segment the owner's name space.
Let's look at a typical DNS request, ignoring DNS caching servers for now. A user opens his or her web browser and types
www.google.com. The machine requests a DNS query from the local DNS server. In theory, the local DNS server first visits one of the root servers and requests the addresses of the TLD servers for the .com domain. The root server will then reply with addresses of the .com TLD servers, to which the local DNS server will go to request the IP address of
google.com. The local DNS server then requests from the
google.com name server the final address of
www.google.com and is returned the address 74.125.95.103. The local DNS server then informs your browser of the address to use and begins to download the first page presented on
www.google.com. Of course, all of this takes place within seconds. This is illustrated in the diagram shown in
Fig. 2.11.
Two key components are used from the domain name space: name servers, and resolvers. A resolver, which functions as a client-side-based tool, will make a DNS request to a name server. The name server will return either the requested information or an address of another name server until the DNS query is resolved. If the DNS name cannot be resolved, an error message will be returned.
Asynchronous Full Transfer Zone requests, which are also known as AXFR or zone transfers, are another type of DNS transaction. Zone transfers are typically used to replicate DNS data across a number of DNS servers or to back up DNS files. A user or server will perform a specific zone transfer request from a name server. If the name server allows zone transfers to occur, all the DNS names and IP addresses hosted by the name server will be returned in human-readable ASCII text.
A DNS database is made up of various types of records, some of which are listed in
Table 2.2.
Table 2.2 Different Types of DNS Records
DNS Record Type | Description |
---|
A | A host's IP address. An address record allowing a computer name to be translated into an IP address. Each computer must have this record for its IP address to be located. |
MX | Host or domain's mail exchanger(s). |
NS | Host or domain's name server(s). |
CNAME | Host's canonical name allowing additional names or aliases to be used to locate a host. |
SOA | Indicates authority for the domain. |
SRV | Service location record. |
RP | Responsible person. |
PTR | Pointer to a canonical name. Usually used for reverse lookups. |
TXT | Generic text record. |
HINFO | Host information record with CPU type and operating system. |
When a resolver requests data from a name server, the DNS returned information may include any of the fields in
Table 2.2.
Sometimes we need to find the DNS name of an IP address, so we perform a reverse lookup query. This will work exactly the same way as a forward lookup,
whereby the resolver will query a name server for a DNS name by supplying the IP address. If the DNS name can be resolved for the IP address, the name server will return the name to the end-user. If not, an error message will be displayed.
DNS will be the key technology used during footprinting. It's a generally well-understood technology and therefore doesn't need much more discussion here. Please note the sidebar on DNS tips, however, as it contains some critical pointers.
TIPHere are some tips to help you get the most out of DNS during the footprinting and verification phases of the attack:
• We use DNS as a bridge between the real world and the cyberworld because it is so ideally positioned for this purpose. However, remember that DNS is a completely unregulated environment, so DNS entries may only ever serve as pointers toward your targets. Fake entries, stale entries, incorrect DNS entries, and entries that point to hosts that can't be reached from the Internet are all commonly found during a penetration test. The verification phase is therefore needed to double-check the findings of your DNS searches.
• Location, location, location! Be sure that you know which server is being used to handle your queries, and that it's the ideal server for the domain you're examining. Remember that by default your DNS query client will be configured to use your local resolver, which may be unsuitable for the queries you're making. Remember also that some ISPs will grant their own clients more DNS privileges than users with “outside” IP addresses. This is especially true for zone transfers, which are sometimes blocked to external users but are allowed to clients of the ISP. It's therefore often worth retrying your queries from a different IP address.
• Understand zone transfer security. Zone transfers are often restricted. However, this is done per name server and is based on source IP address. Thus, where zone transfer requests fail at one server, you will sometimes succeed by changing your location, or simply by trying another server.
• Understand the difference between forward and reverse queries. Forward and reverse DNS queries are not just flipsides of the same coin. The queries are in fact made against two completely separate databases, residing in different zone files, possibly residing on different servers and managed by different organizations. Thus, there is very little reason to expect forward and reverse DNS entries to correlate. The forward DNS zone is typically managed by the domain name owner, whereas the reverse zone is usually managed by the IP subnet owner. Now observe this little gem of logic: If the forward entry and the reverse entry for a given host are the same (or even similar), this suggests that the subnet owner
=
the domain owner, which in turn suggests very strongly that the IP in question is, in fact, associated with the domain we're targeting and hence with our target. This simple yet powerful logic is applied extensively when we use DNS reverse walks during the verification phase of reconnaissance.
2.4.1.2. WHOIS
WHOIS is a protocol for submitting queries to a database for determining the owner of a domain name, an IP network, or an Autonomous System Number (ASN). The information returned by WHOIS contains the owner information which may include email addresses, contact numbers, street addresses, and other relevant metadata. WHOIS is a popular informational protocol service that runs on port 43. When a user issues a WHOIS query to the server, the server accepts the connection and then responds to the query issued by the user and closes the connection.
The information returned by the WHOIS server is formatted in plain ASCII human-readable text. However, as WHOIS servers all over the Internet are administrated and maintained by different organizations, information returned to end-users may vary from server to server. Information returned and functionality may also vary between different WHOIS clients, as some servers may support different client-side flags.
WHOIS proxies are used as a mediator between a WHOIS client and a WHOIS server and typically run over HTTP/HTTPS, meaning that if a client were behind a firewall that rejects direct connections to port 43, a client could possibly access a WHOIS proxy on the Internet using a browser via HTTP. By using a WHOIS proxy, the user never has to be aware of the different WHOIS servers it may have to contact for different lookups. Instead, the proxy will handle determining which server it will need to contact to successfully complete the query automatically. In some cases, WHOIS proxies are even set up to cache data to minimize network traffic and speed delivery of results.
Almost all WHOIS services (servers and proxies) have mechanisms in place to prevent data mining. These restrictions are generally intended to prevent the collection of data for spam and so forth, but they unfortunately also limit the usefulness of WHOIS for intelligence gathering. The lack of standards and centralization among WHOIS services further limits its usefulness and makes it a less than 100% reliable tool.
2.4.1.4. Domain name registries and registrars
If WHOIS is the protocol over which information about DNS domain registration can be queried, the DNS Registry is the organization responsible for registering that domain in the first place, collecting and maintaining information about the registered owner, and making that information available to the Internet in general.
A single registry is typically responsible for one Generic Top-Level Domain (gTLD) such as .com or a Country Code Top-Level Domain (ccTLD) such as .za. This authority is delegated to the registry by IANA—the Internet Assigned Numbers Authority—which is responsible for ensuring that each gTLD has exactly one delegated owner. IANA oversees IP address, top-level domain, and IP code point allocations.
The registry is also responsible for operating the DNS servers for the given gTLD and for making its index available to the Internet using WHOIS or some other interface. The political structure of registries varies—some are governments, some are not-for-profit, and others are full commercial ventures.
2.4.1.5. SMTP
The Simple Mail Transfer Protocol (SMTP) is used for sending and receiving email between email clients and servers. When an SMTP server receives an email from a mail client, the SMTP server will then check the MX records for the domain in the email address in order to exchange the mail with the remote SMTP server. It will then either process the mail (if it is the MX server) or forward it to the appropriate SMTP server.
For SMTP to work properly, a set of MX records has to be defined within the name server's DNS database for the recipient's domain. An MX record has two
specific pieces of information—a preference number, and the DNS name of the mail server that's configured to handle mail for that domain. If there is more than one mail server for the domain, the SMTP server will choose one based on its preference number. The lowest number will have the highest priority, and based on availability, the SMTP server will work its way up from there.
One can view the headers of a received email to see the path the email traveled from client to server to destination endpoint. Each time an email is passed to and from an SMTP server, information regarding the server is recorded in the header.
Fig. 2.12 shows an example of an email header with SMTP server information using the RFC 2822 (
www.ietf.org/rfc/rfc2822.txt) format.
Once the local mail server receives the mail message, it is given an initial header (received by), which appears as:
Received: from [sending-host’s-name] [sending-host’s address] by [receiving-host’s-name]
[software-used]
with [message-ID]
for [recipient’s-address]; [date][time][time-zone-offset]
You can see two examples of such headers in
Fig. 2.12. The message then progresses through numerous mail relays where the message is given appended header information. The mail is eventually received by the recipient's mail server and is stored in the recipient's mail account (inbox), where the user downloads it. At this stage, the message has received a final header. Additional information given by the headers includes Message IDs, Multipurpose Internet Mail Extensions (MIME) version, and content type.
MIME is a standard for handling various types of data, and essentially it allows you to view mail as either text or HTML. Other MIME types are defined that enable mail to carry numerous attachment types. A message ID is assigned to a transaction by a particular host (the receiving host, or the “by” host). Administrators use these message IDs to track transactions in the mail server logs.
Mail headers are interesting to us because they show us where the mail servers are. In addition, they tend to deserve special attention because mail servers are usually where the people are, and that's usually right at the heart of the network. Mail servers are very seldom hosted outside the private network in larger organizations and thus represent an organization's core infrastructure to us.