Forget about computer security for a moment and imagine that you’re a big-time bank robber. If you were a robber in the days of the Wild West, you pretty much could have walked into a bank, broken the lock on the safe, and ridden off into the sunset. Nowadays, with electronic alarm systems, blast-proof vaults, and other security mechanisms, pulling off a bank heist is much harder. In fact, you’re almost guaranteed to get caught. No longer so bold, today’s professional bank robber tries to gather as much information about the target as possible before committing the actual crime: the number of security guards on duty, escape routes, and building blueprints, for example. The reason is simple: doing this homework ahead of time greatly increases the likelihood of success and, more important, decreases the likelihood of capture.
Attackers in the computer world behave much like their real-world counterparts. Before starting their attack, they use several different techniques and public sources of information to gather information about your organization. If the attackers have a clear roadmap of the network and know where the holes are, they will get in, do their damage, and be gone before you have a chance to answer your pager. In this chapter, we’ll explore a variety of techniques and public sources of information that attackers commonly use, and we’ll look at countermeasures you can use to hinder their information reconnaissance.
Information reconnaissance is the process of gathering information about a target without actually attacking the target itself. Attackers find this information useful because it gives them a good idea about your organization and where to focus their efforts. What exactly are they looking for? That’s a good question but a tough one to answer because it depends on what the attackers are after. Table 8-1 describes some common types of information attackers seek.
Table 8-1. Common Types of Information Sought by Attackers
Information | Why this information is useful to attackers |
---|---|
System configuration | Attackers love to know your system configuration because it helps them plan what they will need to try and what they won’t. For instance, if attackers know your organization is a Microsoft Windows shop, they (the smart ones, at least) won’t waste time trying Solaris-based attacks against you. Again, the quicker and more decisive the attack, the less likely it will be noticed. |
Valid user accounts | Knowing valid user account names and formats is useful to attackers during brute-force attacks. (See Chapter 15.) Having this type of information in hand saves them wasting time guessing passwords against invalid user names. |
Contact information | A business phone number has great use to an attacker during a war dialing attack. (See Chapter 13.) Actual employee names are also useful to attackers for establishing context during social engineering attacks. (See Chapter 23.) |
Extranet and remote access servers | Extranets and remote access servers give employees and business partners access to internal resources when working offsite. Accounts belonging to business partners are notorious for weak passwords. Attackers know this and try to take advantage of it. |
Business partners and recent acquisitions or mergers | When businesses merge, their computer networks are often merged too. This is a complex task, and if it isn’t done carefully, it can create vulnerabilities in the resulting network, which become easy entry points for attackers. |
Again, the type of information an attacker will find useful depends on what he’s after, so don’t consider Table 8-1 to be definitive.
It is difficult to protect your organization against information reconnaissance because the information gathered by attackers is in the public arena. So how do you stop an attacker from using information that is intended to be public in the first place? The answer is you can’t, but a good rule of thumb is this: only disclose information about your organization when there is a good business reason for doing so. Everything else—keep it hidden!
A good starting point for an attacker who wants to find basic information about your organization, such as physical location, contact information, and supporting Internet infrastructure, is to contact the registrar with which your organization has registered its domain name. The information available in registrar records can be useful in social engineering and war dialing attacks, so the information recorded at the registrar should be reviewed regularly.
If your organization is a U.S. government (.gov and .fed) or military (.mil) body, you can look up your registrar information at http://www.dotgov.gov/whois.html and http://whois.nic.mil, respectively, and skip these steps.
To retrieve your organization’s registrar information, follow these steps:
Visit the InterNIC Whois search interface at http://www.internic.com/whois.html. In the search field, enter your organization’s domain name—for example, Microsoft.com—and click Submit.
The search results will indicate the registrar with which your organization registered its domain name. In the Microsoft.com example, the registrar responsible for this domain name is Network Solutions, Inc. (See Figure 8-1.)
Visit the Web page of your organization’s registrar, and follow the instructions on performing a whois query against your domain name. You’ll see something like the Web page shown in Figure 8-2.
Your registrar’s database holds a lot of information an attacker can use. Your organization’s physical location, for example, can be used by attackers in war driving attacks. Your organization’s phone numbers can be used in war dialing attacks. See the problem here? The registrar needs this information, but your better sense tells you not to let the attacker have it. Well, if you can’t keep the information from appearing at the registrar, you at least have control over the details that get exposed:
E-mail addresses and contact names. Instead of using real names like "John Doe," use something based on job roles like "Technical Admin." Instead of using valid e-mail addresses, use mail forwarding accounts such as [email protected]. It’s easier to maintain, especially when people leave jobs, and it keeps your attackers guessing for a while.
Telephone numbers. Use numbers that are not positioned in your primary telephone ranges. For example, if your company phone lines sit in a range like 123-456-0000 to 123-456-9999, purchase a phone number like 123-789-1111 that sits outside that range and list that in your registrar records. A 1-800 number is also suitable.
Physical addresses. Use post office boxes instead of headquarters addresses to thwart casual dumpster-divers.
Unnecessary information. Provide only the minimum amount of information about your organization required by your registrar. Anything extra such as business hours or names of business partners should be removed unless absolutely required.
All these countermeasures are meant to force attackers to do more research to get at the real goods in the hope that they will give up in the process. At the very least, if you can keep them guessing, that’s more time you have to detect their activities during their attack, and if that’s the case, you’ve made real gains.
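Part of this registrar review can be scripted. The following is a minimal sketch rather than a definitive tool. It assumes you have saved your registrar's whois output as text and that the record uses simple "Field: value" lines; real registrar formats vary. The field names, the sample record, the list of role words, and the `main_phone_prefix` parameter are all hypothetical.

```python
import re

# Hypothetical role-based words considered acceptable in contact fields.
ROLE_WORDS = ("admin", "hostmaster", "technical", "billing", "support")


def audit_whois_record(record_text, main_phone_prefix):
    """Return warnings for a whois record given as 'Field: value' lines.

    main_phone_prefix holds the leading digits of the organization's primary
    phone range (for example '123456'); numbers inside it are flagged.
    """
    warnings = []
    for line in record_text.splitlines():
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        lowered_field = field.lower()
        # Contact fields should carry a job role, not a person's name.
        if lowered_field.endswith("contact"):
            if not any(word in value.lower() for word in ROLE_WORDS):
                warnings.append(f"{field}: '{value}' looks like a personal name")
        # Phone numbers should sit outside the primary telephone range.
        if "phone" in lowered_field:
            digits = re.sub(r"\D", "", value)
            if digits.startswith(main_phone_prefix):
                warnings.append(f"{field}: {value} is inside the primary phone range")
        # Postal addresses should be post office boxes where possible.
        if "address" in lowered_field and "email" not in lowered_field.replace("-", ""):
            if "box" not in value.lower():
                warnings.append(f"{field}: consider listing a post office box")
    return warnings


# Hypothetical record used only for illustration.
SAMPLE = """\
Domain Name: example.com
Admin Contact: John Doe
Admin Phone: 123-456-7890
Tech Contact: Technical Admin
Tech Phone: 123-789-1111
Postal Address: 1 Example Way, Anytown
"""

for warning in audit_whois_record(SAMPLE, "123456"):
    print(warning)
```

With the sample record, the sketch flags the personal name, the phone number inside the primary range, and the street address, and it leaves the role-based contact and the out-of-range number alone. Treat its output as a prompt for manual review, not as a verdict.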
Just as our bank robbers need to know the location of the bank they are trying to rob, computer attackers also need to know the location of the target they will be attacking. A useful resource to determine this is the American Registry for Internet Numbers (ARIN) database, which can be found at http://www.arin.net. This public database provides a Web-based front end that allows users (and attackers) to determine, based on company or domain name, organizational IP network block assignments for North America, parts of the Caribbean, and subequatorial Africa. For other locations around the world, consult Table 8-2.
Table 8-2. Whois Databases and Regions
Region or organization | Whois database |
---|---|
North America, parts of the Caribbean, and subequatorial Africa | American Registry for Internet Numbers (ARIN), http://www.arin.net |
Europe | Réseaux IP Européens Network Coordination Centre (RIPE NCC), http://www.ripe.net |
Asia | Asia Pacific Network Information Center (APNIC), http://www.apnic.net |
Latin America and the Caribbean | Latin America and Caribbean Internet Address Registry, http://www.lacnic.net |
Before you begin conducting penetration tests against your organization, you need to determine which IP network blocks to test. The next section shows you how to do that.
To determine your organization’s IP network block assignment, follow these steps:
Visit the whois database that corresponds to your region. For instance, if your organization is Microsoft Corporation, you would go to the ARIN WHOIS database at http://www.arin.net.
Enter the name of your organization in the Whois search field, as shown in Figure 8-3.
The output returned by the whois search will show the valid IP network blocks assigned to your organization. (See Figure 8-4.)
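Whois output often lists a network block as a first and last address. Before you test, it helps to turn that range into CIDR blocks and to confirm that each target address is really in scope. Here is a small sketch using Python's standard `ipaddress` module. The addresses come from the documentation address space and are placeholders, not anyone's real assignment, and the function names are illustrative.

```python
import ipaddress


def range_to_cidrs(first_ip, last_ip):
    """Convert a whois-style 'first - last' address range into CIDR blocks."""
    first = ipaddress.ip_address(first_ip)
    last = ipaddress.ip_address(last_ip)
    return list(ipaddress.summarize_address_range(first, last))


def in_scope(address, blocks):
    """Return True if address falls inside any of the given network blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in blocks)


# Placeholder range from the documentation address space (RFC 5737).
blocks = range_to_cidrs("192.0.2.0", "192.0.2.255")
print(blocks)                            # [IPv4Network('192.0.2.0/24')]
print(in_scope("192.0.2.17", blocks))    # True
print(in_scope("198.51.100.5", blocks))  # False
```

Checking scope this way before a penetration test reduces the chance of probing an address that belongs to someone else.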
The whois database for each region is maintained as a public resource. This means that at any time any individual, including attackers, can access and utilize the information stored in these databases. Your best defense is to conduct regular security audits of each IP network block listed for your organization along with internal segments and find out where you’re vulnerable before someone else does. Remember: if you don’t do this, some attacker out there will be more than happy to do it for you.
A handy source of information for attackers is your organization’s own website. Only information you can safely disclose to the public should be sitting on your Web servers. Not every organization follows policies like this one, and even if yours does, mistakes do happen and occasionally nonpublic information finds its way out into the public in the form of Web content. Attackers know this, and in fact they are counting on it. Information commonly found on websites that might be useful to attackers includes the following:
System and network configurations
Valid user accounts
Contact information such as telephone numbers and e-mail addresses
Extranet and remote access servers
Business partners and recent acquisitions or mergers
Confidential information embedded in HTML page sources
Weaknesses in your organization’s security policies and processes
If you absolutely need to distribute nonpublic information over public channels such as the Internet or some other external means, you should do so on a server that does not allow anonymous access. You should also review the justification for putting nonpublic information on an external server and the consequences if that information were to be leaked. What happens if the server is compromised or misconfigured?
The two methods of reviewing Web server content are manual and automated reviews. Neither of these methods is intended as a replacement for the other, so you should use both methods when you conduct your reviews.
Here’s the old-fashioned way of reviewing Web server content—that is, doing it manually. You can manually review your website in the following ways:
Explore your organization’s website with a browser, and look for inappropriate content.
Inspect the HTML source of each Web page. Look for hidden input tags containing passwords and leaky HTML comment tags. If you’re using Internet Explorer, you can easily do this on the Web page you want to inspect by selecting View from the toolbar and then Source. Notepad.exe will open loaded with the source code of the Web page, as shown in Figure 8-5. Next, search for strings such as "input" and "<!--" to ensure no passwords or inappropriate comments are embedded.
Leverage embedded search engines. If your website has an embedded search engine, try searching for keywords such as password, configuration, administrator, user name, firewall, confidential, .doc, .vsd, and .xls to see what gets returned. Try searching on the code name of confidential projects being developed in your organization but not yet released to the public; you might be surprised at what you get back.
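Inspecting page source by eye gets tiresome after a few pages. As a rough illustration of the same check, the sketch below uses Python's standard `html.parser` module to pull hidden input fields and HTML comments out of a page so you can read just those. The class name, the function name, and the sample page are made up for this example.

```python
from html.parser import HTMLParser


class LeakFinder(HTMLParser):
    """Collect hidden <input> fields and HTML comments from a page."""

    def __init__(self):
        super().__init__()
        self.hidden_inputs = []  # (name, value) pairs
        self.comments = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        input_type = (attributes.get("type") or "").lower()
        if tag == "input" and input_type == "hidden":
            self.hidden_inputs.append(
                (attributes.get("name"), attributes.get("value"))
            )

    def handle_comment(self, data):
        self.comments.append(data.strip())


def find_leaks(html_text):
    """Return (hidden_inputs, comments) found in html_text."""
    finder = LeakFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hidden_inputs, finder.comments


# Made-up page used only for illustration.
PAGE = """<html><body>
<!-- TODO: remove test account before launch -->
<form action="/login" method="post">
  <input type="text" name="user">
  <input type="hidden" name="dbpassword" value="example-only">
</form>
</body></html>"""

hidden, comments = find_leaks(PAGE)
print(hidden)    # [('dbpassword', 'example-only')]
print(comments)  # ['TODO: remove test account before launch']
```

The sketch only surfaces candidates. You still have to judge whether each hidden field or comment gives away something it shouldn't.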
Reviewing the source code of websites can be tedious and prone to error, especially in the case of a large enterprise website. An easy way to find instances of potentially sensitive information in Web content is to use a Windows utility named Findstr.exe along with regular expressions.
A regular expression is a concise way to represent a string pattern for search-and-replace operations. This might sound a little intimidating, but you’ve probably used something very similar without even knowing it. For example, does "dir *.html" look familiar? Strictly speaking, "*.html" is a wildcard pattern rather than a regular expression, but the idea is the same: a short pattern that matches any file with an ".html" extension. More detail about regular expressions and their uses in Findstr.exe can be found at http://www.microsoft.com/windowsxp/home/using/productdoc/en/default.asp?.
Here’s how to use Findstr.exe with regular expressions to search for sensitive information:
Have your Web administrator make you a copy of the website you want to review. If you are using Internet Information Server (IIS), the Web root is typically found in %SYSTEMDRIVE%\InetPub\wwwroot on the file system. Place Web root files in a directory on your system. This example uses C:\WebReview.
From a command prompt, search recursively through the C:\WebReview folder for patterns you want to find using Findstr.exe:
findstr.exe /s /n /i <regular_expression> <target_directory>\*
Replace <regular_expression> with the pattern you’re trying to find and <target_directory> with the directory where you placed your Web root files. In our example, <target_directory> is C:\WebReview. For instance, if you are searching for e-mail addresses, you could use the following command:
findstr.exe /s /n /i ".[@]"microsoft.com <target_directory>\*
Figure 8-6 shows an example of the results.
Review each instance that Findstr.exe returns for potentially sensitive information. If no business reason exists for this material to be publicly available, you should remove it from production Web servers.
Table 8-3 shows common information to look for when reviewing your website content and corresponding regular expressions.
Table 8-3. Regular Expressions for Common Information Sought by Attackers
Information type | Regular expression |
---|---|
Telephone numbers (example: 123.456.7890 or 123-456-7890) | "[0-9][0-9][0-9][.-][0-9][0-9][0-9][.-][0-9][0-9][0-9][0-9]" |
Telephone numbers (example: (123)456-7890) | "([0-9][0-9][0-9])[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]" |
Valid e-mail addresses | ".[@]"yourdomain.yourtopleveldomain (example: ".[@]"Microsoft.com) |
HTML form input tags | "<input.*>" |
HTML comments | "<!--" |
Occurrences of keywords such as password, pwd, secret, or confidential | "password" "pwd" "secret" "confidential" |
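If you'd rather run every pattern in one pass, or you aren't on Windows, the same idea can be sketched in Python. This is an illustrative equivalent, not a replacement for Findstr.exe. The patterns are Python-syntax approximations of the ones in Table 8-3, `example.com` stands in for your own domain, and the function name is hypothetical.

```python
import os
import re

# Python approximations of the Table 8-3 patterns; the domain is a placeholder.
PATTERNS = {
    "phone (123.456.7890)": re.compile(r"[0-9]{3}[.-][0-9]{3}[.-][0-9]{4}"),
    "phone ((123)456-7890)": re.compile(r"\([0-9]{3}\)[0-9]{3}-[0-9]{4}"),
    "e-mail address": re.compile(r"\S+@example\.com", re.IGNORECASE),
    "form input tag": re.compile(r"<input.*>", re.IGNORECASE),
    "HTML comment": re.compile(r"<!--"),
    "keyword": re.compile(r"password|pwd|secret|confidential", re.IGNORECASE),
}


def scan_tree(root):
    """Yield (path, line_number, label, line) for each match under root."""
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            # errors="replace" keeps the scan going on files that are not UTF-8.
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    for label, pattern in PATTERNS.items():
                        if pattern.search(line):
                            yield path, number, label, line.rstrip()


# Example use against the review copy described in the text:
# for path, number, label, line in scan_tree(r"C:\WebReview"):
#     print(f"{path}:{number}: [{label}] {line}")
```

As with Findstr.exe, expect false positives. Every hit still needs a person to decide whether the material belongs on a public server.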
There is nothing you can do to stop attackers from trolling your organization’s public Web pages for valuable tidbits of information. What you can control, however, is the type of information published on those pages, by doing the following:
Create a policy for the type of information that is allowed to reside on your Web servers. Make a point of disallowing information on system configurations and other sensitive information related to your organization. It’s also a good idea to review any existing policies.
Review the content on your Web servers using both manual and automatic techniques on a regular basis and remove any materials not suitable for a public audience.
Search engines are a great tool for finding information about a subject. For example, if you want to find information about your favorite rock band, you will probably use a search engine. If an attacker wants to find more information about your organization, he will use a search engine. Simple searches against your organization’s name on public search engines can return a wealth of information useful to attackers, such as business partners and mergers, host information, and mailing lists.
Reviewing your organization’s website can be done using one or more search engines. Try searching for information related to your organization with valid user names and keywords such as password, configuration, firewall, confidential, .doc, .vsd, and .xls, and see what gets returned. Also search against confidential project code names, internal lingo used at your organization, user names, and other information uniquely related to your organization.
Depending on the search engine you’re using, several tricks can narrow the scope of the results you get. On MSN and Google, for instance, you can limit your search results to a specific domain by appending the string "site," followed by a colon and the domain you want to search. So, if you want to search for information about the .NET Framework but want to see results only from http://www.microsoft.com, you can enter .NET Framework site:microsoft.com as your search string, which results in the list shown in Figure 8-7.
Each search engine has a different set of tricks and keywords you can use to narrow your results. These keywords are normally discussed in the advanced search or help section for each search engine.
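To keep a review repeatable, you might generate your list of search strings instead of typing them ad hoc. The short sketch below builds one query per keyword using the site: restriction described above. The keyword list comes from this chapter, while the function name, the `example.com` domain, and the project code name are invented for illustration.

```python
# Keywords suggested in this chapter; extend the list to suit your organization.
KEYWORDS = ["password", "configuration", "firewall", "confidential"]


def build_queries(domain, keywords=KEYWORDS, extra_terms=()):
    """Return one 'term site:domain' query string per keyword or extra term."""
    terms = list(keywords) + list(extra_terms)
    return [f"{term} site:{domain}" for term in terms]


# "Project Bluebird" is an invented code name.
for query in build_queries("example.com", extra_terms=['"Project Bluebird"']):
    print(query)
```

Paste each resulting string into the search engines you are reviewing. Operators other than site: differ between engines, so check each engine's help pages before relying on them.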
Finally, be sure to use a variety of search engines for your review. Different search engines yield different, and sometimes very fruitful, results. Table 8-4 lists popular search engines you should use.
Table 8-4. Popular Search Engines
Search engine | Location |
---|---|
About | |
AltaVista | |
Ask Jeeves | |
LookSmart | |
Lycos | |
MSN | |
Overture | |
Teoma | |
Yahoo | |
You should also consider the following resources as you search the Internet for information about your organization:
Edgar Online Inc. Attackers often use the Edgar Online site at http://www.edgar-online.com to perform research against companies that have recently merged, been acquired, or are in some state of flux. These types of companies represent easy prey for attackers because differences in the security practices of the joining companies often create insecure environments.
Google cache. As Google searches the Internet, it creates cached copies of the Web pages it sees. The problem with this is that if your company has ever had an information leak on its website, and the leak was fixed, chances are the leak is still available in a cached copy of your Web page. You should check cached copies of your organization’s Web pages as part of your penetration tests. To do this, do a regular search of your organization on Google. The results returned will include a "Cached" link that you can click to view cached copies, as shown in Figure 8-8. If you find something about your organization that you don’t want to have publicly available, you can ask Google to remove it.
Internet Archive. The Internet Archive Wayback Machine at http://www.archive.org provides similar results to the Google caching feature. Be sure to include this resource in your penetration tests.
Though you cannot control information about your organization on other Web servers, such as information hosted by competitors, news wires, and financial institutions, you can control the content on your own Web servers. Here, the countermeasures against search engines are the same as those listed for attackers combing through your website: create a policy for the type of information allowed to reside on your Web servers, and regularly review their content using both manual and automated techniques.
Public discussion forums such as the ones found in newsgroups on Usenet or in Internet Relay Chat (IRC) channels are a great way to share information. Through these discussion forums, you can learn where your favorite band is playing, for example, or share your point of view on virtually any topic and post questions about problems you can’t solve. It’s this last item, employees posting questions, that you need to worry about. Employees posting troubleshooting questions represent a real risk to your organization’s security because these posts often expose sensitive information about the systems and networks in your organization. The following is a post similar to those commonly seen in a Usenet newsgroup:
From: [email protected] ([email protected])
Subject: Firewall configuration problems, help!!!
Newsgroups: alt.fake.newsgroup.foobar.info
Date: 2002-12-17 04:57:59 PST

Hi, I am having trouble configuring our firewalls to block SQL traffic. Can someone tell me which ports/protocols I should be blocking? Thanks in advance!
These types of posts are particularly dangerous. (Alarm bells in your head should be ringing right now.) An attacker reading this request for help now knows that one or more of your firewalls is not filtering SQL traffic and might try to compromise your network by using SQL-based attacks. As suggested by Ed Skoudis in Counter Hack (Prentice Hall PTR, 2001), more insidious attackers might even reply with false information in hope that the poster will blindly follow their advice, resulting in a weakened security posture.
Determining the level of information about your organization leaked to the Internet is a big task. It essentially means that you need to comb the entire Internet, looking in every chat room and every newsgroup. Here are some services and techniques you can use to take a snapshot of your exposure:
Third-party intelligence services. Companies such as iDefense (http://www.idefense.com) offer Internet monitoring services that review public discussion forums and other sources of information for materials relating to your organization and provide regular reports based on custom intelligence collection plans.
Searching newsgroups. A quick and easy way to get a snapshot of your organization’s current exposure on newsgroups is to use online newsgroup search engines. One such engine, shown in Figure 8-9, is found at Google Groups, http://groups.google.com. By entering search strings such as @yourorganizationsname.com or the name of your organization into the search field, you can get archives of newsgroups discussions associated with your organization. You can then review the results for the presence of inappropriate material.
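Once you have a pile of search results, a simple filter can help you decide which posts to read first. The sketch below is one rough way to triage: it flags posts whose From line uses your domain and whose text mentions terms that tend to accompany configuration details. The term list, the function name, and the sample post are assumptions made for this example, and a keyword match is only a hint that a post deserves a closer look.

```python
import re

# Illustrative terms that often appear alongside system or network details.
RISKY_TERMS = ("firewall", "password", "configuration", "router", "sql", "vpn")


def flag_post(post_text, org_domain):
    """Return the risky terms found in a post sent from org_domain, else []."""
    sender = re.search(
        r"^From:.*@" + re.escape(org_domain) + r"\b",
        post_text,
        re.IGNORECASE | re.MULTILINE,
    )
    if not sender:
        return []
    lowered = post_text.lower()
    return [term for term in RISKY_TERMS
            if re.search(r"\b" + term + r"\b", lowered)]


# Made-up post modeled on the example earlier in this section.
POST = """From: someone@example.com
Subject: Firewall configuration problems, help!!!

I am having trouble configuring our firewalls to block SQL traffic.
"""

print(flag_post(POST, "example.com"))  # ['firewall', 'configuration', 'sql']
print(flag_post(POST, "example.org"))  # []
```

A filter like this will miss posts made from personal accounts, so it complements the name-based searches described above rather than replacing them.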
The best countermeasure to having your organization’s sensitive information exposed on public discussion forums is simple: don’t have it there in the first place! Of course, this is easier said than done, so here are some steps you should follow:
Policy. Create a policy defining what types of materials are allowed to be discussed on public discussion forums. Prohibit information that could compromise your organization, such as firewall rules and network configurations.
Vendor product support. It’s always best to accept product support supplied directly from the vendor or from trusted support sources only. Product patch solutions—that is, solutions requiring a product binary upgrade—should be obtained directly from vendors or trusted support sources only.
Scrutinizing support. If you do use technical support from public discussion forums, be wary of it. Scrutinize and test the advice thoroughly in isolated test environments before applying changes to production environments. You just never know who you’re getting your advice from.
Prohibit and block public discussion forums. Prohibit the use of certain or all public discussion forums in policy. To help enforce your policy, you can enable preventive measures such as blocking inbound and outbound traffic to those forums. Some common public discussion forums are listed in Table 8-5.
Table 8-5. Common Public Discussion Forums and Ports
Public discussion forum | Port to block/protocol |
---|---|
Network News Transfer Protocol (NNTP), used by Usenet | 119/TCP and UDP |
Internet Relay Chat (IRC) | 194/TCP and UDP (client); 529/TCP and UDP (server); 994/TCP and UDP (IRC over SSL/TLS) |
Web-based interfaces | Block URL (example: http://groups.google.com) |