CHAPTER 2

Reconnaissance: Information Gathering for the Ethical Hacker

In this chapter, you will

•   Define active and passive footprinting

•   Identify methods and procedures in information gathering

•   Understand the use of social networking, search engines, and Google hacking in information gathering

•   Understand the use of whois, ARIN, and nslookup in information gathering

•   Describe the DNS record types

I was watching a nature show on TV a couple nights back and saw a lion pride hunt from start to finish. The actual end was totally awesome, if a bit gruesome, with a lot of neck biting and suffocation, followed by bloody chewing. But the buildup to that attack was different altogether. In a way, it was visually…boring. But if you watched closely, you could see the real work of the attack was done before any energy was used at all.

For the first three quarters of the program, the cameras focused on lions just sitting there, seemingly oblivious to the world around them. The herds of antelope, or whatever the heck they were, saw the lions, but also went about their merry business of pulling up and chewing on grass. Every so often the lions would look up at the herd, almost like they were counting sheep (or antelope) in an effort to nap; then they’d go back to licking themselves and shooing away flies. A couple times they’d get up and stroll aimlessly about, and the herd would react one way or another. Late in the show, one camera angle across the field got a great shot of a lion turning from its apathetic appearance to focusing both eyes toward the herd—and you could see what was coming. When the pride finally went on the attack, it was quick, coordinated, and deadly.

What were these animals doing? In effect (and, yes, I know it’s a stretch here, but just go with it) they were footprinting. They spent the time figuring out how the herd was moving, where the old and young were, and the best way to split them off for easy pickings. If we want to be successful in the virtual world we find ourselves in, then we’d better learn how to gather information about targets before we even try to attack them. This chapter is all about the tools and techniques to do that. And for those of you who relish the thought of spy-versus-spy and espionage, you can still learn a whole lot through good-old legwork and observation, although most of this is done through virtual means.

Footprinting

Gathering information about your intended target is more than just a beginning step in the overall attack; it’s an essential skill you’ll need to perfect as an ethical hacker. I believe what most people wonder about concerning this particular area of our career field comes down to two questions: What kind of information am I looking for, and how do I go about getting it? Both are excellent questions (if I do say so myself), and both will be answered in this section. As always, we’ll cover a few basics in the way of the definitions, terms, and knowledge you’ll need before we get into the hard stuff.

You were already introduced to the term reconnaissance in Chapter 1, so I won’t bore you with the definition again here. I do think it’s important, though, that you understand there may be a difference in definition between reconnaissance and footprinting, depending on which security professional you’re talking to. For many, recon is more of an overall, overarching term for gathering information on targets, whereas footprinting is more of an effort to map out, at a high level, what the landscape looks like. They are interchangeable terms in CEH parlance, but if you just remember that footprinting is part of reconnaissance, you’ll be fine.

During the footprinting stage, you’re looking for any information that might give you some insight into the target, no matter how big or small. And it doesn’t necessarily need to be technical in nature. Sure, things such as the high-level network architecture (what routers are they using, and what servers have they purchased?), the applications and websites (are they public-facing?), and the physical security measures in place (what type of entry control systems present the first barrier, and what routines do the employees seem to be doing daily?) are great to know, but you’ll probably be answering other questions first during this phase. Questions concerning the critical business functions, the key intellectual property, and the most sensitive information the company holds may very well be the most important hills to climb in order to recon the target organization appropriately and diligently.

Of course, anything providing information on the employees themselves is always great to have because the employees represent a gigantic target for you later in the test. Although some of this data may be a little tricky to obtain, most of it is relatively easy to get and is right there in front of you, if you just open your virtual eyes.

As far as footprinting terminology goes and getting your feet wet with EC-Council’s view of it, most of it is fairly easy to remember. For example, while most footprinting is passive in nature, takes advantage of freely available information, and is designed to be blind to your target, sometimes an overly security-conscious target organization may catch on to your efforts. If you prefer to stay in the virtual shadows (and because you’re reading this book, I can safely assume that you do), your footprinting efforts may be designed in such a way as to obscure their source. If you’re really sneaky, you may even take the next step and create ways to have your efforts trace back to anyone and anywhere but you.

NOTE    Giving the appearance that someone else has done something illegal is, in itself, a crime. Even if it’s not criminal activity you’re blaming on someone else, the threat of prison and/or a civil liability lawsuit should be reason enough to think twice about this.

Anonymous footprinting, where you try to obscure the source of all this information gathering, may be a great way to work in the shadows, but pseudonymous footprinting is just downright naughty, making someone else take the blame for your actions. How dare you!

EXAM TIP    ECC describes four main focuses and benefits of footprinting for the ethical hacker:

1. Know the security posture (footprinting helps make this clear).

2. Reduce the focus area (network range, number of targets, and so on).

3. Identify vulnerabilities (self-explanatory).

4. Draw a network map.

Footprinting, like everything else in hacking, usually follows a fairly organized path to completion. You start with information you can gather from the “50,000-foot view”—using the target’s website and web resources to collect other information on the target—and then move to a more detailed view. The targets for gathering this type of information are numerous and can be easy or relatively difficult to crack open. You may use search engines and public-facing websites for general, easy-to-obtain information while simultaneously digging through DNS for detailed network-level knowledge. All of it is part of footprinting, and it’s all valuable; just like an investigation in a crime novel, no piece of evidence should be overlooked, no matter how small or seemingly insignificant.

That said, it’s also important for you to remember what’s really important and what the end goal is. Milan Kundera famously wrote in The Unbearable Lightness of Being, “Seeing is limited by two borders: strong light, which blinds, and total darkness,” and it really applies here. In the real world, the only thing more frustrating to a pen tester than no data is too much data. When you’re on a pen test team and you have goals defined in advance, you’ll know what information you want, and you’ll engage your activities to go get it. In other words, you won’t (or shouldn’t) be gathering data just for the sake of collecting it; you should be focusing your efforts on the good stuff.

There are two main methods for gaining the information you’re looking for. Because you’ll definitely be asked about them repeatedly on the exam, I’m going to define active footprinting versus passive footprinting here and then spend further time breaking them down throughout the rest of this chapter. An active footprinting effort is one that requires the attacker to touch the device, network, or resource, whereas passive footprinting refers to measures to collect information from publicly accessible sources. For example, passive footprinting might be perusing websites or looking up public records, whereas running a scan against an IP you find in the network would be active footprinting. When it comes to the footprinting stage of hacking, the vast majority of your activity will be passive in nature. As far as the exam is concerned, you’re considered passively footprinting when you’re online, checking on websites, and looking up DNS records, and you’re actively footprinting when you’re gathering social engineering information by talking to employees.

NOTE    Here’s a CEH testing conundrum offered by our astute technical editor: What about websites designed to scan your target? There are plenty of sites out there that will scan a target for you, and while such a site is actively scanning your target, it’s not YOU actively scanning it.

Lastly, I need to add a final note here on footprinting and your exam, because it needs to be said. Footprinting is of vital importance to your job, but for whatever reason ECC just doesn’t focus a lot of attention on it in the exam. It’s actually somewhat disconcerting that this is such a big part of the job yet just doesn’t get much of its due on the exam. Sure, you’ll see stuff about footprinting on the exam, and you’ll definitely need to know it (we are, after all, writing an all-inclusive book here), but it just doesn’t seem to be a big part of the exam. I’m not really sure why. The good news is, most of this stuff is easy to remember anyway, so let’s get on with it.

Passive Footprinting

Before starting this section, I got to wondering about why passive footprinting seems so confusing to most folks. During practice exams and whatnot in a class I recently sat through, there were a few questions missed by most folks concerning passive footprinting. It may have to do with the term passive (a quick “define passive” web search shows the term denotes inactivity, nonparticipation, and a downright refusal to react in the face of aggression). Or it may have to do with some folks just overthinking the question. I think it probably has more to do with people dragging common sense and real-world experience into the exam room with them, which is really difficult to let go of. In any case, let’s try to set the record straight by defining exactly what passive footprinting is and, ideally, what it is not.

NOTE    Every once in a while, EC-Council puts something in the CEH study materials that seems contrary to real life. Many of us who have performed this sort of work know dang good and well what can and cannot get you caught, and we bristle when someone tells us that, for instance, dumpster diving is a passive activity. Therefore, do yourself a favor and just stick with the terms and definitions for your exam. Afterward, you can join the rest of us in mocking it. For now, memorize, trust, and go forth.

Passive footprinting as defined by EC-Council has nothing to do with a lack of effort and even less to do with the manner in which you go about it (using a computer network or not). In fact, in many ways it takes a lot more effort to be an effective passive footprinter than an active one. Passive footprinting is all about the publicly accessible information you’re gathering and not so much about how you’re going about getting it. Some methods include gathering of competitive intelligence, using search engines, perusing social media sites, participating in the ever-popular dumpster dive, gaining network ranges, and raiding DNS for information. As you can see, some of these methods can definitely ring bells for anyone paying attention and don’t seem very passive to common-sense-minded people anywhere, much less in our profession. But you’re going to have to get over that feeling rising up in you about passive versus active footprinting and just accept this for what it is—or be prepared to miss a few questions on the exam.

Passive information gathering definitely includes the pursuit and acquisition of competitive intelligence, and because it’s a direct objective within CEH and you’ll definitely see it on the exam, we’re going to spend a little time defining it here. Competitive intelligence refers to the information gathered by a business entity about its competitors’ customers, products, and marketing. Most of this information is readily available and can be acquired through different means. Not only is it legal for companies to pull and analyze this information, it’s expected behavior. You’re simply not doing your job in the business world if you’re not keeping up with what the competition is doing. Simultaneously, that same information is valuable to you as an ethical hacker, and there are more than a few methods to gain competitive intelligence.

NOTE    Ever heard of Attention Meter (www.attentionmeter.com)? It compares website traffic from hosts of different sources and provides traffic data and graphs on it.

The company’s own website is a great place to start. Think about it: What do people want on their company’s website? They want to provide as much information as possible to show potential customers what they have and what they can offer. Sometimes, though, this information becomes information overload. Just some of the open source information you can gather from almost any company on its site includes company history, directory listings, current and future plans, and technical information. Directory listings become useful in social engineering, and you’d probably be surprised how much technical information businesses will keep on their sites. Designed to put customers at ease, sometimes sites inadvertently give hackers a leg up by providing details on the technical capabilities and makeup of their network.

Several websites make great sources for competitive intelligence. Information on company origins and how it developed over the years can be found in places like the EDGAR Database (www.sec.gov/edgar.shtml), Hoovers (www.hoovers.com), LexisNexis (www.lexisnexis.com), and Business Wire (www.businesswire.com). If you’re interested in company plans and financials, the following list provides some great resources:

•   SEC Info (www.secinfo.com)

•   Experian (www.experian.com)

•   Market Watch (www.marketwatch.com)

•   Wall Street Monitor (www.twst.com)

•   Euromonitor (www.euromonitor.com)

NOTE    Other aspects that may be of interest in competitive intelligence include the company’s online reputation (as well as the company’s efforts to control it) and the actual traffic statistics of the company’s web traffic (www.alexa.com is a great resource for this). Also, check out finance.google.com, which will show you company news releases on a timeline of its stock performance—in effect, showing you when key milestones occurred.

Active Footprinting

When it comes to active footprinting, per EC-Council, we’re really talking about social engineering, human interaction, and anything that requires the hacker to interact with the organization. In short, whereas passive measures take advantage of publicly available information that won’t (usually) ring any alarm bells, active footprinting involves exposing your information gathering to discovery. For example, you can scrub through DNS usually without anyone noticing a thing, but if you were to walk up to an employee and start asking them questions about the organization’s infrastructure, somebody is going to notice. I have an entire chapter dedicated to social engineering coming up (see Chapter 12), but will hit a few highlights here.

NOTE    Social engineering is often overlooked in a lot of pen testing cycles, but honestly it’s an extremely effective footprinting method. Books like How to Win Friends and Influence People and The Art of Conversation are fantastic social engineering resources. You’d be surprised how much you can learn about a target by simply being nice, charming, and a good listener.

Social engineering has a variety of definitions, but it basically comes down to convincing people to reveal sensitive information, sometimes without even realizing they’re doing it. There are millions of methods for doing this, and it can sometimes get really confusing. From the standpoint of active footprinting, the social engineering methods you should be concerned about involve human interaction. If you’re calling an employee or meeting an employee face to face for a conversation, you’re practicing active footprinting.

This may seem easy to understand, but it can get confusing in a hurry. For example, I just finished telling you social media is a great way to uncover information passively, but surely you’re aware you can use some of these social sites in an active manner. What if you openly use Facebook connections to query for information? Or what if you tweet a question to someone? Both of those examples could be considered active in nature, so be forewarned.

EXAM TIP    This is a huge point of confusion on the exam, so let’s clear it up here: in general, social engineering is an active footprinting method (unless, of course, you’re talking about dumpster diving, which is defined as passive). What EC-Council is really trying to say is, social engineering efforts that involve interviewing (phone calls, face-to-face interactions, and social media) are active, whereas those not involving interviewing aren’t. In short, just memorize “dumpster diving = passive,” and you’ll be okay.
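If it helps to see the exam's view laid out as a lookup, here is a minimal sketch of the passive/active split described above. The activity names are illustrative labels chosen for this example, not an official EC-Council list, and the function name classify is hypothetical.

```python
# Exam-oriented classification of footprinting activities, following the
# chapter's description of EC-Council's definitions. The activity names
# are illustrative labels (an assumption), not an official list.

PASSIVE = {
    "search engine query",
    "reading the company website",
    "public records lookup",
    "dns lookup",
    "social media browsing",
    "dumpster diving",  # defined as passive for the exam
}

ACTIVE = {
    "phone call to employee",
    "face-to-face conversation",
    "social media direct question",
    "network scan",
}


def classify(activity: str) -> str:
    """Return 'passive', 'active', or 'unknown' for a named activity."""
    key = activity.strip().lower()
    if key in PASSIVE:
        return "passive"
    if key in ACTIVE:
        return "active"
    return "unknown"


if __name__ == "__main__":
    for name in ("Dumpster diving", "Phone call to employee", "DNS lookup"):
        print(f"{name}: {classify(name)}")
```

Anything that does not appear in either set comes back as "unknown," which is a reasonable reminder that the real world has more gray area than the exam does.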

Footprinting Methods and Tools

In version 10 of the exam, ECC continues putting a lot of focus on the tools themselves and not so much on the definitions and terms associated with them. This is really good news from one standpoint: those definitions and terms can get ridiculous, and memorizing the difference between one term and another doesn’t really do much in the way of demonstrating your ability as an actual ethical hacker. The bad news is, you have to know countless tools and methods just in case you see a specific question on the exam. And, yes, there are plenty of tools and techniques in footprinting for you to learn, both for your exam and your future in pen testing.

Search Engines

Ever heard of a lovebug? No, I’m not talking about some painted-up VW from the ’60s; I’m talking about the black bugs that stick together and fly around everywhere in the South at least twice a year. They’re a plague on all that is good and noble on the planet, and this year, they’re out in droves.

Yesterday somebody asked me if lovebugs serve a purpose—any purpose at all. If this had been back in my youth, I would’ve had to shrug and admit I had no idea. If I really wanted to know, my only recourse would be to go to the library and try to find it in a book (GASP! The HORROR!). Yesterday, I simply pulled out my smartphone and did what everyone else does—I googled it. Today, given five minutes and a browser, I sound like an entomologist, with a minor in Lifestyles of the Lovebug.

NOTE    You can google “lovebug lifestyles” yourself and discover the same useless facts I did. While you’re at it, though, try the other search engines—Bing, Yahoo!, DuckDuckGo, Baidu. Even AOL and Ask are still out there. It’s good practice for using these search engines to find information on your target later in testing. Whether or not lovebugs serve a purpose at all, I’ll leave to you, dear reader.

Pen testing and hacking are no different. Want to learn how to use a tool? Go to YouTube and somebody has a video on it. Want to define the difference between BIA and MTD? Go to your favorite search engine and type them in. Need a good study guide for CEH? Type it in and—voilà—here you are.

Search engines can provide a treasure trove of information for footprinting and, if used properly, won’t alert anyone you’re looking at them. Mapping and location-specific information, including drive-by pictures of the company exterior and overhead shots, are so commonplace now that people don’t think of them as footprinting opportunities. However, Google Earth, Google Maps, and Bing Maps can provide location information and, depending on when the pictures were taken, can show potentially interesting intelligence. Even personal information, like residential addresses and phone numbers of employees, is oftentimes easy enough to find using sites such as Linkedin.com and Pipl.com.

A really cool tool along these same lines is Netcraft (www.netcraft.com). Fire it up and take a look at all the goodies you can find. Restricted URLs, not intended for public disclosure, might just show up and provide some juicy tidbits. If they’re really sloppy (or sometimes even if they’re not), Netcraft output can show you the operating system (OS) on the box too.

NOTE    Netcraft has a pretty cool toolbar add-on for Firefox and Chrome (http://toolbar.netcraft.com/).

Another absolute goldmine of information on a potential target is job boards. Go to CareerBuilder.com, Monster.com, Dice.com, or any of the multitude of others, and you can find almost everything you’d want to know about the company’s technical infrastructure. For example, a job listing that states “Candidate must be well versed in Windows Server 2012 R2, Microsoft SQL Server 2016, and Veritas Backup services” isn’t representative of a network infrastructure made up of Linux servers. The technical job listings flat-out tell you what’s on the company’s network—and oftentimes what versions. Combine that with your astute knowledge of vulnerabilities and attack vectors, and you’re well on your way to a successful pen test!
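As a rough illustration of how quickly a job listing turns into infrastructure notes, the following sketch pulls product names and versions out of the sample listing quoted above. The pattern list and the function name extract_technologies are assumptions made for this example; a real watch list would be much longer and tailored to the engagement.

```python
import re

# Hypothetical watch list mapping a product label to a regex. Patterns with
# a capture group also report the version they find.
PRODUCT_PATTERNS = {
    "Windows Server": r"Windows Server\s+(\d{4}(?:\s+R2)?)",
    "Microsoft SQL Server": r"SQL Server\s+(\d{4})",
    "Veritas Backup": r"Veritas Backup",
}


def extract_technologies(listing: str) -> dict:
    """Return {product: version or None} for each product named in the text."""
    found = {}
    for product, pattern in PRODUCT_PATTERNS.items():
        match = re.search(pattern, listing, flags=re.IGNORECASE)
        if match:
            # groups() is empty when the pattern has no capture group.
            found[product] = match.group(1) if match.groups() else None
    return found


if __name__ == "__main__":
    listing = (
        "Candidate must be well versed in Windows Server 2012 R2, "
        "Microsoft SQL Server 2016, and Veritas Backup services"
    )
    for product, version in extract_technologies(listing).items():
        print(product, version or "(version not stated)")
```

The output is only as good as the patterns, of course, but even a crude pass like this gives you a starting list of products to match against known vulnerabilities.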

Footprinting Gone Wild

Suppose, for a moment, you’re actually on a pen test team and you’ve all done things the right way. You hammered out an agreement beforehand, set your scope, agreed on what should be exploited (or not), and got all your legal stuff taken care of and signed off by the right people. You follow your team lead’s direction and accomplish the tasks set before you, this time just some basic (dare I say, passive) reconnaissance. After a few steps and pokes here and there, you run a web crawler (like Black Widow, GSA Email Spider, NCollector Studio, or even GNU Wget), hoping to get some contact information and employee data. At the end of the day the team gets together to review findings and potential problems. Your team lead enters the room angry and frustrated. It seems that some web application data was deleted in response to an information grab. The team turns and looks at you: “What did I do?!”

Most pen test agreements have some kind of clause built in to protect the team from just such an occurrence. Can a web spider actually cause the deletion of information from very, very poorly programmed web applications? Of course it can, and you—the hapless team member—would have no idea about said terrible application until you ran a test (in this case, a crawl) against it.

Could you be held accountable? Should you be held accountable? The answer is, maybe. If you don’t ensure your pen test agreement is in order and if there’s nothing like

Due to the execution of toolsets, exploits, and techniques, the possibility exists for the unintentional deletion or modification of sensitive data in the test environment, which may include production-level systems...

in your agreement, followed by a statement absolving your team from unintentional problems, then, yes—congratulations—you’re accountable.

Want another one you should think about? Try worrying about what actions your target takes when they see you. If a network admin shuts everything down because he thinks the network is under attack and that causes (fill in the blank), are you at fault? You may be if you don’t have a clause that reads something like the following:

The actions taken by the target in response to any detection of our activities are also beyond our control…

What happens if a client decides they don’t want to accept that clause in the agreement? Well, since there’s absolutely no way to guarantee even the calmest of pen test tools and techniques won’t alter or even destroy data or systems, my advice would be to run. Just because toolsets and techniques are designated passive in nature, and just because they aren’t designed to exploit or cause harm, don’t believe you can just fire away and not worry about it. And just as facts don’t care about feelings, tools don’t give a rip about your intent. Get your agreement in order first, then let your tools out on Spring Break.
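To make the crawler problem concrete, here is a small offline sketch of the contact-harvesting step. It parses a page you have already retrieved, collects e-mail addresses, and sets aside any link whose URL hints at a state-changing action rather than queuing it for a visit. The RISKY_WORDS list, the class and function names, and the sample page are all assumptions for illustration; a poorly built application can hide a destructive action behind any URL, so a filter like this reduces risk but guarantees nothing.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical hints that following a link might change data on the target.
RISKY_WORDS = ("delete", "remove", "drop", "purge", "logout")


def looks_risky(href: str) -> bool:
    """Flag links whose path or query string contains a risky keyword."""
    parsed = urlparse(href.lower())
    haystack = parsed.path + "?" + parsed.query
    return any(word in haystack for word in RISKY_WORDS)


class ContactParser(HTMLParser):
    """Collect e-mail addresses and sort links into safe and skipped."""

    def __init__(self):
        super().__init__()
        self.emails = set()
        self.safe_links = []
        self.skipped_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        if href.lower().startswith("mailto:"):
            # Drop the scheme and any ?subject=... suffix.
            self.emails.add(href[len("mailto:"):].split("?")[0])
        elif looks_risky(href):
            self.skipped_links.append(href)
        else:
            self.safe_links.append(href)

    def handle_data(self, data):
        self.emails.update(EMAIL_RE.findall(data))


def scan(html: str) -> ContactParser:
    """Parse one page of HTML and return the populated parser."""
    parser = ContactParser()
    parser.feed(html)
    parser.close()
    return parser


SAMPLE = """
<html><body>
<a href="/about.html">About</a>
<a href="mailto:jane.doe@example.com?subject=Hi">Email Jane</a>
<a href="/admin/records/delete?id=7">Delete record</a>
<p>Sales: sales@example.com</p>
</body></html>
"""

if __name__ == "__main__":
    result = scan(SAMPLE)
    print("emails:", sorted(result.emails))
    print("would crawl:", result.safe_links)
    print("skipped:", result.skipped_links)
```

Notice that the sketch never touches the network. Separating "decide what to visit" from "visit it" gives the team a chance to review the queue before anything is requested from the client's systems.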

NOTE    The Computer Fraud and Abuse Act (1986) makes conspiracy to commit hacking a crime. Therefore, it’s important the ethical hacker get an ironclad agreement in place before even attempting basic footprinting.

While we’re on the subject of using websites to uncover information, don’t neglect the innumerable options available to you—all of which are free and perfectly legal. Social networking sites can provide all sorts of information. Sites such as LinkedIn (www.linkedin.com), where professionals build relationships with peers, can be a great place to profile for attacks later. Facebook and Twitter are also great sources of information, especially when the company has had layoffs or other personnel problems recently—disgruntled former employees are always good for some relevant company dirt. And, just for some real fun, check out http://en.wikipedia.org/wiki/Robin_Sage to see just how powerful social networking can be for determined hackers.

EXAM TIP    You can also use alerting to help monitor your target. Google, Yahoo!, and Twitter all offer services that provide up-to-date information that can be texted or e-mailed to you when there is a change.

Google Hacking

A useful tactic in footprinting a target was popularized mainly in late 2004 by a guy named Johnny Long, who was part of an IT security team at his job. While performing pen tests and ethical hacking, he started paying attention to how the search strings worked in Google. The search engine has always had additional operators designed to allow you to fine-tune your search string. What Mr. Long did was simply apply that logic for a more nefarious purpose.

Suppose, for example, instead of just looking for a web page on boat repair or searching for an image of a cartoon cat, you decided to tell the search engine, “Hey, do you think you can look for any systems that are using Remote Desktop Web Connection?” Or how about, “Can you please show me any MySQL history pages so I can try to lift a password or two?” Amazingly enough, search engines can do just that for you, and more. The term this practice has become known by is Google hacking.

Google hacking involves manipulating a search string with additional specific operators to search for vulnerabilities. Table 2-1 describes advanced operators for Google hack search strings.

Images


Table 2-1 Google Search String Operators

Innumerable websites are available to help you with Google hack strings. For example, from the Google Hacking Database (a site operated by Mr. Johnny Long and Hackers for Charity, www.hackersforcharity.org/ghdb/), try this string from wherever you are right now:

Images

NOTE    That filetype: operator in Table 2-1 offers loads of cool stuff. If you want a good list of file types to try, check out https://support.google.com/webmasters/answer/35287?hl=en (a link showing many file types). And don’t forget, source code and all sorts of craziness are indexable, and thus often accessible, so don’t discount anything!

Basically we’re telling Google to go look for web pages that have TSWEB in the URL (indicating a remote access connection page) and to show only those that are running the default HTML page (default installs are common in a host of different areas and usually make things a lot easier for an attacker). I think you may be surprised by the results. I even saw one page where an admin had edited the text to include the logon information.

NOTE    Google hacking is such a broad topic it’s impossible to cover all of it in one section of a single book. This link, among others, provides a great list to work through: http://it.toolbox.com/blogs/managing-infosec/google-hacking-master-list-28302. Take advantage of any of the websites available and learn more as you go along. What you’ll need exam-wise is to know the operators and how to use them.

As you can see, Google hacking can be used for a wide range of purposes. For example, you can find free music downloads (pirating music is a no-no, by the way, so don’t do it) using the following:

Images

You can also discover open vulnerabilities on a network. For example, the following provides any page holding the results of a vulnerability scan using Nessus (interesting to read, wouldn’t you say?):

Images

Combine these with the advanced operators, and you can really dig down into some interesting stuff. Again, none of these search strings or “hacks” is illegal—you can search for anything you want (assuming, of course, you’re not searching for illegal content, but don’t take your legal advice from a certification study book). However, actually exploiting anything you find without prior consent will definitely land you in hot water.
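Combining operators is mostly string assembly, so a small helper can keep your searches consistent and repeatable. The sketch below builds a search string from a handful of widely used Google operators and turns it into a URL you can paste into a browser. The operator whitelist, the function names, and the example.com target are assumptions for this example; nothing here sends a request.

```python
from urllib.parse import urlencode

# Commonly used Google operators (an assumed subset chosen for this sketch).
ALLOWED_OPERATORS = ("site", "filetype", "inurl", "intitle", "intext")


def build_query(terms: str = "", **operators: str) -> str:
    """Join operator:value pairs and free-text terms into one search string."""
    parts = []
    for name, value in operators.items():
        if name not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {name}")
        parts.append(f"{name}:{value}")
    if terms:
        parts.append(terms)
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a Google search URL with the query safely percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query('"annual report"', site="example.com", filetype="pdf")
    print(query)
    print(search_url(query))
```

Keeping the strings in code also gives you a record of exactly what you searched for, which is handy when it comes time to write the report.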

And if Google hacking weren’t easy enough, there are a variety of tools to make it even more powerful. Tools such as SiteDigger (www.mcafee.com) use Google hack searches and other methods to dig up information and vulnerabilities. Metagoofil (www.edge-security.com) uses Google hacks and cache to find unbelievable amounts of information hidden in the meta tags of publicly available documents. Find the browser and search engine of your choice and look for “Google hack tools.” You’ll find more than a few available for play.
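To give a feel for what "information hidden in the metadata of publicly available documents" can mean, here is a minimal sketch that reads the author fields from a Word .docx file. It relies on the fact that a .docx is a ZIP archive whose docProps/core.xml part holds the core properties. This is not how Metagoofil itself is implemented; the function names are hypothetical, and the sample document is generated in memory so the example runs without downloading anything.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(docx_bytes: bytes) -> dict:
    """Return the creator and last-modified-by fields from a .docx file."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        xml_data = archive.read("docProps/core.xml")
    root = ET.fromstring(xml_data)

    def text_of(path):
        node = root.find(path, NS)
        return node.text if node is not None else None

    return {
        "creator": text_of("dc:creator"),
        "last_modified_by": text_of("cp:lastModifiedBy"),
    }


def make_sample_docx() -> bytes:
    """Build a tiny stand-in archive containing only the metadata part."""
    core = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<cp:coreProperties xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}">'
        "<dc:creator>jsmith</dc:creator>"
        "<cp:lastModifiedBy>Jane Smith</cp:lastModifiedBy>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("docProps/core.xml", core)
    return buffer.getvalue()


if __name__ == "__main__":
    print(read_core_properties(make_sample_docx()))
```

A creator value such as jsmith often doubles as a network username, which is exactly the kind of detail that makes document metadata worth collecting.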

Another note on Google hacking: it’s not as easy to pull off as it once was. Google, for reasons I will avoid discussing here because it angers me to no end, has decided it needs to police search results to prevent folks from using the search engine as it was intended to be used. As you can see from Figure 2-1, and probably from your own Google hacking attempts in learning this opportunity, Google will, from time to time, throw up a CAPTCHA if it believes you’re a “bot” or trying to use the search engine for nefarious purposes. There are ways around the annoyance that are well documented and accessible via Google searches, but it still doesn’t take away the annoyance factor. With that in mind, while Google hacking is, well, part of Google, don’t discount using other search engines in looking for your holy grail.

Figure 2-1 Google CAPTCHA

Geek Humor

I admit it, a lot of us in the technical realm of life don’t always seem to have the greatest of social skills. In fact, finding a tech guy who can actually communicate with other human beings in a professional or personal setting is like finding a four-leaf clover. But no one can ever say geeks don’t have a decent sense of humor. Until recently, though, geek humor was more of an inside baseball thing—something we knew about and shared among ourselves, gazing down our noses at the teeming masses of users who had no idea what we were talking about. But pop culture and Hollywood finally caught up with us.

In 2011, a guy named Ernest Cline wrote a fantastic book called Ready Player One. It’s a fast-paced tale filled with glorious ’80s references, wonderful characters, and an original story, and is easily one of my favorite escape fiction novels of all time. In it, a creator of a wildly popular virtual reality world hid a prize inside his digital creation and, after his death, made a huge game out of the search for his “Easter egg.”

It used to be that mentioning the term “Easter egg” made folks think about small tubs of vinegar-water food coloring and kids running around fields gathering colored hard-boiled or plastic eggs. But after the book’s release (and the subsequent blasphemous, substandard, horrendous mockery that is the 2018 movie version of the book), most folks knew that an Easter egg is something developers put in an application, website, or game just for giggles. Outside of “gunters” hunting the egg down in a giant virtual world (using their wits and intelligence in the book, or sheer blind luck in the terrible movie), most Easter eggs are usually accessible by some weird combination of steps and clicks. Or sometimes it’s just part of the way things work. For example, a long, long time ago Excel had an Easter egg that showed computerized images of the busts of the developers.

Google has a ton of Easter eggs. For example, open Google and start typing Do a barrel roll and press ENTER: the entire screen will (sometimes before you even finish typing) perform a barrel roll. Another? Perform an image search and type atari breakout. The images will display and then shrink and begin a Breakout game you can control with the mouse. Enter binary, and the number of results displays in binary instead of decimal. And typing tilt actually tilts the screen.

I could go on and on and write an entire section called “Fun with Google,” but you get the point. Search, explore, and have some fun. There’s plenty of time to study, and who says you can’t have fun while doing it? Besides, you may really want to know how many degrees of separation Zach Galifianakis has from Kevin Bacon. Doing a search for Bacon number Zach Galifianakis will let you know that the answer is 2.

NOTE    More geek humor? Glad you asked. If you’ve ever been asked a ridiculous question by someone and wanted to tell them to just use a search engine like everybody else, try Let Me Google That For You. Suppose someone asks you “Who was the thirteenth president?” or “What’s the atomic weight of hydrogen?” Instead of looking up the answer, go to www.lmgtfy.com and type in the question. Send the person the link and, upon opening it, he or she will see a page typing the question in a Google search window and clicking Google Search. Sarcastic? Of course. Funny? No doubt. Worth it? Absolutely.

Lastly, Google also offers another neat option called “Advanced Search.” If you point your browser to www.google.com/advanced_search, many of these strings we try so desperately to remember are taken care of and laid out in a nice GUI format. The top portion of the Advanced Search page prompts “Find pages with…” and provides options to choose from. Scroll down just a tad, and the next section reads “Then narrow your results by…”, providing options such as language, last updated, and where specific terms appear in or on the site. You can also click links at the bottom to find pages “similar to, or link to, a URL,” among other helpful options. I considered adding a picture of it here, but it’s more than a full page in the browser. The format is easy enough, and I don’t think you’ll have a problem working your way around it.

Website and E-mail Footprinting

Website and e-mail footprinting may require a little more effort and technical knowledge, but it’s worth it (not to mention EC-Council has devoted two entire slide show sections to the material, so you know it’s gonna be good). Analyzing a website from afar can show potentially interesting information, such as software in use, OS, filenames, paths, and contact details. Using tools such as Burp Suite, Firebug, and Website Informer allows you to grab headers and cookies, and learn connection status, content type, and web server information. Heck, pulling the HTML code itself can provide useful intel. You might be surprised what you can find in those “hidden” fields, and some of the comments thrown about in the code may prove handy. A review of cookies might even show you software or scripting methods in use. E-mail headers provide more information than you might think, and are easy enough to grab and examine. And tracking e-mail? Hey, it’s not only useful for information, it’s just downright fun.
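To make the “hidden fields and comments” point concrete, here is a minimal Python sketch that pulls both out of a page’s HTML using only the standard library. The sample page, its field names, and the helper names (FootprintParser, footprint_html) are invented for illustration; point it only at source you are authorized to review.

```python
from html.parser import HTMLParser


class FootprintParser(HTMLParser):
    """Collects hidden form fields and HTML comments from page source."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = []  # (name, value) pairs
        self.comments = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # Only <input type="hidden"> elements are of interest here.
        if tag == "input" and (attr.get("type") or "").lower() == "hidden":
            self.hidden_fields.append((attr.get("name"), attr.get("value")))

    def handle_comment(self, data):
        self.comments.append(data.strip())


def footprint_html(html_text):
    """Return (hidden_fields, comments) found in the given HTML text."""
    parser = FootprintParser()
    parser.feed(html_text)
    parser.close()
    return parser.hidden_fields, parser.comments


# Hypothetical page source, made up for this example.
SAMPLE_PAGE = """
<html><body>
<!-- TODO: remove test login before go-live -->
<form action="/cart" method="post">
  <input type="hidden" name="price" value="19.99">
  <input type="text" name="qty">
</form>
</body></html>
"""

if __name__ == "__main__":
    fields, comments = footprint_html(SAMPLE_PAGE)
    print("Hidden fields:", fields)
    print("Comments:", comments)
```

A hidden price field or a leftover developer comment like the ones in this sample is exactly the sort of detail worth writing down for later.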

Although it doesn’t seem all that passive, web mirroring is a great method for footprinting. Copying a website directly to your system (“mirroring” it) can definitely help speed things along. Having a local copy to play with lets you dive deeper into the structure and ask things like “What’s this directory for over here?” and “I wonder if this site is vulnerable to fill-in-chosen-vulnerability,” all without alerting the target organization. Tools for accomplishing this are many and varied, and while the following list isn’t representative of every web mirroring tool out there, it’s a good start:

•   HTTrack (www.httrack.com)

•   Black Widow (http://softbytelabs.com)

•   WebRipper (www.calluna-software.com)

•   Teleport Pro (www.tenmax.com)

•   GNU Wget (www.gnu.org)

•   Backstreet Browser (http://spadixbd.com)

Although it’s great to have a local, current copy of your target website to peruse, let’s not forget that we can learn from history too. Information relevant to your efforts may have been posted on a site at some point in the past but has since been updated or removed. EC-Council absolutely loves this as an information-gathering source, and you are certain to see www.archive.org and Google Cache queried somewhere on your exam. The Wayback Machine, available at Archive.org (see Figure 2-2), keeps snapshots of sites from days gone by, allowing you to go back in time to search for lost information; for example, if the company erroneously had a phone list available for a long while but has since taken it down, you may be able to retrieve it from a “way back” copy. These options provide insight into information your target may have thought they’d safely gotten rid of—but as the old adage says, “once posted, always available.”


Figure 2-2 Archive.org’s Wayback Machine
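If you would rather script the “way back” lookup than click through it, the sketch below builds the two URLs involved. It assumes the commonly used Wayback Machine snapshot URL pattern and the public availability endpoint; the function names are my own, and nothing here actually touches the network.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(target_url, timestamp):
    """Build a Wayback Machine URL for the capture nearest to timestamp.

    timestamp is a string such as "20100101" (YYYYMMDD, optionally with
    hhmmss appended).
    """
    return f"https://web.archive.org/web/{timestamp}/{target_url}"


def wayback_availability_query(target_url, timestamp):
    """Build the query URL that asks whether a capture exists near timestamp."""
    params = urlencode({"url": target_url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + params


if __name__ == "__main__":
    print(wayback_snapshot_url("example.com", "20100101"))
    print(wayback_availability_query("example.com", "20100101"))
```

Paste the first URL into a browser to land on the capture closest to that date; the second is meant for scripts that want a machine-readable answer.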

EXAM TIP    Website Watcher (http://aignes.com) can be used to check web pages for changes, automatically notifying you when there’s an update.

And let’s not forget good old e-mail as a footprinting source here. E-mail communication can provide us IP address and physical location information. Links visited by the recipient may also be available, as well as browser and OS information. Heck, you can sometimes even see how long they spend reading the e-mail.

Have you ever actually looked at an e-mail header? You can really get some extraordinary detail out of it, and sometimes sending a bogus e-mail to the company and watching what comes back can help you pinpoint a future attack vector (see Figure 2-3 for an example). If you want to go a step further, you can try some of the many e-mail tracking tools. E-mail tracking applications range from easy, built-in efforts on the part of your e-mail application provider (such as a read receipt and the like within Microsoft Outlook) to external apps and efforts (from places such as www.emailtrackerpro.com and www.mailtracking.com). Simply appending “.mailtracking.com” to the end of an e-mail address, for example, can provide a host of information about where the e-mail travels and how it gets there. Examples of tools for e-mail tracking include GetNotify, ContactMonkey, Yesware, Read Notify, WhoReadMe, MSGTAG, Trace Email, and Zendio.


Figure 2-3 E-mail header
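As a rough illustration of what a header gives up, this Python sketch walks the Received lines of a message and lists the relay IP addresses in the order the mail traveled. The sample message, hostnames, and addresses (taken from documentation-only ranges) are all made up, and real headers vary a lot in format, so treat the regular expression as a starting point rather than a complete parser.

```python
import re
from email import message_from_string

# Matches an IPv4 address in square brackets, as many relays record it.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_hops(raw_message):
    """Return relay IPs from the Received headers, origin first."""
    msg = message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        hops.extend(BRACKETED_IPV4.findall(header))
    # Each relay prepends its own Received line, so the newest is on top.
    return list(reversed(hops))


def mail_client(raw_message):
    """Return the X-Mailer header, if the sender's client added one."""
    return message_from_string(raw_message)["X-Mailer"]


# Hypothetical message, invented for this example.
SAMPLE_MESSAGE = """\
Received: from mx.example.net (mx.example.net [198.51.100.25])
\tby inbox.example.org with ESMTP; Mon, 1 Jan 2018 10:00:05 -0500
Received: from laptop.example.com (unknown [192.0.2.10])
\tby mx.example.net with ESMTP; Mon, 1 Jan 2018 10:00:01 -0500
From: alice@example.com
To: bob@example.org
Subject: Quarterly report
X-Mailer: ExampleMail 1.0

Body text.
"""

if __name__ == "__main__":
    print(received_hops(SAMPLE_MESSAGE))
    print(mail_client(SAMPLE_MESSAGE))
```

The first address in the list is the closest thing to the sender’s own system, and the X-Mailer value hints at the software in use on that end.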

DNS Footprinting

I hate getting lost. Now, I’m not saying I’m always the calmest driver and that I don’t complain (loudly) about circumstances and other drivers on the road, but I can honestly say nothing puts me on edge like not knowing where I’m going while driving, especially when the directions given to me don’t include the road names. I’m certain you know what I’m talking about—directions that say, “Turn by the yellow sign next to the drugstore and then go down half a mile and turn right onto the road beside the walrus-hide factory. You can’t miss it.” Inevitably I do wind up missing it, and cursing ensues.

Thankfully, negotiating the Internet isn’t reliant on crazed directions. The road signs we have in place to get to our favorite haunts are all part of the Domain Name System (DNS), and they make navigation easy. DNS, as you’re no doubt already aware, provides a name-to-IP-address (and vice versa) mapping service, allowing us to type in a name for a resource as opposed to its address. This also provides a wealth of footprinting information for the ethical hacker, so long as you know how to use it.

NOTE    Although DNS records are easy to obtain and generally designed to be freely available, this passive footprinting can still get you in trouble. A computer manager named David Ritz was successfully prosecuted in 2008 for querying a DNS server. It was truly a ridiculous ruling, but the point remains that legality and right versus wrong seem always in the eye of the beholder—so be careful.

DNS Basics

As we established in the introduction (you did read it, right?), there are certain things you’re just expected to know before undertaking this certification and career field, and DNS is one of them. So, no, I’m not going to spend pages covering DNS. But we do need to take at least a couple of minutes to go over some basics—mainly because you’ll see this stuff on the CEH exam. The simplest explanation of DNS I can think of follows.

DNS is made up of servers all over the world. Each server holds and manages the records for its own little corner of the globe, known in the DNS world as a namespace. Each of these records gives directions to or for a specific type of resource. Some records provide IP addresses for individual systems within your network, whereas others provide addresses for your e-mail servers. Some provide pointers to other DNS servers, which are designed to help people find what they’re looking for.

NOTE    Port numbers are always important in discussing anything network-wise. When it comes to DNS, 53 is your number. Name lookups generally use UDP, whereas zone transfers use TCP.

Big, huge servers might handle a namespace as big as the top-level domain “.com,” whereas another server further down the line holds all the records for “mheducation.com.” The beauty of this system is that each server only has to worry about the name records for its own portion of the namespace and to know how to contact the server “above” it in the chain for the top-level namespace the client is asking about. The entire system looks like an inverted tree, and you can see how a request for a particular resource can easily be routed correctly to the appropriate server. For example, in Figure 2-4, the server for anyname.com in the third level holds and manages all the records for that namespace, so anyone looking for a resource (such as their website) could ask that server for an address.


Figure 2-4 DNS structure
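The inverted tree is easy to see if you split a name into the namespaces a lookup passes through on its way down. A minimal sketch (the function name is my own, and the host www.anyname.com simply extends the example namespace above):

```python
def delegation_chain(fqdn):
    """List the namespaces from the root down to the full name."""
    labels = fqdn.rstrip(".").split(".")
    chain = ["."]  # the root of the tree
    # Walk from the rightmost label (the top-level domain) leftward.
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]) + ".")
    return chain


if __name__ == "__main__":
    for zone in delegation_chain("www.anyname.com"):
        print(zone)
```

Each entry is a level of the tree where some server is responsible for the records, or for pointing you one level further down.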

The only downside to this system is that the record types held within your DNS system can tell a hacker all she needs to know about your network layout. For example, do you think it might be important for an attacker to know which server in the network holds and manages all the DNS records? What about where the e-mail servers are? Heck, for that matter, wouldn’t it be beneficial to gain hints on which systems may be hosting public-facing websites? All this may be determined by examining the DNS record types, which I’ve so kindly listed in Table 2-2.

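•   SRV (Service) Defines the hostname and port number of servers providing specific services, such as a Directory Services server.

•   SOA (Start of Authority) Identifies the primary name server for the zone and carries the zone’s administrative details, such as the contact e-mail, serial number, and timer values.

•   PTR (Pointer) Maps an IP address to a hostname, providing reverse DNS lookups.

•   NS (Name Server) Defines the name servers within your namespace.

•   MX (Mail Exchange) Identifies the e-mail servers within your domain.

•   CNAME (Canonical Name) Provides aliases for a host within the zone.

•   A (Address) Maps a hostname to an IPv4 address.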


Table 2-2 DNS Record Types

EXAM TIP    Know the DNS records well and be able to pick them out of a lineup. You will definitely see a DNS zone transfer on your exam and will be asked to identify information about the target from it.

These records are maintained and managed by the authoritative server for your namespace (the SOA), which shares them with your other DNS servers (name servers) so your clients can perform lookups and name resolutions. The process of replicating all these records is known as a zone transfer. Considering the importance of the records kept here, it is obvious administrators need to be careful about which IP addresses are actually allowed to perform a zone transfer—if you allowed just any IP to ask for a zone transfer, you might as well post a network map on your website to save everyone the trouble. Because of this, most administrators restrict the ability to even ask for a zone transfer to a small list of name servers inside their network. Additionally, some admins don’t even configure DNS at all and simply use IP addresses for their critical hosts.

NOTE    When it comes to DNS, it’s important to remember there are two real servers in play within your system. Name resolvers simply answer requests. Authoritative servers hold the records for a namespace, given from an administrative source, and answer accordingly.

An additional note is relevant to the discussion here, even though we’re not in the attacks portion of the book yet. Think for a moment about a DNS lookup for a resource on your network: say, for instance, a person is trying to connect to your FTP server to upload some important, sensitive data. The user types in ftp.anycomp.com and presses ENTER. The DNS server closest to the user (defined in your TCP/IP properties) looks through its cache to see whether it knows the address for ftp.anycomp.com. If it’s not there, the server works its way through the DNS architecture to find the authoritative server for anycomp.com, which must have the correct IP address. This response is returned to the client, and FTP-ing begins happily enough.

Suppose, though, you are an attacker and you really want that sensitive data yourself. One way to do it might be to change the cache on the local name server to point to a bogus server instead of the real address for ftp.anycomp.com. Then the user, none the wiser, would connect and upload the documents directly to your server. This process is known as DNS poisoning, and one simple mitigation is to restrict the amount of time records can stay in cache before they’re updated. There are loads of other ways to protect against this, which we’re not going to get into here, but it does demonstrate the importance of protecting these records—and how valuable they are to an attacker.

EXAM TIP    DNS poisoning is of enough importance that an entire extension to DNS was created, way back in 1999. The Domain Name System Security Extensions (DNSSEC) is a suite of IETF specifications for securing certain kinds of information provided by DNS. Dan Kaminsky made DNS cache poisoning vulnerabilities widely known back in 2008, and many service providers are rolling this extension out to ensure that DNS results are cryptographically protected.

The SOA record provides loads of information, from the hostname of the primary server in the DNS namespace (zone) to the amount of time name servers should retain records in cache. The record contains the following information (all default values are from Microsoft DNS server settings):

•   Source host Hostname of the primary DNS server for the zone (there should be an associated NS record for this as well).

•   Contact e-mail E-mail address of the person responsible for the zone file.

•   Serial number Revision number of the zone file. This number increments each time the zone file changes and is used by a secondary server to know when to update its copy (if the SN is higher than that of the secondary, it’s time to update!).

•   Refresh time The amount of time a secondary DNS server will wait before asking for updates. The default value is 3600 seconds (1 hour).

•   Retry time The amount of time a secondary server will wait to retry if the zone transfer fails. The default value is 600 seconds.

•   Expire time The maximum amount of time a secondary server will spend trying to complete a zone transfer. The default value is 86,400 seconds (1 day).

•   TTL The minimum “time to live” for all records in the zone. If not updated by a zone transfer, the records will perish. The default value is 3600 seconds (1 hour).
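To practice picking those fields out of a record, here is a small Python sketch that splits the data portion of an SOA record into named parts. The hostnames are hypothetical, the field order follows the standard SOA layout, and the serial check is a plain comparison that ignores the wraparound arithmetic real servers apply.

```python
from collections import namedtuple

SOA = namedtuple("SOA", "source_host contact serial refresh retry expire ttl")


def parse_soa(rdata):
    """Parse 'host contact serial refresh retry expire ttl' into an SOA tuple."""
    parts = rdata.split()
    if len(parts) != 7:
        raise ValueError("expected 7 SOA fields, got %d" % len(parts))
    numbers = [int(p) for p in parts[2:]]
    return SOA(parts[0], parts[1], *numbers)


def contact_email(rname):
    """Convert the SOA contact field to an address (first dot becomes @)."""
    return rname.rstrip(".").replace(".", "@", 1)


def secondary_needs_update(primary_serial, secondary_serial):
    """A secondary pulls a new copy when the primary's serial is higher."""
    return primary_serial > secondary_serial


# Hypothetical record data for an invented zone.
SAMPLE_SOA = "dnsrv1.anycomp.com. hostmaster.anycomp.com. 2013090800 86400 900 1209600 3600"

if __name__ == "__main__":
    soa = parse_soa(SAMPLE_SOA)
    print(soa)
    print(contact_email(soa.contact))
    print(secondary_needs_update(soa.serial, 2013090700))
```

Notice the contact field: the e-mail address is stored with a dot where the @ would be, which trips up plenty of people the first time they read a zone file.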

Is That a Forest Behind Those Trees?

DNS is undoubtedly the magic running the machine. Without the ability to quickly and efficiently translate a name to an IP address, the Internet might’ve bogged down long, long ago. Sure, we might’ve used it for education and file transfers, but can anyone imagine the Internet without www.insertnamehere.com? And it’s precisely because of that ease of use, that ability to just type a name and click “go,” without really knowing exactly where you’re headed, that sometimes causes heartache and headache for security personnel. Just imagine the havoc inside an organization if a bad guy somehow got hold of the DNS servers and started pointing people to places they’d never knowingly go. But if you think about how name lookup really happens on a machine, you may not even need to get to the DNS system as a whole to cause real problems.

In general, when you type a URL in a browser on a Windows machine, the system takes a couple laps locally before checking DNS. First, the OS will check to see if the request is for itself (that is, localhost or its own name). If not, it’ll then query the local HOSTS file. If the name resolution isn’t found there, then it’ll go to DNS and query (in order) the local cache, the primary (local) DNS server, and then anything it can find in the entirety of the DNS system it can get to. If, again, no name is found, Windows will turn to NetBIOS, WINS, and the LMHOSTS file.

See how this can become an issue? At each step, if a name resolution is found, the process stops and the search ends. Therefore, if the real name resolution is in step four, but you can find a way to interject a fake one in step two, then why bother hacking DNS in an organization if you can grab and replace a HOSTS file on the box? Try it yourself on your home system. Navigate to C:\Windows\System32\Drivers\etc and open the HOSTS file in Notepad. Add an entry like this:

Images

Save, close everything; then, open a browser and try to open Google.com. The worst page on the Internet appears instead. Why? Because once Windows found the name resolution, it stopped looking: no need to bother DNS when the answer is right here in the HOSTS file.
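The “first answer wins” behavior can be modeled in a few lines of Python. This is a simplified sketch of the lookup order described above (it skips the DNS cache and the NetBIOS steps), and the HOSTS entry, the addresses, and the stand-in DNS function are all invented for illustration.

```python
def parse_hosts(text):
    """Parse HOSTS-file text into a {name: ip} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first entry seen for a name.
            table.setdefault(name.lower(), ip)
    return table


def resolve(name, hosts_table, dns_lookup):
    """Return (ip, source), stopping at the first step that has an answer."""
    name = name.lower()
    if name == "localhost":
        return "127.0.0.1", "self"
    if name in hosts_table:
        return hosts_table[name], "hosts"
    return dns_lookup(name), "dns"


# Hypothetical HOSTS content; 192.0.2.66 is a documentation-only address.
SAMPLE_HOSTS = """
# local overrides
192.0.2.66   google.com www.google.com
"""


def fake_dns(name):
    """Stand-in for a real DNS query so the example runs offline."""
    return "203.0.113.5"


if __name__ == "__main__":
    hosts = parse_hosts(SAMPLE_HOSTS)
    print(resolve("google.com", hosts, fake_dns))
    print(resolve("example.org", hosts, fake_dns))
```

The first lookup never reaches DNS at all, which is the whole problem (or the whole benefit, depending on who edited the file).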

See how this can get hairy and dangerous really quickly? If an attacker can simply add a couple lines of text to the HOSTS file on the machine, he or she could redirect traffic without the user ever having to touch DNS at all. But while we’re all aware (or should be anyway) of the importance of protecting access to that particular file to prevent bad guys from using it, have you ever considered using it for good purposes?

Why not update your HOSTS file to “blackhole” sites you know to be malicious? Why not redirect access requests to sites your employees are not supposed to be visiting at work to a friendly reminder site or a valid business site? See, your system is going to check the HOSTS file before making any trips to resolve names in the first place, so whatever you put there is law as far as a PC is concerned.

Pull up a search engine and look up “blocking unwanted connection with a hosts file.” You’ll find countless HOSTS file versions to go to, and after carefully screening them yourself, of course, you may find implementing them in your business or home saves you a malware incident or two in the future. Or you could just continue having fun and send all Google.com requests to a dancing hamster video. In any case, don’t ignore this simple resource in an attempt to better your security. It’s easy, and it works.

P.S. Don’t forget to delete that entry we added earlier from your HOSTS file. Unless you just like that page. Ugh.

I think, by now, it’s fairly evident why DNS footprinting is an important skill for you to master. So, now that you know a little about the DNS structure and the records kept there (be sure to review them well before your exam—you’ll thank me later), it’s important for us to take a look at some of the tools available for your use as an ethical hacker. The following discussions won’t cover every tool available—and you won’t be able to proclaim yourself an expert after reading them—but you do need to know the basics for your exam, and we’ll make sure to hit what we need.

In the dawn of networking time, when dinosaurs roamed outside the buildings and cars had a choice between regular and unleaded gas, setting up DNS required not only a hierarchical design but someone to manage it. Put simply, someone had to be in charge of registering who owned what name and which address ranges went with it. For that matter, someone had to hand out the addresses in the first place.

IP address management started with a happy little group known as the Internet Assigned Numbers Authority (IANA), which finally gave way to the Internet Corporation for Assigned Names and Numbers (ICANN). ICANN manages IP address allocation and a host of other things. So, as companies and individuals get their IP addresses (ranges), they simultaneously need to ensure the rest of the world can find them in DNS. This is done through one of any number of domain name registrars worldwide (for example, www.networksolutions.com, www.godaddy.com, and www.register.com). Along with those registrar businesses, the following five regional Internet registries (RIRs) provide overall management of the public IP address space within a given geographic region:

•   American Registry for Internet Numbers (ARIN) Canada, many Caribbean and North Atlantic islands, and the United States

•   Asia-Pacific Network Information Center (APNIC) Asia and the Pacific

•   Réseaux IP Européens (RIPE) NCC Europe, Middle East, and parts of Central Asia/Northern Africa. (If you’re wondering, the name is in French.)

•   Latin America and Caribbean Network Information Center (LACNIC) Latin America and the Caribbean

•   African Network Information Center (AfriNIC) Africa

Obviously, because these registries manage and control all the public IP space, they should represent a wealth of information for you in footprinting. Gathering information from them is as easy as visiting their sites (ARIN’s is www.arin.net) and inputting a domain name. You’ll get information such as the network’s range, organization name, name server details, and origination dates. Figure 2-5 shows a regional coverage map for all the registries.


Figure 2-5 Regional registry coverage map

You can also make use of a tool known as whois. Originally started in Unix, whois has become ubiquitous in operating systems everywhere and has generated any number of websites set up specifically for that purpose. It queries the registries and returns information, including domain ownership, addresses, locations, and phone numbers.
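Under the hood, whois is about as simple as a protocol gets: open a TCP connection to port 43, send the query followed by a carriage return and line feed, and read until the server hangs up. The sketch below shows that exchange plus a small parser for the “Key: value” lines most servers return. The function names are my own, the sample output is invented, and which whois server to ask is left to you as a parameter.

```python
import socket


def whois_query(query, server, port=43, timeout=10):
    """Send one whois query and return the server's reply as text."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'Key: value' lines into a dict of lists."""
    fields = {}
    for line in text.splitlines():
        if line.startswith(("%", "#")) or ":" not in line:
            continue
        key, value = line.split(":", 1)
        key, value = key.strip(), value.strip()
        if value:
            fields.setdefault(key, []).append(value)
    return fields


# Hypothetical whois output, made up for this example.
SAMPLE_WHOIS = """\
Domain Name: ANYCOMP.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.ANYCOMP.COM
Name Server: NS2.ANYCOMP.COM
DNSSEC: unsigned
"""

if __name__ == "__main__":
    record = parse_whois(SAMPLE_WHOIS)
    print(record["Name Server"])
```

Calling whois_query with a domain and the appropriate registry’s whois server, then feeding the reply to parse_whois, gets you the name servers and contact fields in a form that is easy to search.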

To try it for yourself, use your favorite search engine and type in whois. You’ll get millions of hits on everything from the use of the command line in Unix to websites performing the task for you. For example, the second response on my search returned www.whois.sc, a site I’ve used before. Open the site and type in mheducation.com (the site for McGraw-Hill Education, my publisher). You’ll find all kinds of neat information at the top of the page: registrant org, registrar, status, IP address, where it’s located, the server type hosting the site (Apache), date created (and last time the record was updated), how long they can keep the name without re-upping (expires June 8 of 2019, better get on it guys), and even how many image files on the site are missing alt tags (just one).

Scroll down, and the whois record itself is displayed. I’ve copied portions of it here for your review. Notice the administrative, technical, and registrant contact information displayed and how nicely McGraw-Hill ensured it was listed as a business name instead of an individual—way to go, guys! Additionally, notice the three main DNS servers for the namespace listed at the bottom, as well as that (ahem) notice on DNSSEC.

Images

Images

NOTE    As of December 2010, the Truth in Caller ID Act (www.fcc.gov/guides/caller-id-and-spoofing) stated a person who knowingly transmits misleading caller ID information can be hit with a $10,000 fine per incident.

If you do a search or two on some local business domains, I’d bet large sums of cash you’ll find individuals listed on many of them. And I’m sure a few of you are saying, “So what? What’s the big deal in knowing the phone number to reach a particular individual?” Well, when you combine that information with resources such as Spoofcard (www.spoofcard.com), you have a ready-made attack set up. Imagine spoofing the phone number you just found as belonging to the technical point of contact (POC) for the website and calling nearly anyone inside the organization to ask for information. Caller ID is a great thing, but it can also lead to easy attacks for a clever ethical hacker. Lots of whois outputs will give you all the phone numbers, e-mail addresses, and other information you’ll need later in your attacks.

EXAM TIP    You’re going to need to be familiar with whois output, paying particular attention to registrant and administrative names, contact numbers for individuals, and the DNS server names.

Another useful tool in the DNS footprinting toolset is an old standby, a command-line tool people have used since the dawn of networking: nslookup. This is a command that’s part of virtually every operating system in the world, and it provides a means to query DNS servers for information. The syntax for the tool is fairly simple:

Images

The command can be run as a single instance, providing information based on the options you choose, or you can run it in interactive mode, where the command runs as a tool, awaiting input from you.

For example, on a Microsoft Windows machine, if you simply type nslookup at the prompt, you’ll see a display showing your default DNS server and its associated IP address. From there, nslookup sits patiently, waiting for you to ask whatever you want (as an aside, this is known as interactive mode). Typing a question mark shows all the options and switches you have available. For example, the command

set type=mx

tells nslookup all you’re looking for are records on e-mail servers. Entering a domain name after that will return the mail servers DNS knows about for that namespace (their names and, typically, their IP addresses).
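If you’re curious what a tool like nslookup actually puts on the wire for a request like that, the following Python sketch hand-builds a standard DNS question for MX records. The record type numbers are the standard ones; the function name, the query ID, and the anycomp.com name are just for illustration, and nothing is sent anywhere.

```python
import struct

# Standard DNS record type numbers.
QTYPE = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "PTR": 12, "MX": 15, "AXFR": 252}


def build_query(name, qtype="MX", query_id=0x1234):
    """Build the bytes of a single-question DNS query."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question,
    # and zero answer, authority, and additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending in a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # Question: name, record type, class 1 (Internet).
    return header + qname + struct.pack("!HH", QTYPE[qtype], 1)


if __name__ == "__main__":
    packet = build_query("anycomp.com", "MX")
    print(len(packet), packet.hex())
```

An ordinary lookup like this one would go out as a UDP datagram to port 53. A zone transfer request (type AXFR) travels over TCP port 53 instead, with each message preceded by a two-byte length.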

The command nslookup can also provide for something known as a zone transfer. As stated earlier, a zone transfer differs from a “normal” DNS request in that it pulls every record from the DNS server instead of just the one, or one type, you’re looking for. To use nslookup to perform a zone transfer, first make sure you’re connected to the SOA server for the zone and then try the following steps:

1.   Enter nslookup at the command line.

2.   Type server <IPAddress>, using the IP address of the SOA. Press ENTER.

3.   Type set type=any and press ENTER.

4.   Type ls -d domainname.com, where domainname.com is the name of the zone, and then press ENTER.

Either you’ll receive an error code, because the administrator has done her job correctly, or you’ll receive a copy of the zone transfer, which looks something like this:

Images

The areas in bold are of particular importance. In the SOA itself, 2013090800 is the serial number, 86400 is the refresh interval, 900 is the retry time, 1209600 is the expiry time, and 3600 defines the TTL for the zone. If you remember our discussion on DNS poisoning earlier, it may be helpful to know the longest a bad DNS cache can survive here is one hour (3600 seconds). Also notice the MX record saying, “The server providing our e-mail is named mailsrv.anycomp.com,” followed by an A record providing its IP address. That’s important information for an attacker to know, wouldn’t you say?
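For practice reading a zone listing, here is a short Python sketch that groups records by type and matches each MX entry to its A record. The listing format is a simplified “name type data” layout and every record in it is invented (the addresses come from a documentation-only range), so expect to adjust the parsing for whatever your tool really prints.

```python
from collections import defaultdict


def group_records(listing):
    """Group 'name type data' lines by record type."""
    records = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        name, rtype, data = parts
        records[rtype.upper()].append((name, data.strip()))
    return records


def mail_servers(records):
    """Return (preference, host, ip) for each MX record, lowest preference first."""
    addresses = {name.rstrip(".").lower(): ip for name, ip in records.get("A", [])}
    result = []
    for _, data in records.get("MX", []):
        preference, target = data.split()
        ip = addresses.get(target.rstrip(".").lower())  # None if no A record listed
        result.append((int(preference), target, ip))
    return sorted(result, key=lambda entry: entry[0])


# Hypothetical zone listing for an invented zone.
SAMPLE_LISTING = """\
anycomp.com.          SOA  dnsrv1.anycomp.com. hostmaster.anycomp.com. 2013090800 86400 900 1209600 3600
anycomp.com.          NS   dnsrv1.anycomp.com.
anycomp.com.          MX   10 mailsrv.anycomp.com.
mailsrv.anycomp.com.  A    192.0.2.25
www.anycomp.com.      A    192.0.2.80
"""

if __name__ == "__main__":
    records = group_records(SAMPLE_LISTING)
    print(sorted(records))
    print(mail_servers(records))
```

Grouping by type first makes the exam-style questions (“which host handles e-mail?”, “which server is authoritative?”) a matter of looking in the right bucket.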

TIP    After finding the name servers for your target, type nslookup at the command prompt to get into interactive mode and then change to your target’s name server (by typing server servername). Performing DNS queries from a server inside the network might provide better information than relying on your own server.

Another option for viewing this information is the dig command utility. Native to Unix systems but available as a download for Windows systems (along with BIND 9), dig is used to test a DNS query and report the results. The basic syntax for the command looks like

dig @server name type

where server is the name or IP of the DNS name server, name is the name of the resource you’re looking for, and type is the type of record you want to pull.

You can add dozens of switches to the syntax to pull more explicit information. To see all the switches available, use the following at the command line:

dig -h

EXAM TIP    You need to know nslookup syntax and output very well. Be sure you know how to get into interactive mode with nslookup and how to look for specific information once there. You’ll definitely see it on your exam.

Network Footprinting

Discovering and defining the network range can be another important footprinting step to consider. Knowing where the target’s IP addresses start and stop greatly limits the time you’ll need to spend figuring out specifics later—provided, of course, your target operates in their own IP range. If your objective happens to run services in a cloud (and rest easy, dear reader, we have another entire chapter dedicated to cloud upcoming), this may prove somewhat frustrating, but at least you’ll know what you’re up against. One of the easiest ways to see what range the organization owns or operates in—at least on a high level—is to make use of freely available registry information.

For example, suppose you knew the IP address of a WWW server (easy enough to discover, as you just learned in the previous sections). If you simply enter that IP address in www.arin.net, the network range will be shown. As you can see in Figure 2-6, entering the IP address of www.mheducation.com (54.164.59.97) gives us the entire network range. In this case, the response displays a range owned and operated by Amazon services, indicating MH Education is making use of Amazon’s cloud services. ARIN also provides a lot of other useful information as well, including the administrative and technical point of contact (POC) for the IP range. In this case, as you can see in Figure 2-7, the contacts displayed point us, again, to Amazon web services POCs, letting us know MH Education is relying on Amazon’s security measures and controls (in part) to protect their resources.


Figure 2-6 Network range from ARIN


Figure 2-7 POC information from ARIN
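Once a registry hands you a start and stop address, Python’s ipaddress module can turn that into CIDR notation and tell you whether a given host falls inside it. In this sketch the range boundaries are an illustrative assumption (they are not read from the figures), and the function names are my own; only the web server address comes from the discussion above.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Summarize a first-to-last address range as a list of CIDR networks."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.ip_address(first), ipaddress.ip_address(last)
        )
    )


def in_range(ip, first, last):
    """Return True if ip falls between first and last, inclusive."""
    return ipaddress.ip_address(first) <= ipaddress.ip_address(ip) <= ipaddress.ip_address(last)


if __name__ == "__main__":
    # Assumed range boundaries, used here only as an example.
    first, last = "54.160.0.0", "54.175.255.255"
    for network in range_to_cidrs(first, last):
        print(network, network.num_addresses)
    print(in_range("54.164.59.97", first, last))
```

A range that size is a strong hint you’re looking at a hosting or cloud provider rather than the target’s own address space.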

Another tool available for network mapping is traceroute (or tracert hostname on Windows systems), which is a command-line tool that tracks a packet across the Internet and provides the route path and transit times. It accomplishes this by using ICMP ECHO packets (UDP datagrams in Linux versions) to report information on each “hop” (router) from the source to the destination. The first probe goes out with a TTL of 1, and each successive probe is sent with a TTL one higher than the last, so each one expires at the next hop along the path; the reply from that hop reveals its name and IP address. Using this, an ethical hacker can build a picture of the network. For example, consider a traceroute command output from my laptop here in Melbourne, Florida, to a local surf shop just down the road (names and IPs were changed to protect the innocent):

Images

A veritable cornucopia of information is displayed here. Notice, though, the entry in line 12, showing timeouts instead of the information we’re used to seeing. This indicates, usually, a firewall that does not respond to ICMP requests—useful information in its own right. Granted, it’s sometimes just a router that ditches all ICMP requests, or even a properly configured Layer 3 switch, but it’s still interesting knowledge. To test this, a packet capture device will show the packets as Type 11, Code 0 (TTL Expired) or as Type 3, Code 13 (Administratively Blocked).
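Real traceroute needs raw sockets and elevated privileges, so here is a simulation instead: a minimal Python sketch of how raising the TTL one step at a time reveals each hop, and how a device that refuses to answer shows up as a timeout. The path, the addresses, and the function name are all invented.

```python
def simulate_traceroute(path):
    """Simulate TTL-stepped probing along a known path.

    path is a list of (address, answers) tuples, one per hop, with the
    destination last. answers is False for a hop that silently drops probes.
    """
    results = []
    for ttl in range(1, len(path) + 1):
        # A probe sent with this TTL is decremented once per router,
        # so it reaches zero (and expires) at hop number ttl.
        address, answers = path[ttl - 1]
        if not answers:
            results.append((ttl, "*", "request timed out"))
        elif ttl == len(path):
            results.append((ttl, address, "destination reached"))
        else:
            results.append((ttl, address, "ICMP type 11, code 0 (TTL expired)"))
    return results


# Hypothetical path; the third hop behaves like a filtering firewall.
SAMPLE_PATH = [
    ("192.0.2.1", True),
    ("198.51.100.1", True),
    ("198.51.100.254", False),
    ("203.0.113.10", True),
]

if __name__ == "__main__":
    for hop in simulate_traceroute(SAMPLE_PATH):
        print(hop)
```

The silent third hop is the interesting one: the path clearly continues past it, so something is there, and it has been told not to talk.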

NOTE    Traceroute will often simply time out in modern networking because of filtering and efforts to keep uninvited ICMP from crossing the network boundary.

All this information can easily be used to build a pretty comprehensive map of the network between my house and the local surf shop down the road on A1A. As a matter of fact, many tools can save you the time and trouble of writing down and building the map yourself. These tools take the information from traceroute and build images, showing not only the IPs and their layout but also the geographic locations where you can find them. McAfee’s Visual Trace (NeoTrace to some) is one such example; others include Trout and VisualRoute. Other traceroute tools include Magic NetTrace, Network Pinger, GEO Spider, and Ping Plotter. Most of these tools have trial versions available for download. Take the plunge and try them—you’ll probably be amazed at the locations where your favorite sites are actually housed!

EXAM TIP    There can be significant differences in traceroute from a Windows machine to a Linux box. Windows uses the command tracert, whereas Linux uses traceroute. Also keep in mind that Windows is ICMP only, whereas Linux uses UDP (and can be made to use other options). Lastly, be aware that a route to a target today may change tomorrow. Or later today. Or in the next few seconds. Routes can be changed and played with by attackers like everything else.
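
The Windows-versus-Linux difference is easy to capture in a small helper if you script your footprinting. This is only a sketch: the function name and its parameters are my own, and it builds the command without running it. The -I switch shown is the option on common Linux traceroute builds that swaps the default UDP probes for ICMP ECHO.

```python
import platform

def build_trace_command(host, system=None, use_icmp=False):
    """Return the argv list for tracing a route to host.

    system defaults to platform.system(); pass "Windows" or "Linux"
    explicitly to see what the other platform would run.
    """
    system = system or platform.system()
    if system == "Windows":
        return ["tracert", host]       # Windows tracert sends ICMP ECHO
    command = ["traceroute"]           # UDP datagrams by default
    if use_icmp:
        command.append("-I")           # switch to ICMP ECHO probes
    command.append(host)
    return command

print(build_trace_command("example.com", system="Windows"))
print(build_trace_command("example.com", system="Linux", use_icmp=True))
```

The resulting list can be passed to subprocess.run, but only against hosts you have permission to test.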

Other Tools

Attempting to cover every tool ever invented aimed at footprinting is a fool’s errand; there are bajillions of tools out there, and we’d never get through them all. However, there are some more common options here and there, and since those are the more likely ones to be on your exam (and used in your day-to-day job), that’s where we should focus our attention. A few other tools worth mentioning are covered here as well.

OSRFramework

If you haven’t heard of OSRFramework (https://github.com/i3visio/osrframework) yet, you probably need to. Per the download site, OSRFramework is “… an open source research framework in Python that helps you in the task of user profiling by making use of different OSINT tools. The framework itself is designed reminiscent to the Metasploit framework. It also has a web-based GUI which does the work for you if you like to work without the command line.” In other words, it’s a set of libraries used to perform Open Source Intelligence (OSINT) tasks, helping you gather more, and more accurate, data using multiple applications in one easy-to-use package. What kind of data can you find? Things like user name, domain, phone number, DNS lookups, information leaks research, deep web search, and much more.

Here are the applications currently (as of this writing) found in OSRFramework:

•   usufy.py   This tool verifies if a user name/profile exists in up to 306 different platforms.

•   mailfy.py   This tool checks if a user name (e-mail) has been registered in up to 22 different e-mail providers.

•   searchfy.py   This tool looks for profiles using full names and other info in seven platforms. EC-Council words this differently, saying the tool queries the OSRFramework platforms itself.

•   domainfy.py   This tool verifies the existence of a given domain (per the site, in up to 1567 different registries).

•   phonefy.py   This tool checks, oddly enough, for the existence of phone numbers. It can be used to see if a phone number has been linked to spam practices.

•   entify.py   This tool looks for regular expressions.
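
To get a feel for what “looking for regular expressions” buys you, here is a rough Python sketch of the general idea: pulling e-mail addresses and IPv4 addresses out of a blob of text. To be clear, this is not OSRFramework code and does not use its API; the patterns are deliberately simplified, and the sample text is invented.

```python
import re

# Simplified patterns for illustration; real-world matching needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def extract_entities(text):
    """Return the unique e-mail addresses and IPv4-looking strings in text."""
    return {
        "emails": sorted(set(EMAIL_RE.findall(text))),
        "ipv4": sorted(set(IPV4_RE.findall(text))),
    }

# Invented sample text using reserved example domains and addresses.
sample = "Contact admin@example.com or noc@example.org; gateway is 192.0.2.1."
print(extract_entities(sample))
```

Point the same function at a saved web page or document and you have the beginnings of a contact list for the information matrix.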

NOTE    A relatively new offering the “cool kids” are playing with now is Buscador (https://inteltechniques.com/buscador). I haven’t seen it referenced in any courseware or other study materials yet, but it’s worth your time to check out.

Other Tools

Web spiders are applications that crawl through a website, reporting information on what they find. Most search engines rely on web spidering to provide the information they need in responding to web searches. However, this benign use can be employed by a crafty ethical hacker. As mentioned earlier, using a site such as https://news.netcraft.com can help you map out internal web pages and other links you may not notice immediately—and even those the company doesn’t realize are still available. One way web administrators can help to defend against standard web crawlers is to use robots.txt files at the root of their site, but many sites remain open to spidering.
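
It’s worth seeing robots.txt from both sides. The Python sketch below uses the standard library’s urllib.robotparser to decide whether a polite crawler may fetch a URL, and separately lists the Disallow entries, which tend to advertise exactly the paths an administrator would rather you not look at. The robots.txt content, the URLs, and the helper names are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt content for a hypothetical site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""

def disallowed_paths(robots_text):
    """List the Disallow entries found in a robots.txt body."""
    paths = []
    for line in robots_text.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths

def may_fetch(robots_text, url, agent="ExampleBot"):
    """Return True if a crawler honoring robots.txt may request url."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)

print(disallowed_paths(ROBOTS_TXT))
print(may_fetch(ROBOTS_TXT, "https://example.com/admin/login"))
print(may_fetch(ROBOTS_TXT, "https://example.com/index.html"))
```

Keep in mind the file is a request, not an access control: a spider that chooses to ignore it is not technically prevented from crawling anything.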

Two other tools of note in any discussion on social engineering and general footprinting are Maltego (which you can purchase) and Social Engineering Framework (SEF). Maltego (https://www.paterva.com/web7/) is “an open source intelligence and forensics application” designed explicitly to demonstrate social engineering (and other) weaknesses for your environment. SEF (http://spl0it.org/projects/sef.html) has some great tools that can automate things such as extracting e-mail addresses out of websites and general preparation for social engineering. SEF also has ties into Metasploit payloads for easy phishing attacks.

NOTE    Even though all the methods we’ve discussed so far are freely available publicly and you’re not breaking any laws, I’m not encouraging you to footprint or gauge the security of any local business or target. As an ethical hacker, you should get proper permission up front, as even passively footprinting a business can lead to some hurt feelings and a lot of red tape. And any misuse of potential PII (personally identifiable information) or other identifying material, purposeful or not, may lead to problems for you and your team. Again, always remain ethical in your work.

Compiling a complete list of information-gathering options in the footprinting stage is nearly impossible. The fact is, there are opportunities everywhere for this kind of information gathering. Don’t forget to include search engines in your efforts—you’d be surprised what you can find through a search on the company name (or variants thereof). Other competitive intelligence tools include Google Alerts, Yahoo! Site Explorer, SEO for Firefox, SpyFu, Quarkbase, and DomainTools.com. The list goes on forever.

Take some time to research these on your own. Heck, type footprinting tool into your favorite search engine and check out what you find (I just did and got more than 250,000 results), or you can peruse the lists compiled in Appendix A at the back of this book. Gather some information of your own on a target of your choosing, and see what kind of information matrix you can build, organizing it however you think makes the most sense to you. Remember, all these opportunities are typically legal (most of the time, anyway—never rely on a certification study book for legal advice), and anyone can make use of them at any time, for nearly any purpose. You have what you need for the exam already here—now go play and develop some skill sets.

NOTE    Hackers are very touchy folks when it comes to their favorite tools. Take our friendly tech editor as an example. He went nearly apoplectic when I neglected to mention Shodan. “It’s the hacker’s search engine!” Shodan is designed to help you find specific types of computers (routers, servers, and so on) connected to the Internet. For example, try out this search string: https://www.shodan.io/search?query=Server%3A+SQ-WEBCAM. You’re welcome.
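
If you’re wondering where the %3A and + in that search string come from, they’re just URL encoding of a colon and a space. A tiny Python sketch (the function name is my own) rebuilds the same link from the plain query text:

```python
from urllib.parse import urlencode

def shodan_search_url(query):
    """Build a Shodan web search URL from a plain-text query."""
    return "https://www.shodan.io/search?" + urlencode({"query": query})

print(shodan_search_url("Server: SQ-WEBCAM"))
```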

Regardless of which methods you choose to employ, footprinting is probably the most important phase of hacking you’ll need to master. Spending time in this step drastically increases the odds of success later and is well worth the effort. Just maintain an organized approach and document what you discover. And don’t be afraid to go off script—sometimes following the steps laid out by the book isn’t the best option. Keep your eyes, ears, and mind open. You’ll be surprised what you can find out.

Chapter Review

Vulnerability research, although not necessarily a footprinting effort per se, is an important part of your job as an ethical hacker. Research should include looking for the latest exploit news, any zero-day outbreaks in viruses and malware, and what recommendations are being made to deal with them. Some tools available to help in this regard are the National Vulnerability Database (https://nvd.nist.gov), Securitytracker (www.securitytracker.com), Hackerstorm Vulnerability Database Tool (www.hackerstorm.com), and SecurityFocus (www.securityfocus.com).

Footprinting is defined as the process of gathering information on computer systems and networks. It is the first step in information gathering and provides a high-level blueprint of the target system or network. Footprinting follows a logical flow—investigating web resources and competitive intelligence, mapping out network ranges, mining whois and DNS, and finishing up with social engineering, e-mail tracking, and Google hacking.

Competitive intelligence refers to the information gathered by a business entity about its competitors’ customers, products, and marketing. Most of this information is readily available and is perfectly legal for you to pursue and acquire. Competitive intelligence tools include Google Alerts, Yahoo! Site Explorer, SEO for Firefox, SpyFu, Quarkbase, and DomainTools.com.

DNS provides ample opportunity for footprinting. DNS consists of servers all over the world, with each server holding and managing records for its own namespace. DNS lookups generally use UDP port 53, whereas zone transfers use TCP 53. Each of these records gives directions to or for a specific type of resource. DNS records are as follows:

•   SRV (Service) Defines the hostname and port number of servers providing specific services, such as a Directory Services server.

•   SOA (Start of Authority) Identifies the primary name server for the zone. The SOA record contains the hostname of the server responsible for all DNS records within the namespace, as well as the basic properties of the domain.

•   PTR (Pointer) Maps an IP address to a hostname (providing for reverse DNS lookups).

•   NS (Name Server) Defines the name servers within your namespace.

•   MX (Mail Exchange) Identifies the e-mail servers within your domain.

•   CNAME (Canonical Name) Provides for domain name aliases within your zone.

•   A (Address) Maps a hostname to an IP address and is used most often for DNS lookups.
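
To see how a few of these record types work together, consider the following Python sketch. It models a tiny zone as a dictionary, follows a CNAME alias until it reaches an A record, and builds the reverse-lookup name under which a PTR record would live. The zone contents are entirely invented (example.com and documentation addresses), and the function names are assumptions for illustration; no real DNS queries are made.

```python
import ipaddress

# Hypothetical zone data for example.com; every name and address is invented.
ZONE = {
    ("example.com", "NS"): ["ns1.example.com"],
    ("example.com", "MX"): ["mail.example.com"],
    ("ns1.example.com", "A"): ["192.0.2.53"],
    ("mail.example.com", "A"): ["192.0.2.25"],
    ("www.example.com", "A"): ["192.0.2.80"],
    ("ftp.example.com", "CNAME"): ["www.example.com"],
}

def resolve(name, zone=ZONE, max_depth=5):
    """Follow CNAME aliases until an A record is found (or give up)."""
    for _ in range(max_depth):
        if (name, "A") in zone:
            return zone[(name, "A")]
        alias = zone.get((name, "CNAME"))
        if not alias:
            return []
        name = alias[0]
    return []

def ptr_name(ip):
    """Build the reverse-lookup name a PTR record would be stored under."""
    return ipaddress.ip_address(ip).reverse_pointer

print(resolve("ftp.example.com"))
print(ptr_name("192.0.2.80"))
```

Notice that the alias ftp.example.com and the name www.example.com land on the same address, which is exactly the situation a CNAME is meant for.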

The SOA record provides information on source host (hostname of the SOA server), contact e-mail (e-mail address of the person responsible for the zone file), serial number (revision number of the zone file), refresh time (the number of seconds a secondary DNS server will wait before asking for updates), retry time (the number of seconds a secondary server will wait to retry if the zone transfer fails), expire time (the maximum number of seconds a secondary server will spend trying to complete a zone transfer), and TTL (the minimum time to live for all records in the zone).
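
Those SOA fields always appear in the same order, which makes them easy to pick apart. Here is a minimal Python sketch that splits the data portion of an SOA record into named fields. The record shown is invented (it is not the one from any question in this chapter), and the names parse_soa and contact_email are my own. One detail worth knowing: in a zone file, the contact e-mail is written with a dot where the @ would normally be.

```python
from collections import namedtuple

SOA = namedtuple(
    "SOA", "source_host contact serial refresh retry expire minimum_ttl"
)

def parse_soa(rdata):
    """Split SOA record data (two names, then five numbers) into fields."""
    parts = rdata.split()
    if len(parts) != 7:
        raise ValueError("expected 7 SOA fields, got %d" % len(parts))
    host = parts[0].rstrip(".")
    contact = parts[1].rstrip(".")
    numbers = [int(p) for p in parts[2:]]
    return SOA(host, contact, *numbers)

def contact_email(soa):
    """Turn the zone-file contact (first dot stands for '@') into an address."""
    local, _, domain = soa.contact.partition(".")
    return f"{local}@{domain}"

# Invented record for illustration only.
record = "ns1.example.com. hostmaster.example.com. 2024010101 3600 600 86400 3600"
soa = parse_soa(record)
print(soa.source_host, soa.refresh, soa.retry, contact_email(soa))
```

Reading the numbers positionally (serial, refresh, retry, expire, TTL) is the same skill the exam expects when it hands you a raw SOA record.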

DNS information for footprinting can also be garnered through the use of whois, which originally started in Unix and has generated any number of websites set up specifically for its purpose. It queries the registries and returns information, including domain ownership, addresses, locations, and phone numbers. Well-known websites for DNS or whois footprinting include www.geektools.com, www.dnsstuff.com, and www.samspade.com.

The nslookup command is part of virtually every operating system in the world and provides a means to query DNS servers for information. The syntax for the tool is as follows:

Images

The command can be run as a single instance, providing information based on the options you choose, or you can run it in interactive mode, where the command runs as a tool, awaiting input from you. The command can also provide for a zone transfer, using ls -d. A zone transfer differs from a “normal” DNS request in that it pulls every record from the DNS server instead of just the one, or one type, you’re looking for.

Native to Unix systems but available as a download for Windows systems (along with BIND 9), dig is another tool used to test a DNS query and report the results. The basic syntax for the command is

dig @server name type

where server is the name or IP of the DNS name server, name is the name of the resource you’re looking for, and type is the type of record you want to pull.
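
If you script your DNS queries, the three pieces (server, name, and type) map straight onto a command line. The Python sketch below only assembles the arguments; it doesn’t run anything. The helper names are my own, and the server and zone values are placeholders. Requesting the AXFR type is how dig asks for a zone transfer.

```python
def build_dig_command(server, name, record_type="A"):
    """Assemble 'dig @server name type' as an argv list."""
    return ["dig", f"@{server}", name, record_type]

def build_zone_transfer_command(server, zone):
    """Assemble a zone transfer request (AXFR, carried over TCP port 53)."""
    return build_dig_command(server, zone, "AXFR")

# Placeholder values for illustration.
print(" ".join(build_dig_command("ns1.example.com", "www.example.com", "MX")))
print(" ".join(build_zone_transfer_command("ns1.example.com", "example.com")))
```

As always, attempt a zone transfer only against name servers you have explicit permission to test.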

Determining the network range is another important footprinting task for the ethical hacker. If you simply enter an IP address in www.arin.net, the network range will be shown. Additionally, traceroute (or tracert hostname on Windows systems) is a command-line tool that tracks a packet across the Internet and provides the route path and transit times. McAfee’s Visual Trace (NeoTrace to some), Trout, and VisualRoute are all examples of applications that use this information to build a visual map, showing geographical locations as well as technical data.

Don’t forget the use of the search engine in footprinting! Google hacking refers to manipulating a search string with additional specific operators to search for vulnerabilities. Here are some operators for Google hacking:

•   filetype   Syntax: filetype:type. This searches only for files of a specific type (DOC, XLS, and so on).

•   index of   Syntax: index of /string. This displays pages with directory browsing enabled and is generally used with another operator.

•   intitle   Syntax: intitle:string. This searches for pages that contain a string in the title. For multiple string searches, use the allintitle operator (allintitle:login password, for example).

•   inurl   Syntax: inurl:string. This displays pages with a string in the URL. For multiple string searches, use allinurl (allinurl:etc/passwd, for example).

•   link   Syntax: link:string. This displays linked pages based on a search term.

•   site   Syntax: site:domain_or_web_page string. This displays pages for a specific website or domain holding the search term.
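
These operators are just text in the search box, so combining them is simple string work. The following Python sketch (helper names are my own; example.com is a placeholder target) joins operator:value pairs into one search string and wraps it in a Google search URL.

```python
from urllib.parse import urlencode

def google_dork(**operators):
    """Join operator=value pairs into a single 'operator:value' search string."""
    return " ".join(f"{op}:{value}" for op, value in operators.items())

def google_search_url(query):
    """Wrap a search string in a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = google_dork(site="example.com", filetype="xls")
print(query)
print(google_search_url("allintitle:SQL version"))
```

Building the strings in code makes it easy to loop over a list of file types or keywords for a single target domain.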

Social engineering, e-mail tracking, and web spidering are also footprinting tools and techniques. Social engineering involves low- to no-tech hacking, relying on human interaction to gather information (phishing e-mails, phone calls, and so on). E-mail trackers are applications used to track data on e-mail whereabouts and trails. Web spiders are used to crawl sites for information; standard, well-behaved crawlers can be turned away by adding a robots.txt file to the root of the website.

Questions

1.   Which of the following would be the best choice for footprinting restricted URLs and OS information from a target?

A.   www.archive.org

B.   www.alexa.com

C.   Netcraft

D.   Yesware

2.   Which of the following consists of a publicly available set of databases that contain domain name registration contact information?

A.   IETF

B.   IANA

C.   Whois

D.   OSRF

3.   Which of the following best describes the role that the U.S. Computer Security Incident Response Team (CSIRT) provides?

A.   Vulnerability measurement and assessments for the U.S. Department of Defense

B.   A reliable and consistent point of contact for all incident response services for associates of the Department of Homeland Security

C.   Incident response services for all Internet providers

D.   Pen test registration for public and private sector

4.   An SOA record gathered from a zone transfer is shown here:

Images

What is the name of the authoritative DNS server for the domain, and how often will secondary servers check in for updates?

A.   DNSRV1.anycomp.com, every 3600 seconds

B.   DNSRV1.anycomp.com, every 600 seconds

C.   DNSRV1.anycomp.com, every 4 seconds

D.   postmaster.anycomp.com, every 600 seconds

5.   A security peer is confused about a recent incident. An attacker successfully accessed a machine in the organization and made off with some sensitive data. A full vulnerability scan was run immediately following the theft, and nothing was discovered. Which of the following best describes what may have happened?

A.   The attacker took advantage of a zero-day vulnerability on the machine.

B.   The attacker performed a full rebuild of the machine after he was done.

C.   The attacker performed a denial-of-service attack.

D.   Security measures on the device were completely disabled before the attack began.

6.   Which footprinting tool or technique can be used to find the names and addresses of employees or technical points of contact?

A.   whois

B.   nslookup

C.   dig

D.   traceroute

7.   Which Google hack would display all pages that have the words SQL and Version in their titles?

A.   inurl:SQL inurl:version

B.   allinurl:SQL version

C.   intitle:SQL inurl:version

D.   allintitle:SQL version

8.   Which of the following are passive footprinting methods? (Choose all that apply.)

A.   Checking DNS replies for network mapping purposes

B.   Collecting information through publicly accessible sources

C.   Performing a ping sweep against the network range

D.   Sniffing network traffic through a network tap

9.   Which OSRF application checks to see if a username has been registered in up to 22 different e-mail providers?

A.   mailfy.py

B.   usufy.py

C.   entify.py

D.   searchfy.py

10.   You have an FTP service and an HTTP site on a single server. Which DNS record allows you to alias both services to the same record (IP address)?

A.   NS

B.   SOA

C.   CNAME

D.   PTR

11.   As a pen test team member, you begin searching for IP ranges owned by the target organization and discover their network range. You also read job postings and news articles and visit the organization’s website. Throughout the first week of the test, you also observe when employees come to and leave work, and you rummage through the trash outside the building for useful information. Which type of footprinting are you accomplishing?

A.   Active

B.   Passive

C.   Reconnaissance

D.   None of the above

12.   A pen tester is attempting to use nslookup and has the tool in interactive mode for the search. Which command should be used to request the appropriate records?

A.   request type=ns

B.   transfer type=ns

C.   locate type=ns

D.   set type=ns

Answers

1.   C. Netcraft is the best choice here. From the site: “Netcraft provides internet security services including anti-fraud and anti-phishing services, application testing and PCI scanning.”

2.   C. Whois is a great resource to scour public information regarding your target. Registration databases contain data points that may be useful, such as domain registration, points of contacts, and IP ranges.

3.   B. CSIRT provides incident response services for any user, company, agency, or organization in partnership with the Department of Homeland Security.

4.   A. The SOA record always starts by defining the authoritative server—in this case, DNSRV1—followed by e-mail contact and a host of other entries. Refresh time defines the interval in which secondary servers will check for updates—in this case, every 3600 seconds (1 hour).

5.   A. A zero-day vulnerability is one that security personnel, vendors, and even vulnerability scanners simply don’t know about yet. It’s more likely the attacker is using an attack vector unknown to the security personnel than he somehow managed to turn off all security measures without alerting anyone.

6.   A. Whois provides information on the domain registration, including technical and business POCs’ addresses and e-mails.

7.   D. The Google search operator allintitle allows for the combination of strings in the title. The operator inurl looks only in the URL of the site.

8.   A, B. Passive footprinting is all about publicly accessible sources.

9.   A. The tool mailfy.py checks if a user name (e-mail) has been registered in up to 22 different e-mail providers. The choices usufy.py (verifies if a user name/profile exists in up to 306 different platforms), entify.py (looks for regular expressions), and searchfy.py (looks for profiles using full names and other info in seven platforms) are incorrect.

10.   C. CNAME records provide for aliases within the zone.

11.   B. All the methods discussed are passive in nature, per EC-Council’s definition.

12.   D. The syntax for the other commands listed is incorrect.
