Chapter 3

Introduction to Perl

Information in this chapter:

In this chapter we’ll be discussing one of the venerable standbys of the scripting world, Perl. Perl is, in theory, an acronym, standing for Practical Extraction and Report Language [1] or perhaps Pathologically Eclectic Rubbish Lister [2] depending on who we’re talking to and what mood he is in. The motto of Perl is “There’s more than one way to do it,” which is a reference to the very loose and open structure of Perl. Perl can enable us to create scripts ranging from elegant and cleanly laid out to purposefully obfuscated and complex.

Due to the plastic nature of scripts developed in Perl, a common response to the viewing of Perl code is an expression of utter confusion as we attempt to parse what the code is doing, an effect sometimes deliberately amplified by the developer. To illustrate the character of some Perl developers, the now defunct magazine, The Perl Journal, even ran a contest called the Obfuscated Perl Contest for several years, with the main goal being to develop the most incomprehensible, yet functional, Perl code.

We’ll be discussing what we might use Perl for in the course of penetration testing. We will go over some of the basics of the Perl language and build a few simple tools to carry out tasks of a penetration testing-related nature. Ultimately, we will sum up our discussion with details on assembling a Simple Network Management Protocol (SNMP) scanner in Perl, and talk about what we might do with it in the course of penetration testing, and how we can improve it.

Where Perl is Useful

If we look at the “official” composition of the Perl acronym, Practical Extraction and Report Language, we can begin to get an idea of where Perl might be useful to us. The original intent for Perl was largely aimed at developing an improved tool for dealing with text. Over time, Perl evolved into the “Swiss Army chainsaw of scripting languages” [3] and is now used in a wide variety of applications from commercial software to a plethora of home-brew hacks. In general, Perl is a great tool for handling and manipulating data.

Handling text

Perl was originally intended to ease the compiling of reports from a variety of data sources. Although Perl has grown considerably from these humble beginnings, such tasks are still at the heart of its feature set. Perl is a great tool for manipulating text and data generically, a task we often run across in the course of penetration testing and security activities in general.

Perl is a fantastic tool for parsing text and pattern matching. There are many cases in the course of a penetration test where we might want to search through directories and the files they contain for some specific text, or text patterns.

For example, if we are searching for improperly exposed Personally Identifiable Information (PII) in e-mail logs, we may find ourselves in need of a tool to parse e-mail messages in search of our target data. Using Perl, we can construct a simple script to parse the common format of such messages and use regular expressions (we’ll come back to these later in the chapter) to search for patterns matching sensitive data such as Social Security numbers or payment card numbers. In fact, since Perl has been around for such a long time and is often used for such tasks, we may find such tools already existing with a brief amount of searching, and we may simply be able to make use of existing code or modules in our scripts.
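The pattern search described above can be sketched in a few lines. This is a minimal illustration, not a complete PII scanner: the function name find_ssn_like and the sample message are hypothetical, and the pattern matches only the shape of a Social Security number without validating it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every substring shaped like a U.S. Social Security number
# (three digits, two digits, four digits, separated by hyphens).
# This checks the shape only; it does not validate the number.
sub find_ssn_like {
    my ($text) = @_;
    my @hits = $text =~ /\b(\d{3}-\d{2}-\d{4})\b/g;
    return @hits;
}

# Hypothetical sample message body, used only for illustration.
my $sample = "From: hr\@example.com\nBody: employee SSN is 078-05-1120, ext. 555-1234\n";
print "possible SSN: $_\n" for find_ssn_like($sample);
```

In a real engagement we would feed each message body to the function and review the hits by hand, since a shape-only match will produce false positives.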

Perl is also a very useful tool for manipulating text. Perl replicates many of the features and functionality of other text manipulation tools such as sed and awk, and is also capable of passing output to external commands in the operating system if we find a task Perl cannot handle using its internal set of features.
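As a rough sketch of that overlap with sed and awk, the two functions below perform a sed-style substitution and an awk-style field extraction. The function names and sample strings are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# sed-style: replace all but the last four digits of a 16-digit number with X.
sub mask_card_number {
    my ($line) = @_;
    $line =~ s/\b\d{12}(\d{4})\b/XXXXXXXXXXXX$1/g;
    return $line;
}

# awk-style: return one whitespace-separated field, counting from 1 as awk does.
sub field {
    my ($line, $n) = @_;
    my @fields = split ' ', $line;
    return $fields[$n - 1];
}

print mask_card_number("card 4111111111111111 approved"), "\n";
print field("alice 10.0.0.5 login", 2), "\n";
```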

Gluing applications together

Perl is often referred to as a glue language. We can frequently see examples of Perl code that exist solely to take the output of one application, perform some operation on the output, and feed the resultant data to another application. We might use such a technique to migrate data from one tool to another, to handle differing data formats between tools, to create reports incorporating data from a variety of sources, such as we might find in a penetration test report, and other similar tasks.

One of the situations we run across frequently in the penetration testing world is the need to deal with the wide variety of output file formats produced by the different tools we may be using. In an average penetration test we may use Nmap, Burp Suite, OpenVAS, and command-line tools such as dig and whois as well as custom-built tools. It is very common to need to pass data from one tool to another, or compile the data from multiple tools into a single format for a report.

A great example of such a use for penetration testing can be found in the Nmap::Parser1 Perl module (we’ll talk more about modules later in the chapter) that exists to provide us an interface to the output from Nmap scans. In the case where we might want to perform an Nmap scan, and then examine the results of the scan and perform additional activities based on those results, this bit of Perl code can considerably ease our task. For example, if we want to conduct an Nmap scan and then run the Web assessment tool Nikto against any targets that have a Web server operating on port 80 or port 443, Perl can provide us with an easy route for doing so. We can certainly develop our own code to parse out the format of Nmap’s output files, but one of the chief attributes of any good coder and/or penetration tester is a certain amount of well-applied laziness. One of the great things about using one of the common scripting languages, such as Perl, Ruby, or Python, to put together such solutions is that we can stand on the shoulders of giants when assembling a solution.
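To make the Nmap-to-Nikto idea concrete without depending on any module, here is a minimal sketch that reads lines of Nmap grepable output (produced with nmap -oG) and prints one Nikto command per open Web port. The function name web_targets and the sample scan line are hypothetical, and the script prints the commands rather than running them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return [host, port] pairs for every host with port 80 or 443 open,
# given lines of Nmap grepable output.
sub web_targets {
    my @lines = @_;
    my @targets;
    foreach my $line (@lines) {
        # Only lines with a Ports: field matter; the field ends at the next tab.
        next unless $line =~ /^Host:\s+(\S+).*?Ports:\s+([^\t]*)/;
        my ($host, $ports) = ($1, $2);
        foreach my $entry (split /,\s*/, $ports) {
            my ($port, $state) = split m{/}, $entry;
            next unless defined $state && $state eq 'open';
            push @targets, [$host, $port] if $port == 80 || $port == 443;
        }
    }
    return @targets;
}

# Hypothetical sample line in the grepable format.
my @scan = ("Host: 10.0.0.5 ()\tPorts: 22/open/tcp//ssh///, 80/open/tcp//http///");

# Print the Nikto commands instead of executing them.
foreach my $target (web_targets(@scan)) {
    my ($host, $port) = @$target;
    print "nikto -h $host -p $port\n";
}
```

A module such as Nmap::Parser saves us from writing and maintaining this kind of parsing ourselves, which is the point of the laziness argument above.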

Working with Perl

There are a number of variations of the Perl interpreter, and a number of versions of each variation. We may find Perl distributions such as ActiveState Perl,2 Strawberry Perl,3 and the “official” Perl distribution from perl.org,4 or any of a number of others. Most of these Perl distributions will be, to a large extent, very similar as long as we are dealing with the same major version of Perl and are using it on the same type of platform (Microsoft versus UNIX-like). Such distributions of Perl are usually tweaked in some fashion for the platform on which they are intended to run, or have additional utilities such as development environments and similar such tools.

The version of Perl we will be using for the examples in this chapter is the one that ships with the BackTrack 5 Linux distribution and is the stock Perl Version 5.10 that can be downloaded from www.perl.org. Any similar distribution of Perl should also suffice, but we may potentially begin to find issues with the packages installed by default if we stray too far from this.

Editing tools

As with most any script and scripting language we will be discussing in this book, we are relatively open in terms of our choice of editing tools. The examples in this chapter were constructed using the Kate editor on BackTrack 5, as we discussed in Chapter 1 when we talked about shell scripting. For those on Windows operating systems, there are a number of editors that will suffice for development, including the truly great editor UltraEdit,5 and the Windows port of Kate,6 as shown in Figure 3.1.

FIGURE 3.1 Kate on Windows

Extending Perl scripts

Perl is a great tool for many things, but, like many other scripting languages, it is limited by the environment in which it is designed to function. In order to run a Perl script, we need to have an interpreter on the system on which it will be run, and we are constrained to what we can do with a command-line interface. For both of these issues, in terms of Perl scripts, we can cheat a bit to get around them.

Compiling Perl scripts

As we’ve discussed in the book thus far, interpreted programming languages require a tool to process the script when it is run, namely the interpreter. If we are in a situation where we do not have an interpreter on the machine on which we wish to run our scripts, which is not an unlikely situation in a penetration test, we have a bit of a problem. Fortunately, this is something we can cope with in some scripting languages, including Perl. We can simply compile our script.

Now wait a minute, didn’t we just discuss how Perl was an interpreted language, one that, by definition, does not get compiled? Yes, we did indeed. What we can do in such cases is not compiling in the strictest sense of the term, but the end results are the same. There are a number of tools, including PerlApp7 from ActiveState and Perl2Exe8 from IndigoStar, which will, in essence, wrap up a small copy of a Perl interpreter, our script, and any modules or other dependencies, and generate an executable binary for a variety of platforms. This can be a very handy capability when we can’t control or change the environment on the system we are using.

GUIs in Perl

The vast majority of the time, Perl scripts are used to produce command line-driven applications. Sometimes, however, we may need to create a tool with a graphical user interface (GUI). For instance, if we are dealing with a technically challenged user, or we need to add an interface to an existing tool, a GUI might be just what we need.

There are actually quite a few different libraries through which we can access such capabilities in Perl. We can make use of Perl/Tk to access the Tk widget toolkit, access the features of the GIMP Toolkit (GTK+), or use any of a number of others. One of the more convenient features of using such tools is that we can generate GUI-driven Perl tools that can be used among several platforms without needing to modify the graphical portions of the script.

Perl Basics

In this section, we’ll be going over some of the basic structures that Perl uses, including variables, how to run shell commands, how to use modules to extend Perl, arguments, control statements, file input and output, and the use of regular expressions.

Hello World

Okay, here we go with the standard Hello World script in Perl:

#!/usr/bin/perl

print "hello world!\n";

After creating the file, setting the permissions to make it executable, and running the script by issuing a command such as ./helloworld.pl, we should see output similar to that shown in Figure 3.2.

FIGURE 3.2 Hello World Script and Output

Let’s take a look at this very simple script. We start with the shebang, as we discussed in Chapter 1. In this case, we can generally find the Perl interpreter located at /usr/bin/perl on most UNIX-like systems. If it isn’t there, we can try to find it with whereis perl or we can manually check other common directories such as /usr/local/bin, /bin, or /opt/bin/. If all else fails and we have administrative access on the system, we can do a find for it, by issuing something like find / -name perl -print.

TIP

The sharp-eyed among us who looked at Figure 3.2 might point out that we named the file containing our script helloworld.pl. While the .pl extension on the filename is not needed or necessarily used on UNIX-like systems, it does serve to alert us that the script might contain Perl code without us having to open it. There are differing opinions on whether this makes sense in a UNIX-like environment, but it’s really a matter of personal preference in most cases.

On the second line of the script, we simply print out a string. There are a couple of odd bits in there that might be unfamiliar to those of us who have not dealt with Perl previously. The first is the \n at the end of the string we will be echoing out. The \n indicates we are inserting a newline at the end of our string. There are a number of similar character combinations we can use in Perl. Table 3.1 lists several of the commonly used combinations.

Table 3.1 Special Characters in Perl

Character Behavior
\a Bell
\b Backspace
\e ESC
\f Form feed
\n Newline
\r Carriage return
\t Tab

We might also notice that the output line of our script ends in a semicolon. In Perl, with a few odd exceptions, every statement needs to end in a semicolon. Notice we said “statement” here, not “line.” We can also have multiple statements on one line, or a multiline statement. There are a few cases where we could get away without using a semicolon, but we are generally okay by putting it in anyway, just to be consistent.

Variables

Perl, of course, also has a number of data structures we can use to store things, the most common among them being the scalar variable. Scalar variables in Perl are always addressed with a $, whether altering the contents or simply reading them, and we can use a variable to store and manipulate a variety of content without having to declare it to be of a particular type, such as a string or integer. We can make use of variables in this simple script:

#!/usr/bin/perl

print "Hi. What is your name? ";

$name = <STDIN>; #take in a name

chomp $name;

print "$name is a nice name\n";

We used a variable here, and a couple of other new things as well, so let’s walk through them. We have the shebang on the first line, as usual, so that we use the proper interpreter. On the second line, we echo out a string to ask for a name. The next line defines our variable, $name, and waits for input from standard input, <STDIN>, which is the console where we will type when the script runs. This will place whatever we type at the console, until we press Enter, into our variable. Also notice our comment at the end of the line, starting with a pound sign, #. This will prevent the rest of the line from being interpreted when the script runs.

The next line makes use of chomp, a very handy Perl function. When we took the input from <STDIN>, we ended it by pressing Enter. The newline represented by Enter (also known as \n) was also fed into our variable, and will show in our last output statement if we don’t get rid of it, which is what chomp does. While chomp is specific to the newline, we could have also used chop, which will get rid of the last character, whatever it happens to be.

The last line of our script will echo out a compliment and make use of the data we stored in $name.
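The difference between chomp and chop is easiest to see side by side. In this small sketch, the wrapper functions with_chomp and with_chop are hypothetical names used only so that each behavior can be shown on a copy of the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a copy of the string with a trailing newline removed, if there is one.
sub with_chomp { my ($s) = @_; chomp $s; return $s; }

# Return a copy of the string with its last character removed, whatever it is.
sub with_chop { my ($s) = @_; chop $s; return $s; }

print "[", with_chomp("Alice\n"), "]\n";   # the newline is removed
print "[", with_chomp("Alice"), "]\n";     # no newline, so nothing changes
print "[", with_chop("Alice"), "]\n";      # the final letter is removed
```

Because chomp only ever removes a trailing newline, it is the safer choice for cleaning up input from <STDIN>.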

Shell commands

Another interesting and very handy tool we can make use of is the ability to execute shell commands, very similar to what we did in Chapter 1 with bash scripts. We’ll take a look at how we can execute a shell command, and we’ll also take a look at using Perl to manipulate the timestamps of a file.

#!/usr/bin/perl

$file = "testfile";

`touch $file`;

 

$origaccessed = (stat($file))[8];

$origmodified = (stat($file))[9];

print "original accessed = $origaccessed\n";

print "original modified = $origmodified\n";

 

sleep(5);

`touch $file`;

 

$newaccessed = (stat($file))[8];

$newmodified = (stat($file))[9];

print "new accessed = $newaccessed\n";

print "new modified = $newmodified\n";

 

utime $origaccessed, $origmodified, $file;

 

$finalaccessed = (stat($file))[8];

$finalmodified = (stat($file))[9];

print "final accessed = $finalaccessed\n";

print "final modified = $finalmodified\n";

We start, of course, with the shebang. We then set up a variable, $file, and put the string testfile into it. This will be the filename we will be working with in the script. On the next line, we make use of backticks, just as we did in Chapter 1 with bash. In this case, our backticks enclose the Linux command touch, which will update the timestamps on the filename we have stored in $file, and will also create the file if it does not already exist.
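Backticks also hand back whatever the command wrote to standard output, which the script above simply discards. The sketch below captures that output along with the exit code; the function name run_command is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a shell command and return its output and exit code.
sub run_command {
    my ($command) = @_;
    my $output = `$command`;     # backticks return the command's standard output
    my $exit_code = $? >> 8;     # $? holds the wait status; the exit code is in the high byte
    return ($output, $exit_code);
}

my ($out, $code) = run_command("echo hello");
print "output: $out";
print "exit code: $code\n";
```

Checking the exit code is worthwhile when a later step depends on the command having succeeded.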

Next, we will make a copy of the timestamps that already exist on our file. We will do this using the stat command, which is built into Perl. The stat command will return quite a bit of information9 to us about the file, including the timestamps in which we are interested. Because stat returns a list, we can index into its result as though it were an array in order to access the particular bits in which we are interested. Here we will make use of stat like (stat($file))[8], for example, in order to access the element at index 8, the accessed timestamp for the file. We would do likewise with index 9 for the modified time. Since Perl counts from zero, these are the ninth and tenth items in the list. For each of these items, we feed the results of stat into our variables, $origaccessed and $origmodified, respectively. We then echo out the contents of these two variables in order to display the starting timestamps for our file.

NOTE

The timestamps returned by stat are returned to us in time measured in seconds since the epoch. This is a fancy way of saying “the number of seconds that have elapsed since January 1, 1970, 00:00:00 UTC.” [4] We’ll talk more about how to render these into human-readable time later in this section.
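As a quick illustration of what an epoch value means, this sketch formats two epoch timestamps in UTC using strftime from the POSIX module that ships with Perl. The function name format_epoch_utc is an assumption for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch timestamp as a UTC date and time.
sub format_epoch_utc {
    my ($epoch) = @_;
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($epoch));
}

print format_epoch_utc(0), "\n";               # the epoch itself
print format_epoch_utc(1_000_000_000), "\n";   # one billion seconds later
print scalar(localtime(time)), "\n";           # the current time in the local time zone
```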

After recording and echoing the starting timestamps, we then wait five seconds using the sleep command, and execute the touch command to update the timestamps on our file once more. We use the sleep command so that we can see a slightly bigger difference in the timestamps for the next portion of the script.

Next we go through another cycle of using stat to retrieve our newly updated timestamps from the file, record those to a variable, and display them out once more. We should see that the timestamps have changed slightly, largely due to our use of sleep to cause a short delay.

Since we have a record of the original timestamps for the file, we can make use of the utime command to reset the altered timestamps for the file back to the originals. The utime command is very simple to use, and we can just feed the raw timestamp data we have recorded in $origaccessed and $origmodified right back into utime. Simply enough, we use utime by including the code utime $origaccessed, $origmodified, $file;. Next we make one more pass at getting and displaying the timestamps, and we should see the original timestamps as a result, as shown in Figure 3.3.

FIGURE 3.3 Timestamps Script Output
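The same save, change, and restore cycle can be tried safely on a throwaway file. This is a minimal sketch that uses the core File::Temp module and sets the timestamps directly with utime instead of calling touch and sleep; the function names file_times and set_file_times are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return the (accessed, modified) timestamps of a file as epoch seconds.
sub file_times {
    my ($path) = @_;
    my @info = stat($path) or die "cannot stat $path: $!";
    return @info[8, 9];
}

# Set both timestamps; utime returns the number of files it changed.
sub set_file_times {
    my ($path, $accessed, $modified) = @_;
    return utime($accessed, $modified, $path) == 1;
}

my ($fh, $path) = tempfile(UNLINK => 1);                 # scratch file, removed on exit
my ($orig_accessed, $orig_modified) = file_times($path);
set_file_times($path, 1_000_000_000, 1_000_000_000);     # simulate a change
printf "changed:  %d %d\n", file_times($path);
set_file_times($path, $orig_accessed, $orig_modified);   # put the originals back
printf "restored: %d %d\n", file_times($path);
```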

We can do a couple of things to make this script a bit better. We can do something with the timestamps so that they are more easily understood, and we can make use of functions so that we don’t have to keep repeating the code that fetches and prints the timestamps from the file.

#!/usr/bin/perl

$file = "testfile";

`touch $file`;

print "original timestamps\n";

($origaccessed, $origmodified) = get_timestamps($file);

sleep(5);

`touch $file`;

print "modified timestamps\n";

get_timestamps($file);

utime $origaccessed, $origmodified, $file;

print "final timestamps\n";

get_timestamps($file);

sub get_timestamps

{

$timestampfile = @_[0];

@timestamps[0] = (stat($timestampfile))[8];

@timestamps[1] = (stat($timestampfile))[9];

print "accessed = ", scalar localtime(@timestamps[0]),"\n";

print "modified = ", scalar localtime(@timestamps[1]),"\n";

return @timestamps;

}

Although this looks quite a bit different, it’s largely the same script. We can see a new function at the end, containing our code that fetches and prints the accessed and modified timestamps, which is where most of the changes are. Let’s start by taking a quick look at line 5 where we call the function ($origaccessed, $origmodified) = get_timestamps($file);. In this case, we are doing two things: We are calling the function with the name get_timestamps and passing the contents of the variable $file to it (our filename), and we are taking the output generated by calling that function and placing it in the variables $origaccessed and $origmodified.

Let’s take a quick look at the function. In Perl, functions start with sub, then the function name and curly brackets, {}, to enclose the contents of the function. In this case, our function is called get_timestamps. The first line inside the function, $timestampfile = @_[0], might look like a bit of an oddity. In Perl, @_ is the array that holds arguments. Using it here, we have populated the variable $timestampfile with the contents of the first element of the arguments array, the filename we passed when we called this function. On lines 9 and 12 of our script, we can also see the same function addressed in a slightly different way, as get_timestamps($file);. In these cases, we do not care about storing the returned results in a variable, so we just call the function to print out the current timestamps for the file.
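The same pattern of passing arguments in through @_ and handing a list back can be seen in a smaller function. The name min_max and the port numbers are made up for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the smallest and largest of the numbers passed in.
sub min_max {
    my @numbers = @_;                              # copy the arguments array
    my ($min, $max) = ($numbers[0], $numbers[0]);
    foreach my $n (@numbers) {
        $min = $n if $n < $min;
        $max = $n if $n > $max;
    }
    return ($min, $max);                           # a list, just as get_timestamps returns one
}

# Capture both returned values, as the script does with its two timestamps.
my ($lowest, $highest) = min_max(443, 22, 8080, 80);
print "lowest = $lowest, highest = $highest\n";
```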

Arguments

Although we have briefly discussed the use of arguments within a function, we have not talked about how to use them to pass arguments to the script when it runs. For the timestamp tool we’ve been working on, it might be handy to be able to pass it a filename from the command line, rather than having the filename hard-coded into the script. Let’s take a look at how to do that.

In the case of passing arguments from the command line, once again, Perl uses an array to hold them, called @ARGV. Modifying our timestamp script to make use of this is a simple task. On line 2 of the script, where we presently have $file = "testfile";, we simply need to change it to read $file = @ARGV[0];. The rest of the script remains exactly the same, but when we execute it we now need to provide a filename, something like ./timestamps3.pl testfile2.

WARNING

In Perl, the elements of an array are properly accessed by addressing them as a scalar variable with a $, such as $array[0] to access the first element of @array. Writing @array[0] actually creates a one-element array slice. In most cases this will work, as we have done in this chapter, but because assigning to a slice puts the right-hand side in list context, it may occasionally fail in odd and unexpected ways. If we do a bit of searching, we can find proponents of either method, but we should be aware of the “right” way to use arrays.
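One way to see the difference is to assign the result of a context-sensitive function to both forms. In this sketch, context and assign_both are hypothetical helper names, and use warnings is left off on purpose because Perl would otherwise point out that the slice is better written as a scalar element.

```perl
#!/usr/bin/perl
use strict;

# Report the context in which this function was called.
sub context { return wantarray ? "list" : "scalar"; }

# Assign the same call to an array element and to a one-element slice.
sub assign_both {
    my (@element, @slice);
    $element[0] = context();   # element assignment: the call sees scalar context
    @slice[0]   = context();   # slice assignment: the call sees list context
    return ($element[0], $slice[0]);
}

my ($from_element, $from_slice) = assign_both();
print "element form saw: $from_element\n";
print "slice form saw: $from_slice\n";
```

Functions such as localtime, and reads from <STDIN>, behave differently in list context, which is where the odd failures tend to come from.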

Just as we will see with several of the other scripting languages in the book, the @ARGV array will contain the arguments passed in at the command line, in order, one in each element of the array. So, if we wanted to pass multiple filenames to the script, we would look for the first in @ARGV[0], the second in @ARGV[1], and so on. We would, of course, also need to change the script a bit to handle multiple files.
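A minimal sketch of handling several filenames might loop over @ARGV as shown below. The function name describe_files is hypothetical, and the script only reports the modified time of each file instead of changing anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one report line per filename: its modified time, or a note if it is missing.
sub describe_files {
    my @files = @_;
    my @lines;
    foreach my $file (@files) {
        if (-e $file) {
            my $modified = (stat($file))[9];
            push @lines, "$file modified " . scalar localtime($modified);
        } else {
            push @lines, "$file not found";
        }
    }
    return @lines;
}

# Every command-line argument is treated as a filename.
print "$_\n" for describe_files(@ARGV);
```

We would run this with something like ./timestamps4.pl testfile testfile2, where the script name is again only an example.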

Control statements

Now we’ll tackle using control statements in Perl. We will go over how we can make use of conditionals in order to make decisions in our Perl script, as well as how we can make use of the various loops available to us. As we go along, we will build up a port scanner we can use as the basis for our final project in the chapter.

Conditionals

Our main conditional in Perl revolves around the if statement. The if statement in Perl is structured like this:

if (condition){

#execute code

}else{

#execute different code

}

Let’s quickly put something together with that:

#!/usr/bin/perl

use Net::Ping;

 

$host = "10.0.0.1";

 

$pinger = Net::Ping->new("icmp", 1, 64);

if ($pinger->ping($host)) {

print "$host is up\n";

} else {

print "$host is down\n";

}

We have a few new things in here, and some that should be relatively familiar by now from having looked at shell scripting in Chapter 1.

Line 1 is, of course, our shebang to point at the proper interpreter. Next we have use Net::Ping;. This is the first time we have looked at modules in Perl, so we’ll talk about them for a second. A Perl module is a self-contained chunk of Perl code, generally constructed to serve some specific purpose. The module we use here, Net::Ping, specifically exists to perform ping functions. We can think of a Perl module as being an extension of the idea of using functions. We make use of functions so that we don’t have to repeat the same code over and over, and we can tuck it off to the side somewhere. Perl modules are based on much the same concept, just on a generally larger scale. We make use of modules with the use statement and then the module name, as we did in our line containing Net::Ping earlier, with Net::Ping being the module name.
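The use statement works the same way for any module. As a small sketch with a module that ships with Perl, the following imports two functions from List::Util; the function name summarize and the sample numbers are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(sum max);   # import only the functions we need

# Return the total and the largest of a list of numbers.
sub summarize {
    my @values = @_;
    return (sum(@values), max(@values));
}

my @response_ms = (12, 48, 7, 33);   # hypothetical round-trip times
my ($total, $slowest) = summarize(@response_ms);
print "total = $total, slowest = $slowest\n";
```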

Next we set up the variable $host with an IP address to feed to our ping module, and then set up the line that creates the object that will actually conduct our pings. In this case, we’ll call the ping object $pinger, and we’ll tell Net::Ping to make a new instance of the object that uses Internet Control Message Protocol (ICMP) pings, waits up to one second for a reply, and includes 64 bytes of data in each packet. Note that Net::Ping generally needs root privileges to send ICMP pings.

We next set up an if statement that attempts to ping the value in $host. The ping method returns a true value if the host answers within the timeout and a false value if it does not. If we get a true value, we echo a message indicating the host is up; otherwise, we echo a message indicating it is down. It’s pretty simple code, but it functions nicely to ping a host, as we can see in Figure 3.4.

FIGURE 3.4 If Script Output
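When there are more than two outcomes, Perl lets us chain tests with elsif. The sketch below sorts a port number into the standard IANA ranges; the function name classify_port is an assumption for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Classify a TCP or UDP port number by its IANA range.
sub classify_port {
    my ($port) = @_;
    if ($port < 0 || $port > 65535) {
        return "invalid";
    } elsif ($port <= 1023) {
        return "well-known";
    } elsif ($port <= 49151) {
        return "registered";
    } else {
        return "dynamic";
    }
}

foreach my $port (80, 8080, 50000) {
    print "$port is ", classify_port($port), "\n";
}
```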

Next, we’ll look at adding a bit of looping to make our script more useful.

Looping

Looping in Perl is, similarly to the conditional statements we discussed, comparable to what we saw when constructing shell scripts for bash in Chapter 1. The basic structure of our most common loop, the for loop, is:

for (starting value;test;alter value){

#code goes here

}

So, if we wanted to do something simple like count to 10, we could set up a loop similar to this:

#!/usr/bin/perl

for($counter=1;$counter<=10;$counter++){

print "the counter is ", $counter, "\n";

}

In order to set up the for loop here, we set our variable $counter to 1, indicating the starting place for our loop. We then set up our test, checking to see whether the value stored in $counter is less than or equal to 10. Lastly, we increment the value in the counter variable by 1, using $counter++.
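Perl also offers a foreach loop over a range, which does the same counting with less bookkeeping. This sketch shows both forms; the function names count_up and count_up_range are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count with the three-part for loop: start value, test, alter value.
sub count_up {
    my ($limit) = @_;
    my @seen;
    for (my $counter = 1; $counter <= $limit; $counter++) {
        push @seen, $counter;
    }
    return @seen;
}

# Count with foreach and the range operator.
sub count_up_range {
    my ($limit) = @_;
    my @seen;
    foreach my $counter (1 .. $limit) {
        push @seen, $counter;
    }
    return @seen;
}

print join(" ", count_up(10)), "\n";
print join(" ", count_up_range(10)), "\n";
```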

Let’s make our ping code from earlier in the chapter a bit more functional. We can use a bit of looping to turn our single-shot ping tool into a ping sweep tool. This is going to get a bit heavier quickly, due to some magic we need to perform in order to increment IP addresses properly, but hang in there, we’ll walk through the script and explain it all.

#!/usr/bin/perl

use Net::Ping;

 

$ip1 = @ARGV[0];

$ip2 = @ARGV[1];

$rawip1 = get_raw_address($ip1);

$rawip2 = get_raw_address($ip2);

 

for ($counter = $rawip1;$counter<=$rawip2;$counter++){

   $host = get_ip_address($counter);

   next unless $host; # get_ip_address returns 0 for .0 and .255 addresses, so skip them

   $pinger = Net::Ping->new("icmp", 1, 64);

   if ($pinger->ping($host)) {

   print "$host is up\n";

   } else {

   print "$host is down\n";

   }

}

 

###### get_raw_address ######

#get the raw version of an IP

sub get_raw_address {

 

   my $ipaddress;

   my $oct1;

   my $oct2;

   my $oct3;

   my $oct4;

   my $retval;

 

   $ipaddress = shift;

   ($oct1, $oct2, $oct3, $oct4) = split /\./, $ipaddress;

   $retval = $oct4 + ($oct3 * 2**8) + ($oct2 * 2**16) + ($oct1 * 2**24);

   return $retval;

}

 

###### get_ip_address ########

#get the regular version of an IP

sub get_ip_address {

 

   my $rawaddress;

   my $retval;

   my $oct;

   my $counter;

 

   $rawaddress = shift;

   while ($counter<4){

     $oct = $rawaddress % 2**8; #get the rightmost 8 bits

     $retval = $oct . "." . $retval;

     $rawaddress = int($rawaddress / 2**8); #get the next 8 bits

     $counter++;

   }

   chop $retval;

 

   if ($retval =~ m/\.(255|0)$/) { # skip 0 & 255 addresses

     return 0;

   }

   print "retval = ", $retval, "\n";

   return $retval;

}

We start with the same shebang and use statement to load the Net::Ping module as we did previously. We then take in our starting and stopping IPs from the @ARGV array and place them into $ip1 and $ip2.

Next, we need to do a little bit of work in order to get our IP addresses into a format we can work with so that we can increment them in a reasonable way. With the IP address in the format of ###.###.###.###, with each octet ranging from 0 to 255, we would have to do quite a bit of contortion to move from one IP to the next, especially over a large range, so we will simply change the number format. In order to do this, we pass the IP addresses in $ip1 and $ip2 to the get_raw_address function and place the results into $rawip1 and $rawip2, working with these IPs in raw form when we need to move from one IP to the next.

In the get_raw_address function, we set up a number of variables to hold the incoming IP address, the individual octets that make up the IP, and the raw value we will return. Notice we use my in front of the variables here, which makes them local in scope to our function. This will keep us from having an issue with the $retval variable, specifically, which is also used in our other function.
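The effect of my can be checked with a short experiment. In this sketch the global is declared with our so that the example also runs under use strict, which the chapter’s scripts do not use; the function names with_my and without_my are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our $retval = "set in main";   # a global, like the undeclared variables in our scripts

sub with_my {
    my $retval = "set in with_my";   # a new variable, visible only inside this function
    return $retval;
}

sub without_my {
    $retval = "set in without_my";   # no my, so this is the global $retval
    return $retval;
}

with_my();
print "after with_my: $retval\n";      # the global is untouched
without_my();
print "after without_my: $retval\n";   # the global has been overwritten
```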

The line $ipaddress = shift; is a bit of Perl magic. The shift command is normally used to remove the first element of an array and slide the rest of the array down one, shortening the entire array by one element. If we do not supply an array when we use shift, it will be assumed that we mean either @ARGV if we are working in the main part of a script, or @_ if we are in a function. So, in essence, just using shift by itself will access our array of arguments, starting with the first element of the array, and pulling out the next element in line, each time we call it. In this case, we take the argument we passed when we call the function and put it into $ipaddress.
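Both uses of shift can be seen in a short sketch: once with an explicit array, and once inside a function where it defaults to @_. The function name describe_target and the sample values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull arguments off @_ one at a time with shift.
sub describe_target {
    my $host = shift;            # first argument, removed from @_
    my $port = shift;            # what was the second argument is now first in line
    my $remaining = scalar @_;   # whatever is left over
    return "$host:$port ($remaining extra)";
}

my @queue = ("10.0.0.1", "10.0.0.2", "10.0.0.3");
my $next = shift @queue;         # shift with an explicit array
print "$next taken, ", scalar(@queue), " left\n";
print describe_target("10.0.0.1", 161, "public"), "\n";
```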

Next we split the value in $ipaddress at the dots between the octets, and place each octet into $oct1, $oct2, $oct3, and $oct4. We then do a bit of mathematical processing (the ** operator indicates an exponent in Perl) in order to multiply each octet by the appropriate power of 2 and combine them into one easily incrementable number, and we put the result into $retval.

Once we have done all this, we return the result, to be placed into our $rawip1 and $rawip2 variables from where we called the function originally. Whew. That was a lot of work for something seemingly simple. If we take a quick peek into the $retval or $rawip variable (we can just print them out in the code), we will see an IP address like 10.0.0.1 rendered into a number like 167772161, which we can handle a little more easily for the purposes of incrementing.
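As an alternative to doing the arithmetic by hand, the core Socket module can perform the same conversion in both directions. This is a sketch of a different technique from the one used in our script, and the function names ip_to_int and int_to_ip are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Convert a dotted-quad IP address to a single 32-bit number.
sub ip_to_int {
    my ($ip) = @_;
    return unpack("N", inet_aton($ip));   # "N" is a 32-bit unsigned value in network order
}

# Convert a 32-bit number back to a dotted-quad IP address.
sub int_to_ip {
    my ($number) = @_;
    return inet_ntoa(pack("N", $number));
}

my $raw = ip_to_int("10.0.0.1");
print "10.0.0.1 as a number is $raw\n";
print "the next address is ", int_to_ip($raw + 1), "\n";
```

Incrementing the number and converting back handles the rollover from one octet to the next for us, which is exactly why the script works with the raw form.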

Back in the main body of our script, we now set up the for loop that will run our ping sweep, for ($counter = $rawip1;$counter<=$rawip2;$counter++). Here we set up the $counter variable with the raw form of the IP address that starts the range we will be pinging. We then check to see if $counter is less than or equal to the IP that indicates the end of the range; if so, we continue, and if not, we stop. If we are continuing, we increment $counter by 1.

Inside the for loop, everything is largely the same as it was in the previous version of our script, with one exception. Now that we have converted our IP address to the numeric format, it doesn’t do us much good for purposes of pinging, so we need to get it back into the normal IP format so that we can work with it here. In the first line inside our for loop, we call the get_ip_address function and pass it the value in $counter.

Inside the get_ip_address function, we essentially do the opposite of what we did in the get_raw_address function. We start by defining a few variables at the top of the function. Notice here we are using the $counter variable again, which is used elsewhere in the script. This isn’t a problem here, because we have created the variable using the my keyword in order to make its scope local to the function. Next we shift in the argument to the function, pulling it from @_, the arguments array, since we did not specify otherwise.

Next we work through a while loop. Constructed in this way, the while loop works essentially the same way that a for loop would work, but the structure is slightly different. While a for loop generally goes through its cycle a certain number of times, the while loop keeps going until its condition tests false. In this case, we are looking for $counter to be less than 4, and incrementing it with $counter++ within the loop. This results in four passes through the loop, once for each octet in the IP address we will be reconstructing.
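A while loop is most natural when we do not know the number of passes in advance. As a small sketch, the hypothetical function host_bits below keeps adding bits until the address space is large enough for a given number of hosts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the smallest number of bits whose address space holds $hosts addresses.
sub host_bits {
    my ($hosts) = @_;
    my $bits = 0;
    while (2**$bits < $hosts) {   # keep going until the condition tests false
        $bits++;
    }
    return $bits;
}

foreach my $hosts (2, 256, 257) {
    print "$hosts addresses need ", host_bits($hosts), " bits\n";
}
```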

Inside the loop, we take the contents of $rawaddress and pull off eight bits at a time, starting on the right side, converting those back into the proper notation, and placing them into $oct. With each pass through the loop, we put the octet into $retval, adding the appropriate dots to delimit the IP.

After the loop finishes, we end up with an extra dot at the end of the IP, so we use chop to remove it. As we discussed earlier in the chapter, chop will remove the last character of a string, whatever it happens to be. When the function finishes, we send $retval back to our for loop in the main body of the script, and keep looping until we hit the end of our IP range. We should have output that looks something like Figure 3.5.

image

FIGURE 3.5 Pingsweep Script Output
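The conversion between dotted and numeric form described above can be tried on its own, away from the ping code. What follows is a minimal sketch, not the chapter's exact functions: the names ip_to_number and number_to_ip are hypothetical, and the octets are collected in an array and joined, rather than appending dots and using chop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert a dotted-quad IPv4 string into a single number.
sub ip_to_number {
    my ($ip) = @_;
    my ($oct1, $oct2, $oct3, $oct4) = split /\./, $ip;
    return $oct4 + ($oct3 * 2**8) + ($oct2 * 2**16) + ($oct1 * 2**24);
}

# Convert the number back, peeling off eight bits at a time from the right.
sub number_to_ip {
    my ($raw) = @_;
    my @octets;
    for (1 .. 4) {
        unshift @octets, $raw % 2**8;   # rightmost octet goes to the front
        $raw = int($raw / 2**8);        # drop the eight bits just consumed
    }
    return join '.', @octets;
}

# Walk a small range that crosses an octet boundary.
my $first = ip_to_number('10.0.0.254');
my $last  = ip_to_number('10.0.1.1');
for (my $n = $first; $n <= $last; $n++) {
    print number_to_ip($n), "\n";
}
```

Because the loop counts in the numeric form, it moves from 10.0.0.255 to 10.0.1.0 without any special handling, which is the reason for converting the addresses in the first place.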

That’s all there is to it. Now we have a nice ping sweeper we can use as a basis to build other things, or just use as is. We will ultimately end up using this as the base for our SNMP scanner we will put together at the end of the chapter.

Regular expressions

Regular expressions, otherwise known as regex, are a very handy tool we can make use of to handle text in Perl. We can use regex to search for common patterns in text, such as we might find with MAC addresses or IP addresses, or we may need to construct one for an entirely different pattern altogether, such as a serial number or other relatively unique pattern.

NOTE

We can find regex, or their functional equivalent, in most scripting and programming languages we might care to use. Although we may find some syntactical differences in the way they are handled among different languages, the fundamentals of regex tend to stay the same.

Let’s get a bit of information to work with when using our regex. One bit we might be interested in during the course of penetration testing is the MAC address. MAC addresses can (relatively) uniquely identify the network interface on a given device, and potentially give us information regarding the manufacturer and model of the device.

WARNING

On most operating systems, it is possible, and often trivial, to change the MAC address associated with the network interface. On most Linux operating systems, we can alter the MAC address by using the ifconfig command with something similar to this:

ifconfig eth0 down

ifconfig eth0 hw ether DE:AD:BE:EF:CA:FE

ifconfig eth0 up

We should be aware that the MAC information we are looking at in a penetration test may have been altered.

We can view our MAC information under Linux using the ifconfig command. Simply issuing ifconfig at a command prompt will echo out quite a bit of information, including the MAC address on the first line, right after HWaddr, as shown in Figure 3.6.

image

FIGURE 3.6 ifconfig Output

While we could simply grep for the MAC by running ifconfig | grep HWaddr and get the entire first line of output back, we can also make use of a regex to retrieve only the items matching the pattern of a MAC address in the output.

#!/usr/bin/perl

$text = `ifconfig | grep HWaddr`;

print "the string is ",$text,"\n";

$text =~ m/(([0-9a-f]{2}[:-]){5}[0-9a-f]{2})/i;

print "the mac is ",$&,"\n"; #$& is the previous successful match

So, here we have the standard shebang, and our ifconfig line to get the line of text from the network information we know will contain the MAC address and place it into $text. Since we piped the output from ifconfig through grep, we won’t have to deal with the other lines that ifconfig returns, and we could use this same method to narrow down the results to other items as well, such as the IP address. We’ll print out the string from $text so that we can see exactly what we’ll be matching against, and then proceed onto our regex.

To those not familiar with regular expressions, this line:

$text =~ m/(([0-9a-f]{2}[:-]){5}[0-9a-f]{2})/i;

might seem a bit confusing and look largely like random gibberish. The m before the first forward slash / is the match operator. The characters between the two forward slashes are the pattern we use to find the MAC address that is part of the line stored in $text. In the pattern, we have two main sections; the first deals with the first five bytes of the MAC address, and the second deals with the last byte.

The first section, ([0-9a-f]{2}[:-]){5}, says to look for a pattern that starts with two characters in the range of 0–9 or a–f, with these followed by a colon : or a dash -, and to look for five repetitions of this pattern, accounting for the first five bytes of our MAC address. The sixth byte of the MAC address does not end in a colon, so we need to change the pattern slightly. For the sixth byte, we match against [0-9a-f]{2}, meaning two characters in the range of 0–9 or a–f.

We wrap the entire pattern in parentheses () and add the /i modifier to make the match case-insensitive. This is not a perfect regex, but it will match properly the vast majority of the time. It would, for instance, also accept an address that mixes colons and dashes as separators, but such input is likely to be rare in most applications to which we would put this type of script.
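If the mixed-separator case matters, one way to rule it out is to capture the first separator and require the same one for the rest of the address. The following is a sketch under that assumption; find_mac is a hypothetical helper name, and the \b anchors and the backreference \2 are additions that do not appear in the chapter's pattern.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the first MAC address in $text whose separators are all the same
# character, or undef if there is none.
sub find_mac {
    my ($text) = @_;
    # Group 2 captures the first separator; \2 then demands that same
    # separator after each of the next four bytes.
    if ($text =~ /\b([0-9a-f]{2}([:-])(?:[0-9a-f]{2}\2){4}[0-9a-f]{2})\b/i) {
        return $1;
    }
    return;
}

for my $candidate ('HWaddr 00:0c:29:ab:cd:ef',
                   'DE-AD-BE-EF-CA-FE',
                   '00:0c-29:ab-cd:ef') {
    my $mac = find_mac($candidate);
    print "$candidate => ", (defined $mac ? $mac : 'no match'), "\n";
}
```

The first two candidates match, while the third, which alternates colons and dashes, does not.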

The last line of our script, print "the mac is ",$&,"\n";, prints out the MAC address we found using our regex. This line is relatively clear, other than the use of the special variable $&, which contains the string that was found in the most recent successful pattern match performed in our script, namely our MAC address.

There are a number of other character designators for matching patterns that we can use in our regexes. Table 3.2 lists a few of them.

Table 3.2 Regex Pattern Characters

Character   Behavior
\d          Digit character
\D          Nondigit character
\e          Escape
\n          Newline
\r          Return
\s          Any whitespace character
\S          Any nonwhitespace character
\t          Tab
*           Match 0 or more times
.           Any character
+           Match 1 or more times
?           Match 1 or 0 times
{n}         Match n times
{n,}        Match at least n times
{n,m}       Match at least n times, but not more than m

We can also test out our regular expressions and tweak them separately from our code, by using any of a number of online regex tools, such as we might find at http://regextester.com.
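To see a few of these pattern characters working together, here is a small sketch that pulls an IP address out of a line of text. The sample line imitates the inet addr: layout that older versions of ifconfig print, which is an assumption here, as is the helper name find_ipv4.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pull the IPv4 address that follows "inet addr:" out of a line of text.
# \s+ is one or more whitespace characters, \d{1,3} is one to three digits,
# and \. is a literal dot (a bare . would match any character).
sub find_ipv4 {
    my ($text) = @_;
    return $text =~ /inet\s+addr:\s*(\d{1,3}(?:\.\d{1,3}){3})/ ? $1 : undef;
}

my $sample = "          inet addr:10.0.0.5  Bcast:10.0.0.255  Mask:255.255.255.0";
my $found  = find_ipv4($sample);
print "the IP is ", (defined $found ? $found : "not found"), "\n";
```

Note that this pattern only checks the shape of the address; it would happily accept an octet such as 999, so it is a starting point rather than a validator.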

File input and output

We can take the script we used to match against MAC addresses, and build on it to add a few additional features and make it more useful. One common task we might find ourselves wanting to perform in a script is to take output from or send output to a file. File input and output in Perl is simple enough. In order to open a file for output, we just need a name for the file handle and the name of a file. We can open the file with several different options to access it in different ways:

open (FILE, ">logfile.log"); #write

open (MONKEY, ">>somefile"); #append

open (INPUT, "<datafile.dat"); #read

open (MYFILE, "file.txt"); #read

Using the > symbol opens the file for writing, >> opens the file for writing but will append new content to it if it already exists rather than overwriting the file, < opens the file for reading, and using no designator at all opens the file for reading as well. Closing an opened file is very simple as well; we simply use close and the file handle, as in close (MYFILE);. Let’s put file access to use, and tune up our MAC script to be more useful.
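As an aside, the access modes above can be exercised in isolation with a short sketch. It uses the three-argument form of open with a lexical file handle, which is a common alternative to the bareword handles shown above and is not the form the chapter's scripts use. The helper names write_lines and read_lines are hypothetical, and the scratch file comes from the core File::Temp module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write lines to $path; $mode is '>' to overwrite or '>>' to append.
sub write_lines {
    my ($mode, $path, @lines) = @_;
    open(my $out, $mode, $path) or die "Cannot open $path: $!\n";
    print {$out} "$_\n" for @lines;
    close($out) or die "Cannot close $path: $!\n";
    return scalar @lines;
}

# Read every line of $path back, without the trailing newlines.
sub read_lines {
    my ($path) = @_;
    open(my $in, '<', $path) or die "Cannot open $path: $!\n";
    chomp(my @lines = <$in>);
    close($in) or die "Cannot close $path: $!\n";
    return @lines;
}

# Scratch file that File::Temp removes when the script exits.
my ($scratch_fh, $scratch) = tempfile(UNLINK => 1);
close($scratch_fh);

write_lines('>',  $scratch, 'first entry');    # creates or truncates
write_lines('>>', $scratch, 'second entry');   # adds to the end
print "$_\n" for read_lines($scratch);
```

Keeping the mode separate from the file name means a name that happens to begin with > or < cannot change how the file is opened. The MAC script that follows sticks with the bareword form shown above.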

#!/usr/bin/perl

#fetch the OUI database from IEEE

`wget -N http://standards.ieee.org/develop/regauth/oui/oui.txt 2>/dev/null`;

open (LOG, ">>maclog.log") || die "Cannot open maclog.log for append $!\n";

$netinfo = `ifconfig | grep HWaddr`;

print "network information is ",$netinfo,"\n";

print LOG "network information is ",$netinfo,"\n";

$netinfo =~ m/(([0-9a-f]{2}[:-]){5}[0-9a-f]{2})/i;

$mac = $&; #$& is the previous successful match

print "the MAC address is ",$mac,"\n";

print LOG "the MAC address is ",$mac,"\n";

@macparts = split /:/, $mac;

@ouiparts = splice(@macparts,0,3);

$oui = join('',@ouiparts);

print "the OUI is ",$oui,"\n";

print LOG "the OUI is ",$oui,"\n";

open (OUIDB,"oui.txt") || die "Cannot open oui.txt $!\n";

while (<OUIDB>){

   $line = $_; #$_ is the implicit scalar variable

   print "line is", $line,"oui is ",$oui,"\n";

   if($line =~ /$oui/i){

     @ouientry = $line;

     last;

   }else{

     @ouientry[0] = "manufacturer not found";

   }

}

close (OUIDB) or die "Cannot close oui.txt $!\n";

@ouientryfields = split(/\t/,@ouientry[0]);

print "the manufacturer is ",@ouientryfields[2],"\n";

print LOG "the manufacturer is ",@ouientryfields[2],"\n";

print LOG "******************************************\n";

close (LOG) or die "Cannot close maclog.log $!\n";

WARNING

We intentionally left a logic error in this script. The script will run, but will fail to behave properly under certain conditions. This is an excellent opportunity to practice our debugging skills.

We run the script as ./checkmac2.pl. We will likely see a slight delay the first time we run the script as the OUI file is downloaded. Then we should see output similar to that shown in Figure 3.7.

image

FIGURE 3.7 Checkmac Output

So let’s walk through the script and see what exactly we are doing. We have the standard shebang at the top, and then we run wget in order to retrieve oui.txt from the Institute of Electrical and Electronics Engineers (IEEE) Web site. Note that, for the wget command only, we redirect its status messages, which wget writes to standard error, to /dev/null by using 2>/dev/null, effectively keeping them from displaying on the console. The oui.txt file is a flat file that maps the Organizationally Unique Identifier (OUI), which constitutes the first three bytes of a MAC address, to the company associated with it. A company may be assigned one or more OUIs, and all MAC addresses associated with equipment or software (in the case of virtualized network hardware) produced by that company will carry one of the company’s OUIs.

Once we have the file, we open a new file, maclog.log, in append mode, so we will be able to write out our results later in the script. We will also issue an error message, including the exact error, stored in $!, and quit the script, if we cannot open the file.

We then get the information from ifconfig, just as we did previously, and echo the information to both the console and the log file. We also use the same regex to match the MAC address pattern, placing the MAC address into $mac. Now that we have the MAC address, we need to split it into its component bytes so that we can separate out the OUI, the first three bytes. We do this using the split command, telling split to use the colon as a delimiter, and to place the results into the array @macparts.

We then take the first three elements of @macparts, the OUI, and use splice to extract them and place them into @ouiparts. The splice command takes an array, @macparts, a starting point, element 0, and a length, three, as arguments, allowing us to take exactly the elements we need. After this, we join the elements back together, with no characters in between, and place them into the variable $oui. This is the format we will need to look up the OUI in oui.txt.
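The split, splice, and join steps are easy to check in isolation. Here is a minimal sketch; oui_from_mac is a hypothetical helper name, and the comments show the array contents for the sample address.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reduce a colon-separated MAC address to its six-hex-digit OUI.
sub oui_from_mac {
    my ($mac) = @_;
    my @macparts = split /:/, $mac;          # ('00','0c','29','ab','cd','ef')
    my @ouiparts = splice(@macparts, 0, 3);  # removes and returns ('00','0c','29')
    # @macparts now holds only the last three bytes: ('ab','cd','ef')
    return join('', @ouiparts);              # '000c29'
}

print "the OUI is ", oui_from_mac('00:0c:29:ab:cd:ef'), "\n";
```

One detail worth noticing is that splice does not just copy the elements; it removes them from @macparts as well.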

Now we can open oui.txt, and parse through it for our OUI. Here we use open with no mode designator in front of the file name, thus opening the file for reading by default, and use OUIDB as the file handle.

Once we have the file open, we use a while loop with the file handle, which will keep looping while there are lines in the file we have not parsed. We use $_, which is the implied variable associated with the line in the file, to pass that particular line into the variable $line. We then use a simple if statement to check whether the value in $line matches the value in $oui, thus indicating we have found the match we are looking for. If the line does match, we place it into the array @ouientry, and issue last in order to exit the while loop. If the line does not match, we place our not found string into the first element of @ouientry. Once we have worked all the way through the oui.txt file, we close it using close.

Now that we have the line from oui.txt we need, and have that data in @ouientry, we can split it out to get at just the piece we need. We do this by using split once again, this time by splitting on the tabs, \t, between the fields, and placing the results into @ouientryfields, with the field we want, the company name, being in element 2. We then print this information out to the console and to the log file, and close the log file. Whew, that was a lot of work.
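The lookup and the tab split can be rehearsed against a couple of in-memory lines before pointing them at the real file. This is a sketch only: the sample lines, including their spacing and vendor strings, are assumptions made for illustration, lookup_vendor is a hypothetical name, and anchoring the match to the start of the line is a choice made here rather than something the chapter's script does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample lines in a tab-separated layout (assumed for illustration):
# identifier field, an empty field, then the company name in element 2.
my @sample = (
    "000C29     (base 16)\t\tVMware, Inc.",
    "001122     (base 16)\t\tCIMSYS Inc",
);

# Return the company name for $oui, or a not-found string.
sub lookup_vendor {
    my ($oui, @lines) = @_;
    for my $line (@lines) {
        # \Q...\E quotes any regex metacharacters in $oui
        next unless $line =~ /^\Q$oui\E\b/i;
        my @fields = split /\t/, $line;
        return $fields[2];
    }
    return 'manufacturer not found';
}

print "the manufacturer is ", lookup_vendor('000c29', @sample), "\n";
print "the manufacturer is ", lookup_vendor('ffffff', @sample), "\n";
```

Returning the not-found string only after the loop has finished keeps the two outcomes separate, so the caller never has to split a placeholder string on tabs.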

Putting It All Together

We’ve looked at a number of different bits of Perl in this chapter, and we have a few interesting places to start from the scripts we have put together thus far, so let’s build something practical from them.

Building an SNMP scanner with Perl

SNMP is a protocol we can use to monitor and manage a large number of devices on a network. SNMP can be used to collect information from devices and make changes to them, and support for it is implemented in a broad range of hardware and software devices.

A given device may or may not respond to an SNMP request, depending on how it is configured, or authentication of some variety may be needed to talk to it. More recent software and devices tend to be configured more securely, as SNMP goes, and may not respond to such inquiries at all by default. A good test target for SNMP-oriented tools tends to be network printers, as they are very chatty on the network, and tend to be very insecurely configured.

#!/usr/bin/perl

use Net::Ping;

use Net::SNMP;

 

@log; #array that holds the log

$time = localtime;

push (@log," ###### $time ###### ");

 

#variables for ping

$ip1 = @ARGV[0];

$ip2 = @ARGV[1];

$rawip1 = get_raw_address($ip1);

$rawip2 = get_raw_address($ip2);

#variables for SNMP

$mibName = "1.3.6.1.2.1.1.5.0"; # System Name

$mibDescr = "1.3.6.1.2.1.1.1.0"; # System Description

$mibHardwareType = "1.3.6.1.2.1.25.3.2.1.2.1"; # hardware type

$port = 161;

$community = "public";

$retries = 1;

$timeout = 5; # seconds to wait for each SNMP reply; assumed value, used by init_snmp

 

#main loop

for ($counter = $rawip1;$counter<= $rawip2;$counter++){

   $host = get_ip_address($counter);

   $pinger = Net::Ping->new("icmp", 1, 64);

   if ($pinger->ping($host)) {

     print "$host is up\n";

     push (@log," $host is up");

     &init_snmp;

     &get_snmp_info;

     $session->close;

     &write_log;

   } else {

     print "$host is down\n";

     push (@log," $host is down");

     &write_log;

   }

}

 

###### get_raw_address ######

#get the raw version of an IP

sub get_raw_address {

 

   my $ipaddress;

   my $oct1;

   my $oct2;

   my $oct3;

   my $oct4;

   my $retval;

 

   $ipaddress = shift;

   ($oct1, $oct2, $oct3, $oct4) = split /\./, $ipaddress;

   $retval = $oct4 + ($oct3 * 2**8) + ($oct2 * 2**16) + ($oct1 * 2**24);

   return $retval;

}

 

###### get_ip_address ########

#get the regular version of an IP

sub get_ip_address {

 

   my $rawaddress;

   my $retval;

   my $oct;

   my $counter;

 

   $rawaddress = shift;

   while ($counter<4){

     $oct = $rawaddress % 2**8; #get the rightmost 8 bits

     $retval = $oct . "." . $retval;

     $rawaddress = int($rawaddress / 2**8); #get the next 8 bits

     $counter++;

   }

   chop $retval;

 

   if ($retval =~ m/\.(255|0)$/) { # skip 0 & 255 addresses

     return 0;

   }

   return $retval;

}

 

###### init_snmp ######

#set up an SNMP session

sub init_snmp {

   ($session, $error) = Net::SNMP->session(

   Hostname => $host,

   Community => $community,

   Port => $port,

   Retries => $retries

   );

 

   if(!defined($session)){

     die "Couldn't set up SNMP session: $error\n";

   }

 

   $session->timeout($timeout);

}

 

###### get_snmp_info ######

#retrieve our specified information

sub get_snmp_info{

 

$name = &get_request($mibName);

 

if ($name =~ /no response/){

   print "no SNMP response from ",$host,"\n";

   return;

}

print "name = ",$name,"\n";

push (@log,"name = $name");

$description = &get_request($mibDescr);

print "description = ",$description,"\n";

push (@log,"description = $description");

 

$hardware = &get_request($mibHardwareType);

if ($hardware =~ /1.3.6.1.2.1.25.3.1.5/){

   $hardware = "Printer";

}

if ($hardware =~ /1.3.6.1.2.1.25.3.1.3/){

   $hardware = "Processor";

}

if ($hardware =~ /1.3.6.1.2.1.25.3.1.4/){

   $hardware = "Network";

}

if ($hardware =~ /1.3.6.1.2.1.25.3.1.6/){

   $hardware = "Disk Storage";

}

if ($hardware =~ /^$/){ # nothing usable came back

   $hardware = "Unknown";

}

 

print "hardware = ",$hardware,"\n";

push (@log,"hardware = $hardware");

 

}

 

###### get_request ######

#grab a specific MIB

sub get_request {

   # Takes only one MIB as an argument!

 

   my $response;

   my $return;

 

   if(!defined($response = $session->get_request($_[0]))) {

     return "no response";

   }

   $return = $response->{$_[0]};

   return $return;

}

 

###### write log ######

#write out all the log entries in @log

sub write_log{

   open (LOG, ">>snmp.log") || print "Error Opening snmp.log: $!\n";

   print LOG join("\n",@log), "\n";

   close(LOG) or die "Error Closing snmp.log : $!\n";

   @log = (); #clear the log array

}

We can execute the SNMP scanner script with ./snmp.pl 10.0.0.50 10.0.0.55. This should produce a result similar to Figure 3.8, although the exact information returned will, of course, depend on the network we run the script against, and the configuration of the devices we scan. In order to get good data back from SNMP, it is entirely possible that we might need to enable it on the target device, as is the case with more recent versions of UNIX-like and Microsoft operating systems.

image

FIGURE 3.8 SNMP Scanner Output

Let’s step through the script. A portion of this script is the same as the ping scanner we worked on previously, so we’ll gloss over the points unchanged.

At the top of the script, we can see the shebang, as well as the Net::Ping module we used previously. We’ve also added a statement to make use of the Net::SNMP module, which will allow us to make SNMP connections and retrieve the information we will be looking for.

We have also added a slightly different logging mechanism than what we used in our MAC script. Previously, we opened the log file and left it open while we printed to it throughout the script. Now we will store our log entries in the array @log and we will only open the file when we are actually going to dump out the contents of our log, then close it directly afterward. This keeps us from needing to hold the log file open the entire time the script is executing, which could be an extended period of time if we are scanning a large IP range. We place entries in the log array by using push, as we can see in the line push (@log," ###### $time ###### ");. When we use push, we treat the array like a stack, adding new entries to the end of it and increasing its length by the number of items we add to it each time. When we write the log file, this will allow us to access the array in the proper sequence to write the entries.

Beyond this, we can see two sections of variables. The section we will use for ping is the same as we used in the ping sweep script, but the SNMP section is entirely new. Here we have three variables, $mibName, $mibDescr, and $mibHardwareType, which we will use to retrieve the host name, description, and hardware type, respectively, from our target device. A management information base (MIB) is a database of information on the device we will connect to over SNMP. Most devices have a generic MIB that contains the information we are looking for here, as well as a number of other MIBs specific to the hardware type or to the particular manufacturer. The values we have placed into the three variables are object identifiers (OIDs), the addresses in the MIB of the information we are looking for. In order to look up additional information, we would need the proper MIB, which we can look up at any number of sources online, such as www.mibdepot.com, or in the documentation for our device or software.

In the SNMP variable section, we can also see the $port, $community, and $retries variables. These specify the port we will be using for SNMP traffic, the community name, and the number of times we will retry if our SNMP connection fails. The community name is needed to connect to devices with SNMP, and the default community name for most devices is set to public.

After this, we can see the main loop of the script. The structure here is the exact same for loop we used in the ping sweep script, and we can generically use this for anything that goes through a set of IPs and does something to each of them. The only difference here is that we add a few lines to push information to the log array, and to call our SNMP functions and the function that writes our log array to the file.

In the init_snmp function, we simply set up the SNMP connection to the target device, specified in $host. We also make use of the other SNMP-related variables we specified at the top of the script. We call Net::SNMP to set up a new session and pass our variables to it to give it the parameters for the session. We then check to see if the session was actually defined. If not, we quit and display an error. Additionally, we set the timeout for our SNMP session.

Back in the main loop, we call get_snmp_info in order to retrieve the specific information we want via SNMP. In get_snmp_info, we make several requests, making use of the MIBs we defined at the beginning of the script. We first attempt to retrieve the host name and store it in $name, making use of the get_request function, which simply makes a request via SNMP and returns the results. If we find the text no response in $name after making our request, this means we did not get a reply from the device, even though we made the SNMP connection successfully. If this is the case, we are unlikely to get back any other information in the rest of this function, so we are better off to exit at this point rather than waiting for the other requests to fail. We can exit the function by using return.

If we do get a value in $name, we will print this out, push it to the log array, and go after the next item, the device description. The string returned here will vary considerably, depending on the target we are talking to. Once we have this, we will retrieve the hardware type of the device. The string we get back in $hardware will actually be a MIB address, which needs a bit of translating. Depending on what we get back, we will replace the MIB address with a text string indicating the type of hardware we identified. It is entirely possible that we might find a hardware type we have not accounted for, and will need to modify the code in order to properly identify it. Once we have printed and logged the hardware type, we will return to the main loop.
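When the list of hardware types grows, a chain of if statements becomes awkward to maintain. One alternative, sketched here and not taken from the script itself, is to keep the translations in a hash so that supporting a new type means adding one line. The four addresses are the ones used in the script; the name hardware_label and the handling of a leading dot are assumptions. The // operator requires Perl 5.10 or later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hardware type addresses from the script, mapped to readable labels.
my %hardware_types = (
    '1.3.6.1.2.1.25.3.1.3' => 'Processor',
    '1.3.6.1.2.1.25.3.1.4' => 'Network',
    '1.3.6.1.2.1.25.3.1.5' => 'Printer',
    '1.3.6.1.2.1.25.3.1.6' => 'Disk Storage',
);

# Translate a returned address into a label, keeping the raw value
# visible when we have no translation for it.
sub hardware_label {
    my ($oid) = @_;
    return 'Unknown' unless defined $oid;
    $oid =~ s/^\.//;    # tolerate a leading dot
    return $hardware_types{$oid} // "Unknown ($oid)";
}

print "hardware = ", hardware_label('1.3.6.1.2.1.25.3.1.5'), "\n";
print "hardware = ", hardware_label('1.3.6.1.2.1.25.3.1.99'), "\n";
```

The hash lookup is an exact string comparison, so the dots in the address are not treated as regex wildcards, and an unrecognized type still shows its raw address so that we know what to add.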

Here we close down the SNMP session, as we are done with it for this round of the loop. We then call write_log. In write_log, we open our log file, snmp.log, and then process the log array. We perform a join on all the elements of the array, using \n as a delimiter so that we have a newline between each entry. We then close out the file and clear the @log array. Since we reuse @log for each round of the main loop, if we do not clear the array each time we write it, it will get to be quite large after a few rounds.

Improving the script

We can improve our SNMP script in a number of ways. Here are a few of the more immediately obvious:

• We could potentially collect quite a bit more information via SNMP. Depending on the device in question and how it is configured, we may be able to collect a wide variety of software and hardware information, including serial numbers, accounts, hardware specifications, and quite a bit more.

• We presently have the default community name hard-coded as public. While this is the standard community name used by many devices, we could easily take this in as an argument, or pull it in from a list of common names in a file.

• We can also make an attempt to guess the community name, using dictionary files or brute force techniques.

• When we output the log file, we might want to have it in a more standardized format that is easily parsed by people or other tools. We can do this by formatting the log as a comma-separated value (CSV) file, which would go a long way in the right direction.

• We may also want to separate the log information so that we have a specific log for the devices that were up and that returned information via SNMP. This is accomplished easily enough by creating another log and adding a few conditionals in to sort the interesting results into the proper log.
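As a starting point for the CSV idea in the list above, here is a minimal sketch that quotes each field so that commas or quotation marks inside a device description do not break the columns. The names csv_field and csv_line, the column layout, and the sample values are all hypothetical. For anything beyond simple logging, a dedicated module from CPAN such as Text::CSV would be a more robust choice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Quote one field for CSV: double any embedded quotation marks,
# then wrap the whole value in quotation marks.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Build one CSV line from a list of fields.
sub csv_line {
    return join(',', map { csv_field($_) } @_);
}

print csv_line('host', 'status', 'name', 'description', 'hardware'), "\n";
print csv_line('10.0.0.50', 'up', 'printer01',
               'Example "LaserJet", floor 2', 'Printer'), "\n";
```

Each call to csv_line could replace one group of push statements in the scanner, giving one line per host instead of several loosely formatted entries.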

Summary

Perl is useful in quite a few situations as a scripting language. This is reflected in its original purpose, as a tool for manipulating text and reports, and can also be seen in its ability to glue different applications together. We can use Perl to process data and merge data together from disparate sources, a common function in the penetration testing world with its many tools.

Perl distributions are available for many platforms, from the standard Perl available from perl.org, to specific versions that have additional features and come packaged with a variety of utilities and tools, such as those from ActiveState. In general, distributions within the same major version are relatively compatible, and we can move our Perl code from one to another without major rewrites. Perl code can be developed in a variety of tools, from simple editors to specialized integrated development environments (IDEs). We can also make use of additional features, such as the ability to create graphical interfaces for our scripts and, through the use of some utilities, compile them into executable binary formats.

Scripting in Perl follows most of the standard conventions we can find in other scripting or programming languages. We can make use of various data structures, such as variables and arrays to store data in our scripts. We can execute commands in a shell, through the use of backticks, in a very similar way as we do in shell scripting with the bash shell. We can make use of arguments, control statements such as loops and conditionals, as well as regular expressions, file I/O, and many of the other standard programming language features.

ENDNOTES

1. Allen J. Perl 5 Version 12.2 documentation. perldoc.perl.org. [Online] 2011. http://perldoc.perl.org/perl.html#DESCRIPTION.

2. Richardson M. Larry Wall, the guru of Perl. Linux J. [Online] May 1, 1999. [Cited: April 5, 2011.] www.linuxjournal.com/article/3394.

3. Sheppard D. Beginner’s introduction to Perl. perl.com. [Online] October 16, 2000. [Cited: April 5, 2011.] www.perl.com/pub/2000/10/begperl1.html.

4. What is Unix time? UnixTime.info. [Online] 2011. [Cited: April 11, 2011.] http://unixtime.info/.
