Chapter 5
Programming Fundamentals
Many interviewers (including me) like to “warm up” an interview by starting with an easy question that looks at some fundamental aspect of programming. If a candidate falls at this first hurdle, I might even terminate the interview. Knowledge of fundamentals really is that important.
A thorough knowledge of the elements takes us more than half the road to mastership.
—Aron Nimzowitsch
Many programmers are also disadvantaged by the lack of a formal education in computer science. It might have seemed like a load of stuffy irrelevance when you were learning it (though I hope not), but many situations exist in which this knowledge can mean the difference between doing a quality job and a late night wearing your fingers to nubs while you reinvent the wheel.
Programmers love solving problems, which is just as well because programmers face so many of them. What isn't quite so serendipitous is that many otherwise pragmatic and sensible programmers seem to enjoy solving problems that were comprehensively solved decades earlier. It is true that some of these problems are trivial and have obvious solutions, in which case not much harm is done (and the programmer enjoys the thrill of the learning experience), but other commonly encountered problems such as the manipulation of dates and the choosing of a random number are deceptively difficult. Reinventing solutions to these harder problems inevitably leads to buggy software. I know this from first-hand experience.
This chapter covers many common “algorithm and data structure” interview questions, but please don't think that everything you need to know is contained here: There is no substitute for a good reference book on algorithms and data structures. I recommend Algorithms by Robert Sedgewick and Kevin Wayne.
Before diving into the glamorous world of trees and graphs, let's start with a quick look at the most fundamental of topics: numbers.
There are only 10 types of people in the world: those who understand binary and those who don't.
You might find it hard to believe, but once upon a time a computer programmer had to be conversant with binary, octal, and hexadecimal. In those days it was fairly common that a programmer could convert between these three bases without the aid of a calculator.
Today it is true you don't need quite the same level of familiarity with binary, octal, and hexadecimal, but in case you think an understanding of them is redundant, let me offer some modern examples.
Hexadecimal is used to represent colors in a web page:
<body bgcolor="#007F7F">
Octal is used in UNIX/Linux-based operating systems when setting permissions of a file.
chmod 0664 myfile
Binary is necessary to understand subnet masks in an IP network.
192.168.0.24/30
Still not convinced? Perhaps the fact that some interviewers like to start their tech-quiz at this most basic level will convince you to double-check your understanding of these items.
Counting in base 10 seems the most natural thing in the world. Most of us have 10 fingers and 10 toes, and we learn to count our digits from an early age. Can you imagine a world where everyone had 11 digits?
From our sophisticated vantage point of being supremely familiar with base 10, base 2 (binary) might seem primitive. Yet this is what we have, and it underpins the entire digital age, so there's little point complaining about it. Indeed, far from being primitive, the accomplishment of building a modern computer from the building blocks of 1 and 0 seems nearly magical when you think about it.
The quickest way to understand a number system other than decimal (base 10) is to first understand what base 10 actually means.
It means that for a number such as 7,337 you have:

7,337 = (7 × 1,000) + (3 × 100) + (3 × 10) + (7 × 1)
Or you could have written:

7,337 = (7 × 10³) + (3 × 10²) + (3 × 10¹) + (7 × 10⁰)
Now, if you take a leap and interpret “7337” as a hexadecimal (base 16) number you would have:

7337₁₆ = (7 × 16³) + (3 × 16²) + (3 × 16¹) + (7 × 16⁰) = 28,672 + 768 + 48 + 7 = 29,495
Hexadecimal is as simple as that. Octal (base 8) is equally straightforward:

7337₈ = (7 × 8³) + (3 × 8²) + (3 × 8¹) + (7 × 8⁰) = 3,584 + 192 + 24 + 7 = 3,807
All this is very simple, but it isn't quite the end of the story. Decimal numbers have the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. But how do you write numbers in base 16?
The answer is that you borrow letters from the alphabet after you get past 9. For hexadecimal you therefore have:

0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
For octal, you simply drop the digits 8 and 9, and for binary there's nothing beyond 0 and 1. You can't take the “7,337” number and treat it as binary because those digits don't exist in binary. Take another random number, “1001,” and give it the same treatment:

1001₂ = (1 × 2³) + (0 × 2²) + (0 × 2¹) + (1 × 2⁰) = 8 + 0 + 0 + 1 = 9
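This positional interpretation can be checked mechanically. Here is a small sketch in Python (used here for brevity; the idea is language-independent), where `from_base` is a hypothetical helper name, not something from the text:

```python
def from_base(digits, base):
    """Evaluate a digit string as a number in the given base."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, 36)  # int(ch, 36) maps 0-9, A-Z to 0-35
    return value

print(from_base("7337", 10))  # 7337
print(from_base("7337", 16))  # 29495
print(from_base("7337", 8))   # 3807
print(from_base("1001", 2))   # 9
```

The same string produces a different value in each base, which is the whole point: a numeral means nothing until you know its base.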
It might sound like an interview nightmare that an interviewer asks you to convert a hexadecimal number to binary without the help of a calculator or a computer, but doing the calculation is actually quite easy. With a bit of practice you should be able to do it in your head. Here's how:
Converting hexadecimal to binary is like eating an elephant. You just need to take one nibble at a time.
Remember that a nibble is half a byte, or four bits. Each digit of a hexadecimal number converts neatly into a nibble. Table 5.1 shows the complete list of hexadecimal digits and their binary equivalents.
Hexadecimal | Binary |
0 | 0000 |
1 | 0001 |
2 | 0010 |
3 | 0011 |
4 | 0100 |
5 | 0101 |
6 | 0110 |
7 | 0111 |
8 | 1000 |
9 | 1001 |
A | 1010 |
B | 1011 |
C | 1100 |
D | 1101 |
E | 1110 |
F | 1111 |
With this table in mind you can now convert any hexadecimal number to binary without using any kind of arithmetic. You simply look up the binary equivalent of each hexadecimal digit and concatenate them to get the answer.
Here is the hexadecimal number BEEF converted to binary:
B is 1011
E is 1110
E is 1110
F is 1111
Put it all together and you have:

1011 1110 1110 1111
Converting a binary number to hexadecimal is equally simple if you take it a nibble at a time: split the binary number into groups of four bits and look up each group in the table. For example, 0111 1011 converts to 7B.
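The table-lookup approach can be sketched in a few lines of Python (Python is used for brevity; the technique itself is what the text describes, the function names are mine):

```python
# Build the lookup table from Table 5.1: each hex digit maps to a 4-bit nibble.
HEX_TO_NIBBLE = {d: format(i, "04b") for i, d in enumerate("0123456789ABCDEF")}
NIBBLE_TO_HEX = {v: k for k, v in HEX_TO_NIBBLE.items()}

def hex_to_binary(h):
    """Convert hex to binary one nibble at a time, no arithmetic required."""
    return " ".join(HEX_TO_NIBBLE[d] for d in h.upper())

def binary_to_hex(b):
    """Convert binary to hex by reading it in groups of four bits."""
    bits = b.replace(" ", "")
    bits = bits.zfill((len(bits) + 3) // 4 * 4)  # pad to a whole number of nibbles
    return "".join(NIBBLE_TO_HEX[bits[i:i + 4]] for i in range(0, len(bits), 4))

print(hex_to_binary("BEEF"))              # 1011 1110 1110 1111
print(binary_to_hex("1011111011101111"))  # BEEF
```

Notice that no multiplication or division is needed in either direction; that is why this conversion is so easy to do in your head.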
One can easily forget that all the text processed by a computer is ultimately stored as binary and that what you see on the screen is an interpretation of those binary numbers according to some kind of encoding scheme. Many programmers are so accustomed to working with ASCII that they come to think of ASCII as being “plain text” and everything else as being an annoying inconvenience. Those same programmers are quite surprised when they open up a “plain text” file in their favorite editor and find that it looks like Figure 5.1.
This situation happens when a program chooses (or is forced to use) the wrong encoding. It shows you a misinterpretation of the raw numbers, Unicode or otherwise. Just for the record, Vim is a fine editor and handles Unicode perfectly well. I forced it to use the wrong encoding, as shown in Figure 5.1.
Unicode was designed to provide a single, unambiguous, and unifying representation of the characters in all written languages, a much bigger set of characters than those used in the English language. Prior to Unicode, programmers used 7-bit ASCII and what came to be known as code pages. Code pages evolved as an attempt to standardize the interpretation of numbers greater than 127 (the upper limit of a 7-bit number) for different languages and cultures. Unicode was intended to take away the need for code pages, but it hasn't quite achieved that goal. Code pages are still needed for backward compatibility, and (ironically) they are also used to indicate which character set encoding applies to a Unicode file.
A very common misconception is that Unicode means that each character takes up 2 bytes instead of 1. That isn't quite right. Unicode characters can and often do take up 2 bytes in memory, but they can take 1 byte (UTF-8 encodes every ASCII character in a single byte) or more than 2 bytes (UTF-8 uses up to 4 bytes per character, and UTF-16 takes either 2 or 4 bytes per character). Version 6.2 of the Unicode standard supports more than 110,000 characters. This is far more than the limit of 65,536 that would be imposed if Unicode were actually restricted to just 2 bytes.
You wouldn't expect a professional mechanic to work on your car with only a hammer and a screwdriver. The same principle applies to programmers whose toolboxes contain only primitive data structures. At risk of overloading my metaphors I might also suggest that data structures are like LEGO pieces; you can make almost anything with the most basic LEGO brick, but using specialized bricks for wheels and motors is a lot easier and quicker.
An array is a sequence of elements where each element is identified by a number. This number is known as an index or a key.
Accessing an element in an array is always a quick operation because the address of each element is obtained by a simple calculation:

address of element = base address of array + (index × size of element)
In other words, the offset of an element at a given index can be calculated by multiplying the index number by the size of the element.
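To make the offset calculation concrete, here is a small Python sketch (the `struct` module stands in for raw memory; `element_at` is a name of my own choosing). It packs five 32-bit integers into a contiguous block of bytes, exactly as an array would be laid out in memory, and fetches an element by computing its offset:

```python
import struct

# Five 32-bit integers packed contiguously, like an array in memory.
buffer = struct.pack("<5i", 10, 20, 30, 40, 50)
ELEMENT_SIZE = 4  # each 32-bit integer occupies 4 bytes

def element_at(index):
    offset = index * ELEMENT_SIZE  # the simple calculation from the text
    return struct.unpack_from("<i", buffer, offset)[0]

print(element_at(3))  # 40, found 12 bytes past the start of the buffer
```

No searching is involved; a single multiplication locates any element, which is why array access is a constant-time operation.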
The size of an array is normally declared before the array is used, which can be awkward when you don't know how many elements the array will be required to hold. All modern frameworks have data structures that serve better than an array; for instance, .NET has the generic List<T>, so the constraints of traditional arrays are no longer the problem they once were. Nevertheless, remaining aware of underpinning structures, such as arrays, still pays off.
Also known as an associative array, a hash map, and sometimes simply a hash, a hash table is one of the most useful and widely used data structures.
A hash table is similar to an array in that it allows you to store a collection of values, but instead of accessing elements via sequential integers, the elements of an associative array are accessed using arbitrary data types such as strings, dates, or even classes you've created yourself.
The following Hashtable shows strings used as keys (and coincidentally also as values):
Hashtable h = new Hashtable();
h.Add("JAN", "January");
h.Add("FEB", "February");
h.Add("MAR", "March");
h.Add("APR", "April");
h.Add("MAY", "May");
h.Add("JUN", "June");
h.Add("JUL", "July");
h.Add("AUG", "August");
h.Add("SEP", "September");
h.Add("OCT", "October");
h.Add("NOV", "November");
h.Add("DEC", "December");

Console.WriteLine(h["SEP"]); // Prints "September"
Hash tables are always implemented with the help of a hash function that converts each key (that is, “SEP” in the preceding example) into an index of the element where the corresponding value is stored. Hash functions typically produce a number that is the result of a fixed set of calculations based on the input key. You can think of this number as roughly equivalent to an array index, except that you are not exposed to these numbers as you are to array indexes. They are calculated and handled behind the scenes by the framework.
Keys in a hash table must be unique; you cannot store more than one of each key. Values can be duplicated as many times as you like.
While keys may be arbitrarily large, the resulting index value will always be relatively small (otherwise it wouldn't be suitable for use as an index).
Every now and then a hash function might calculate the same index for two different keys. This event is known as a hash collision. When this happens, the hash table resolves the collision either by probing for a different slot with a further set of calculations (a technique known as open addressing or rehashing) or by storing the colliding elements together in a secondary data structure using a method known as chaining. Again, this detail is not normally exposed to the programmer using the hash table.
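Chaining is easy to see in a toy implementation. The following Python sketch (a deliberately tiny illustration, not how any production hash table is written; the class and method names are mine) stores colliding keys together in the same bucket list:

```python
class ChainedHashTable:
    """A toy hash table that resolves collisions by chaining."""

    def __init__(self, buckets=8):
        self.buckets = [[] for _ in range(buckets)]

    def _index(self, key):
        return hash(key) % len(self.buckets)  # hash function -> small index

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:             # keys must be unique: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))  # colliding keys simply share a bucket

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

h = ChainedHashTable()
h.put("SEP", "September")
h.put("OCT", "October")
print(h.get("SEP"))  # September
```

With only eight buckets, collisions are guaranteed once you store more than eight keys, yet every key is still found correctly because each bucket is searched for an exact key match.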
A good hash function runs in constant time, which means that hash tables can store and retrieve elements in O(1) time on average.
Modern languages such as C# and Java provide a number of concrete data types on top of the basic building block of the hash table. In C# using the strongly typed Dictionary data type is very common. Here is an example with keys of arbitrary type Foo and values of another arbitrary type Bar:
var myDictionary = new Dictionary<Foo,Bar>();
Storing more than one value against a key is nearly as straightforward, because storing lists as values is perfectly acceptable:
var myDictionary = new Dictionary<Foo,List<Bar>>();
Queues and stacks are both lists of values. A queue and a stack both store values in the order in which they were added. The difference between a queue and a stack is simply that items are retrieved from a queue in the order in which they were added (FIFO—first in, first out), whereas items are retrieved from a stack in reverse order (LIFO—last in, first out). The queue data structure can be understood intuitively—everyone has queued for something—and a stack is not much more difficult to visualize—just think of a stack of plates. You put plates on the top, and you take plates off the top rather than from the bottom. That's LIFO.
A tree is a structure in which values are stored in nodes, and each node is the parent of zero or more child nodes. A node without a parent is the root node. A tree in which all nodes have at most two children is a binary tree. A binary tree in which there are no duplicate nodes and in which all nodes are arranged in a sorted order is a binary search tree. A B-tree is a sorted tree that allows more than two children per node, which makes it more suitable for reading and writing large sets of data because the height of a B-tree will be smaller than that of an equivalent binary tree.
Despite the fact that most trees in nature grow upward and extend their branches toward the sky, a tree data structure is conventionally drawn with branches and child nodes descending down the page from a root node.
Figure 5.2 shows a simple binary tree. This tree has a height of 3 and a size of 7. This tree is unbalanced and unsorted.
Figure 5.3 shows a binary search tree (BST). The left subtree of any node in a BST contains only nodes with keys less than the node's key. Subtrees on the right contain only nodes with keys greater than the node's key. Notice also that the node with key 9 has just one child—it is not required that all nodes have two children.
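The BST ordering rule is what makes searching fast: at every node you discard an entire subtree. Here is a minimal sketch in Python (the text's examples are in C#; Python is used here for brevity, and the function names are mine):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert a key, keeping the BST rule: smaller keys left, larger keys right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # duplicate keys are ignored, as a BST has no duplicates

def contains(root, key):
    """Walk down the tree, discarding half of the remaining nodes at each step."""
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False

root = None
for k in [8, 3, 10, 1, 6, 9]:
    root = insert(root, k)

print(contains(root, 6))  # True
print(contains(root, 7))  # False
```

In a balanced BST each comparison halves the search space, so lookups take O(log n) time.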
Figure 5.4 shows a B-tree illustrating that each node may have more than two children.
A graph is like a tree but it can sprawl in all directions at once. The nodes in a graph are not necessarily connected, and nodes can be connected to themselves as well as to other nodes. Two nodes may have more than one connection between them. In terms of structure, it's almost a free-for-all.
As if node connections weren't already complicated, the connections between nodes (called edges) can be directed, meaning they can be traversed in only one direction. After you've gone over a directed edge, you can't go back to the node you just left unless you go back by another route.
Figure 5.5 shows a graph with a number of interesting features: Node 2 is connected to itself, and two edges exist between nodes 8 and 9. The graph is disconnected because, for instance, no path connects node 1 to node 5.
Interviewers often want to check that you have at least a basic understanding of tree and graph traversal. You might recall the two basic approaches: depth first, which means following a path all the way down to a leaf node before you follow adjacent nodes, and breadth first, where you visit the children of a node before visiting any further-removed descendants of that node.
Algorithms that perform breadth-first traversal normally use a queue (FIFO) to keep track of nodes, whereas algorithms that perform depth-first traversal normally use a stack (LIFO).
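The queue/stack distinction is easiest to see side by side. This Python sketch (an illustrative toy; the tree shape and function names are my own) traverses the same tree both ways:

```python
from collections import deque

# A small tree expressed as an adjacency list: node -> children.
graph = {
    1: [2, 3],
    2: [4, 5],
    3: [6, 7],
    4: [], 5: [], 6: [], 7: [],
}

def breadth_first(start):
    order, seen, queue = [], {start}, deque([start])  # FIFO queue
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in graph[node]:
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order

def depth_first(start):
    order, seen, stack = [], set(), [start]  # LIFO stack
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        for child in reversed(graph[node]):  # reversed so the leftmost child is visited first
            stack.append(child)
    return order

print(breadth_first(1))  # [1, 2, 3, 4, 5, 6, 7]
print(depth_first(1))    # [1, 2, 4, 5, 3, 6, 7]
```

Swapping the queue for a stack is literally the only structural difference, yet it changes the visiting order from level-by-level to branch-by-branch. The `seen` set also makes both functions safe on graphs with cycles.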
When visiting nodes in a depth-first algorithm the three distinct orders of traversal are preorder, inorder, and postorder. See question 8 in this chapter if you need to brush up on the difference.
Sorting data is such a common operation that the collection classes in most programming frameworks expose a sort method or powerful native sorting capabilities such as those found in Perl and Ruby.
In C# you can sort a list of integers in a number of ways; here are a few examples:
var list = new List<int> { 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7 };

// Method 1 - the simplest way to sort a list of integers
list.Sort();
// list now contains { 1, 1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9, 9 }
For sorting objects you can use the OrderBy extension method:
var listOfPeople = new List<Person>
{
    new Person {Name = "Fred", Age = 29},
    new Person {Name = "Barney", Age = 27},
    new Person {Name = "Wilma", Age = 22},
    new Person {Name = "Betty", Age = 23}
};

// Method 2 - sorting with an extension method and a lambda expression
var sortedListOfPeople = listOfPeople.OrderBy(item => item.Name);
You can also use LINQ to Objects:
// Method 3 - uses LINQ to Objects
var anotherSortedList = from p in listOfPeople
                        orderby p.Age
                        select p;
Behind the scenes of the Sort method in .NET is an implementation of the quick sort algorithm (more recent versions of .NET use an introspective sort, which begins with quick sort). You should be familiar with how this algorithm works because some interviewers (thankfully a decreasing number) still want candidates to code up a quick sort implementation from scratch while standing at a whiteboard. Merge sort is another common sorting algorithm that competes with quick sort for the title of “most popular sorting algorithm.” Both of these sorting algorithms run quickly, with an average time of O(n log n), but quick sort can run in O(n²) time in the worst case.
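If you are asked to sketch quick sort at a whiteboard, the easiest version to reproduce from memory is the non-in-place one. Here is a minimal Python sketch (a teaching version, not the in-place partitioning variant real libraries use):

```python
def quick_sort(items):
    """Quick sort: average O(n log n), worst case O(n^2)."""
    if len(items) <= 1:
        return items  # base case: a list of 0 or 1 elements is already sorted
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal   = [x for x in items if x == pivot]
    larger  = [x for x in items if x > pivot]
    # Divide and conquer: sort each partition, then concatenate.
    return quick_sort(smaller) + equal + quick_sort(larger)

print(quick_sort([1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7]))
# [1, 1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9, 9]
```

The worst case arises when the pivot repeatedly ends up as the smallest or largest remaining element, leaving one partition nearly as big as the input; picking the middle element (as here) or a random element makes that unlikely in practice.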
If you are wondering which sorting algorithm is the best, there is no clear answer. Each algorithm has advantages in certain contexts. Even the lowly bubble sort can be optimal in some cases; for example, a nearly sorted list will be sorted quickly by a bubble sort with minimal memory overhead.
Table 5.2 provides a cheat sheet for some of the more common sorting algorithms.
Recursion happens when a method calls itself. If you've never encountered recursive code you might think a method that calls itself is asking for trouble in the form of a stack overflow, and you would be right—except that all correctly written recursive methods test for a condition that terminates the recursion when the method reaches the base case, as in the following example:
int factorial(int n)
{
    if (n <= 1) // the base case, terminates recursion
        return 1;
    else
        return n * factorial(n - 1); // the recursive call
}
Recursion is more than an academic exercise. It is an essential tool for implementing algorithms that take advantage of the technique known as divide and conquer, where a complex problem is repeatedly divided into similar (but smaller) problems until a simple case (the base case) is reached. The biggest practical problem with recursive calls (and with the general approach of divide and conquer) is that running out of memory or exceeding the number of calls allowed by a framework is very easy to do.
The Fibonacci sequence of numbers is generated by adding the numbers 0 and 1 (equals 1), then adding 1 to 1 (equals 2), then adding 1 to 2 (equals 3) and so on as per the following sequence:
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, …
An iterative algorithm for generating the first n numbers in this sequence might be something like:
int FibonacciIterative(int n)
{
    if (n <= 1)
        return n;

    int firstNumber = 0;
    int secondNumber = 1;
    int result = 0;

    for (int i = 2; i <= n; i++)
    {
        result = secondNumber + firstNumber;
        firstNumber = secondNumber;
        secondNumber = result;
    }
    return result;
}
In contrast, a recursive algorithm requires much less code and is easier to read:
int FibonacciRecursive(int n)
{
    if (n <= 1)
        return n;
    return FibonacciRecursive(n - 1) + FibonacciRecursive(n - 2);
}
If these classic textbook examples aren't sufficiently inspiring you could look to the spectacular 1997 chess match where a computer named Deep Blue convincingly beat the world champion Garry Kasparov. This was the first time ever that a computer beat a world champion in match play. It is also a fine example of recursion because at the heart of all chess-playing programs is the minimax algorithm, a recursive algorithm that searches the move space of a chess game looking for optimal moves. Without this algorithm it is doubtful that computer chess would have been as advanced as it was in 1997.
At the heart of object-oriented programming is an emphasis on modeling the entities of a domain as a collection of objects. These objects combine data with behavior and perform operations on data when prompted by a message, a method call, or an event that triggers an action.
Object-oriented programming is one of the most fundamental areas of programming knowledge to have when you are developing line-of-business applications. If you don't know the difference between a class and an object then you will be in trouble at most interviews.
Let's get the most obvious question out of the way. What exactly is the difference between a class and an object?
Very informally, a class is like a blueprint, and an object is the thing you get by following the blueprint.
Slightly more formally, an object is an instance of a class. You can't create an object without a class. When a program runs, objects spring into existence and are eventually destroyed, whereas classes normally exist throughout the entire run-time of an application.
// Here is the definition of a class called MyClass
public class MyClass
{
    // This class contains a method that does something
    public void DoSomething()
    {
        Console.WriteLine("Don't panic");
    }
}

// And here is another class that creates an instance
// of MyClass so it can do something...
public class MyOtherClass
{
    public void UseMyClass()
    {
        // Here we create an object
        var foo = new MyClass();

        // And now we can do something with the newly created object
        foo.DoSomething();
    }
}
In practice, the difference between a class and an object can be somewhat academic, because declaring classes in such a way that they can be used as if they were objects is perfectly possible (and reasonably common).
public class MyClass
{
    public static void DoSomething()
    {
        Console.WriteLine("Don't panic");
    }
}

// And here is another class that uses MyClass
// directly; no object is explicitly created.
public class MyOtherClass
{
    public void UseMyClass()
    {
        // We don't need an object to use the static method
        MyClass.DoSomething();
    }
}
Notice that the method DoSomething has been declared as static, which allows it to be used without first creating an object. In this example I can still create an instance of the class even though I don't need to. If I were writing a utility class then I would normally also declare the class as static because it tells the compiler that this class may not be inherited and that all the methods in this class will also be static.
Inheritance is simultaneously one of the greatest strengths and one of the greatest weaknesses of object-oriented programming. It provides for very strong modeling of a domain, and it also provides for effective code reuse where a derived class can take advantage of inherited methods and properties. It isn't all roses, however, because in practice it turns out that naïve enthusiasm for inheritance leads to complex class hierarchies where the behavior of a program becomes difficult to predict, where debugging is far from easy, and where the correctness of a program is impossible to prove. To be fair, the same might be said of any sufficiently complex program, but complex hierarchies of inheritance have contributed to some of the worst headaches in my programming career to date. Untangling and making sense of a complex hierarchy of inheritance can take days, and for that reason I advise programmers to consider composition before inheritance. Here is an example of the difference:
// Foo inherits from Bar
// Foo *is a* Bar
public class Foo : Bar
{
    ...
}

// Foo is composed of a Bar
// Foo *has a* Bar
public class Foo
{
    Bar bar = new Bar();
    ...
}
How does composition help? For one thing, the maintenance programmer no longer needs to worry about whether changes in a base class will break the functionality of derived classes somewhere along the chain of inheritance. You can switch out the use of Bar in the Foo class for something else and be confident that changes are localized to the Foo class, with no unpleasant ripples downstream in descendants of the class.
A typical textbook example of polymorphism describes how a cat and a dog are both animals and yet the dog barks while the cat meows. You usually see examples like this:
public class Animal
{
    public virtual string MakeNoise()
    {
        return string.Empty;
    }
}

public class Cat : Animal
{
    public override string MakeNoise()
    {
        return "Meow!";
    }
}

public class Dog : Animal
{
    public override string MakeNoise()
    {
        return "Woof!";
    }
}

class Program
{
    static void Main()
    {
        var cat = new Cat();
        var dog = new Dog();
        Console.WriteLine("Cat goes " + cat.MakeNoise());
        Console.WriteLine("Dog goes " + dog.MakeNoise());
    }
}
And when you run this (silly) program you get
Cat goes Meow!
Dog goes Woof!
Examples like this might lead you to two conclusions: that polymorphism is somehow connected with animals and that polymorphism implies overriding, as shown here with the MakeNoise method. The connection with animals is common in textbooks but merely coincidental. The connection with overriding is also common but is not essential for polymorphism. You can also have polymorphism when classes implement an interface or inherit an abstract class. In both of those cases the subclasses are not overriding the method of a base class but still provide different implementations of a method and are therefore polymorphic. Here is an example of polymorphism without overriding:
interface IAnimal
{
    string MakeNoise();
}

public class Cat : IAnimal
{
    public string MakeNoise()
    {
        return "Meow!";
    }
}

public class Dog : IAnimal
{
    public string MakeNoise()
    {
        return "Woof!";
    }
}

class Program
{
    static void Main()
    {
        var cat = new Cat();
        var dog = new Dog();
        Console.WriteLine("Cat goes " + cat.MakeNoise());
        Console.WriteLine("Dog goes " + dog.MakeNoise());
    }
}
The word most commonly associated with encapsulation is hiding. Things that are hidden include the implementation details of a class and the data that belongs to a class. If a class exposes data members, encapsulation means that these members are accessed via getters and setters (called accessors) rather than directly as public fields.
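Accessors let a class enforce its own rules instead of exposing a raw field. Here is a short Python sketch (Python's property mechanism plays the role of getters and setters; the `Account` class is a made-up example, not from the text):

```python
class Account:
    """The balance is hidden behind accessors and can never go negative."""

    def __init__(self, balance):
        self._balance = balance  # "private" by convention, not directly exposed

    @property
    def balance(self):           # getter
        return self._balance

    @balance.setter
    def balance(self, value):    # setter enforces the class's invariant
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

acct = Account(100)
acct.balance = 50
print(acct.balance)  # 50
```

Callers use `acct.balance` as if it were a plain field, yet every assignment passes through the validating setter; that is encapsulation doing its job.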
Encapsulation encourages programmers to think of objects as black boxes. Data goes in, data comes out, and the processing performed by an object is hidden from the programmer. This approach can be very effective in simplifying the model of a system, because the programmer does not need to hold so many implementation details in her head, but when things go wrong—a bug in the black box perhaps—then this approach can make debugging harder. For this reason the “1.0” release (the first version) of a framework is often treated with suspicion by many experienced programmers.
Functional programmers think about problems and model their solutions differently from object-oriented programmers. Some key differences include a preference for pure functions (functions with no side effects) over methods that mutate state, for immutable data over objects that change over time, and for treating functions as first-class values that can be passed to and returned from other functions.
It is said that functional programs scale up better than object-oriented programs. One reason for this scalability is that pure functions avoid shared mutable state, which sidesteps the complications associated with concurrent operations on data, such as race conditions and thread-safety issues.
You can't go very far into a software development career without dealing with a requirement for data storage and retrieval, and you can't go very far into the requirement for data storage and retrieval without encountering some form of database. One of the earliest and certainly one of the most widespread forms of database is the relational database management system, or RDBMS. SQL is the language a programmer uses to manage data stored in an RDBMS. If you are new to SQL I have some good news and some bad news: The good news is that there are just four statements (SELECT, INSERT, UPDATE, and DELETE) that will allow you to retrieve, add, update, and delete data. The bad news is that every major RDBMS vendor (including Oracle and Microsoft) has its own dialect of SQL, so some of what you learn about one RDBMS will not transfer directly to another.
A cornerstone of RDBMS is adherence to ACID, a set of properties that are intended to guarantee operations on a database are processed reliably:

Atomicity: a transaction either completes in its entirety or has no effect at all.
Consistency: a transaction takes the database from one valid state to another valid state.
Isolation: concurrent transactions do not interfere with one another.
Durability: once a transaction is committed, its effects survive crashes and power failures.
By far the biggest adjustment a programmer must make when learning SQL is to think about processing data in sets rather than iteratively. This does not necessarily mean writing monstrous, impenetrable 100-line SQL statements, but it does mean thinking in terms of partitioning, selecting, and processing sets of data in preference to selecting and processing individual rows one at a time. Apart from anything else, RDBMSs are excellent at optimizing set-based SQL statements. If you process one row at a time the RDBMS can do little to optimize your code.
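The set-based mindset can be demonstrated with any RDBMS. Here is a small sketch using Python's built-in sqlite3 module and an in-memory database (the table and the salary figures are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO staff VALUES (?, ?)",
                 [("Fred", 30000), ("Wilma", 40000), ("Barney", 25000)])

# Row-at-a-time thinking would fetch every row, test it, and update each one
# individually. Set-based thinking describes the whole set in one statement
# and lets the RDBMS optimize the work:
conn.execute("UPDATE staff SET salary = salary + 5000 WHERE salary < 35000")

for name, salary in conn.execute("SELECT name, salary FROM staff ORDER BY name"):
    print(name, salary)
# Barney 30000
# Fred 35000
# Wilma 40000
```

One UPDATE statement replaces an entire fetch-test-update loop, and the database engine is free to use indexes and other optimizations that a row-by-row loop would defeat.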
Developing software for the World Wide Web requires that a programmer (or a team of programmers) possess a diverse set of skills. So many technologies are involved that a new term has been invented to describe a programmer who has competency in all of them: a full stack web developer.
A full stack web developer is a programmer who might one day code up a responsive web page with HTML and CSS, and the next day write JavaScript code with jQuery and Ajax, and the next day write a stored procedure to optimize a data retrieval routine. I've barely scratched the surface; full stack web development can also require knowledge of:
The technology that underpins the Web, by which I mean primarily the Hypertext Transfer Protocol (HTTP), was not designed to be stateful. It was designed for the transmission of independent requests and responses, and in itself it has no mechanism for retaining information about a session. This is what people mean when they say “The Web is stateless.”
Fortunately, or unfortunately, depending how you look at it, every modern programming framework provides a range of ways in which a façade of session state can be maintained. Cookies, hidden form values, and many forms of server-side session management all contribute to the façade of a stateful user session.
All mechanisms for maintaining session state struggle with the same fundamental problem; they all struggle with user behavior. Users as a whole do not sit idle while waiting for a server to respond. They load up new tabs and do something else, or they lose interest and close their browsers altogether. The server doesn't know whether the user is waiting or has gone; it can only make assumptions based on what was in the last transmission from the client browser and how much time has passed.
You can't fully appreciate a regular expression until you have read it in the original Klingon.
Many programmers have a love/hate relationship with regular expressions. They love them for their expressive power, and they hate them for being hard to read. They love them for their all-round usefulness, and they hate them for being unforgiving of mistakes. No matter where you stand on the love/hate spectrum, you need to have a good knowledge of regular expressions for the programming interview.
At their most basic, regular expressions are a fancy way of finding strings in a body of text. They are nothing to be scared of, at least not until someone hits you with a regular expression like this one:
/^([\w!#$%&'*+-/=?^`{|}~]+\.)*[\w!#$%&'*+-/=?^`{|}~]+@((((([a-z0-9]{1}[a-z0-9-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(:\d{1,5})?)$/i
Incidentally, this regex validates an e-mail address, although not perfectly.
One of the first hurdles facing a regex novice is gaining the ability to recognize the constituent parts of a regular expression. It might be composed of literal text, or it might be composed of tokens that represent character classes. It might contain anchors, quantifiers, groups, and assertions. Proficiently reading and writing complex regular expressions takes practice, but the basics are easily mastered, and I cover them here.
Both JavaScript and the .NET framework have adopted a flavor of regular expression syntax closely modeled on the syntax found in Perl. Perl predates both JavaScript and .NET, but still survives as a popular language for getting things done. To illustrate some of the basics of regular expressions I have written a Perl script, shown next. It reads from STDIN, creates a case-insensitive regular expression from the first line of input, and evaluates the remaining lines against that regular expression. By feeding a variety of text files to this script you can see how various regular expressions match or don't match the corresponding samples.
use strict;
use warnings;

my $re;
my @match;
my @nonmatch;

for (<>) {
    chomp;
    unless ($re) {
        $re = $_;
        print "The regular expression /$_/i matches as follows\n";
        next;
    }
    if (/$re/i) {
        push @match, $_;
    } else {
        push @nonmatch, $_;
    }
}

print "MATCHES\n", join("\n", @match), "\n" if @match;
print "DOES NOT MATCH\n", join("\n", @nonmatch), "\n" if @nonmatch;
I then run the script as follows:
C:\>perl Regex.pl RegexTest1.txt
The file RegexTest1.txt contains a list of strings for testing against the regular expression (the first line in this file is interpreted as a regular expression).
When run, the script outputs two lists: matching and non-matching lines.
The regular expression /great/i matches as follows
MATCHES
!!GreaT!!
GrEaT
great
Great
GrEaT
greatful
ungreatful
UNGREATFUL
UNgreatFUL
DOES NOT MATCH
grate
grated
grea
greet
reat
ungrateful
You can see that a regular expression will match a pattern anywhere it can find it. It matches the “great” in “!!Great!!” even though the word is surrounded by exclamation marks. If you want to find the word “great” but not the (misspelled) word “ungreatful” then you need to use anchors or word boundaries.
You can think of anchors and word boundaries in regular expressions as giving context to a search pattern. In the previous example you saw that searching for /great/ will match ungreatful. There are at least two ways to change that behavior; you can use an anchor to say (for example) that you want to match only strings that begin with great, such as great, greatly, or great!!. Here is what that regular expression would look like:
^great
The caret character (^) is an anchor that tells the regular expression to match only at the start of the string. If you wanted to anchor the pattern at the end of the string you would use the dollar character ($) like so:
great$
This example would match !!great but not great!!.
If you want to match text that contains nothing but the word great you can anchor at both the start and at the end.
^great$
In passing, it is worth noting that these anchors have no “width.” In the pattern they are effectively zero characters long, only serving to indicate where the pattern should be matched in a string. Another noteworthy zero-width meta-character is the word boundary represented by \b, as in the following example:
\bgreat\b
This matches great in both !!great and great!! but not in ungreatful because no word boundaries exist around the pattern great.
Here are some more examples using the Perl script shown previously:
The regular expression /^foo/i matches as follows
MATCHES
FOO BAR
foo, bar
FOO!
FOOBAR
foody bar
DOES NOT MATCH
 foo foofoo
The regular expression /Bar$/i matches as follows
MATCHES
Foo bar
!bar
foo-bar
great!bar
DOES NOT MATCH
BAR!
bard's tale
bar's tale
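These behaviors are easy to verify in any Perl-compatible engine. Here is a minimal sketch using Python's re module, which supports the same anchor and word-boundary syntax (the test strings are my own, chosen to mirror the examples above):

```python
import re

# ^ anchors the match at the start of the string
assert re.search(r"^great", "greatly") is not None
assert re.search(r"^great", "!!great") is None

# $ anchors the match at the end of the string
assert re.search(r"great$", "!!great") is not None
assert re.search(r"great$", "great!!") is None

# \b is a zero-width word boundary on either side of the pattern
assert re.search(r"\bgreat\b", "!!great") is not None
assert re.search(r"\bgreat\b", "ungreatful") is None
```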
Quite often you will want to match a set of characters rather than one particular character. If you want to find grey and gray then you could use the expression
/gr[ea]y/
The square brackets and the two characters inside them are a character class that matches either e or a. You can also specify a range of characters in a character class, as follows:
/gr[a-z]y/
This matches any single character in the alphabet, a, b, c, and so on through to z, but because it only matches one character it will not match greay.
Here are some more examples of matching with character classes:
The regular expression /gr[ea]y/i matches as follows
MATCHES
gray
grey
GREY
gray?!
##grey
DOES NOT MATCH
greay
graey
gravy
The regular expression /gr[oa][oe]vy/i matches as follows
MATCHES
groovy
begroovy!
DOES NOT MATCH
gravy
more gravy?
gravy boat!
You can also use some predefined character classes. A single dot represents the class of “any character except a newline.” Table 5.3 shows some of the most important classes you should know about.
Character Class | Interpretation |
. | Any character except a newline |
\d | Any digit, including non-ASCII digits (for example, Arabic) |
\D | Any non-digit |
[A-Za-z0-9] | Any character of the alphabet or any digit |
\w | Any “word” character: alphanumeric characters plus the underscore |
\W | Any character not in the class \w |
\s | Any whitespace character: spaces, tabs, and newlines |
\S | Any character not in the class \s |
Here is another example showing how the character class \w matches underscores and numbers in addition to characters of the alphabet:
The regular expression /gr\w\wvy/i matches as follows
MATCHES
groovy
begroovy!
groovyfruits
gra_vy
gr__vy
gr00vy
DOES NOT MATCH
gravy
more gravy?
gravy boat!
gr!!vy
gr..vy
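As a quick cross-check of the character classes in Table 5.3, here is a small sketch in Python, whose re module uses the same class tokens (the sample strings are my own):

```python
import re

assert re.search(r"\d", "room 101")                       # \d matches a digit
assert re.search(r"\w+", "gr00vy").group() == "gr00vy"    # \w: letters, digits, underscore
assert re.search(r"\s", "no whitespace?") is not None     # the space matches \s
assert re.fullmatch(r"gr\w\wvy", "gr_0vy")                # underscore and digit both match \w
assert not re.fullmatch(r"gr\w\wvy", "gr!!vy")            # punctuation does not match \w
```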
In addition to specifying characters and character classes, you can also specify how many (or how few) instances of a character or class you want to match.
Here is an example of matching zero or more occurrences of the character “a” in the word gravy:
/gra*vy/
This matches gravy, but it will also match grvy (zero occurrences) and graaaaaaaaavy (more than zero occurrences).
Table 5.4 lists the quantifiers you should know about.
Quantifier | Interpretation |
* | Match 0 or more times |
+ | Match 1 or more times |
? | Match 1 or 0 times |
{n} | Match exactly n times |
{n,} | Match at least n times |
{n,m} | Match at least n but not more than m times |
Here are more examples of how you can use quantifiers:
The regular expression /gra*vy/i matches as follows
MATCHES
grvy
gravy
more gravy?
gravy boat!
DOES NOT MATCH
groovy
begroovy!
groovyfruits
gra_vy
The regular expression /gra+vy/i matches as follows
MATCHES
gravy
more gravy?
gravy boat!
DOES NOT MATCH
grvy
groovy
begroovy!
groovyfruits
gra_vy
Sometimes just matching a pattern isn't enough. Sometimes you need to extract parts of the text in order to do something with them. You might have a log file that contains values you want to capture for analysis, or you might want to find and print all the links in a page that contain the word “Dilbert,” or you might want to capture the thing my dog doesn't have:
/My dog has no (\w+)/i
The regular expression /My dog has no (\w+)/i matches as follows
MATCHES
My dog has no nose (captured 'nose')
My dog has no canines (captured 'canines')
My dog has no k9s (captured 'k9s')
My dog has no nose, ask me how he smells (captured 'nose')
DOES NOT MATCH
My dog has no !*?#!! (nothing captured)
You can supply alternatives in a pattern by using the vertical bar meta-character. If you want to match either fit or fat, for example, you could use this pattern:
/fit|fat/
If you want to match these words within a sentence you need to surround the alternatives with parentheses, just to limit the scope of the alternation:
/trying to get (fit|fat)/
If you don't want to capture anything you can surround the alternation with non-capturing parentheses:
/trying to get (?:fit|fat)/
Another reason you might want to capture something from a match is to replace it with something else. In Perl you can replace fit with fat using the substitution operator, like this:
s/fit/fat/
You could replace either fit or fat with paid using:
s/fit|fat/paid/
If you want to include the matched word in the replacement text, you can do it with a backreference. In Perl, that looks like
s/trying to get (fit|fat)/I got $1/
and it has the following effect:
Before: "trying to get fit"
After:  "I got fit"
Before: "trying to get fat"
After:  "I got fat"
$1 refers to the first captured group. If you had more groups you would use $2, $3, and so on.
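For comparison, Perl's s/// operator corresponds to re.sub in Python, where a captured group is referenced in the replacement as \1 rather than $1. A minimal sketch:

```python
import re

# Replace the whole phrase, reusing the captured alternative
before = "trying to get fit"
after = re.sub(r"trying to get (fit|fat)", r"I got \1", before)
assert after == "I got fit"

# A second group would be \2, and so on; here the two groups are swapped
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
assert swapped == "world hello"
```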
A common “gotcha” is failing to account for the greedy nature of regular expression pattern matching. The following regular expression is an attempt at matching any number of characters between two delimiters.
The expression
/".*"/
when matched against the string
I "fail" to "see" the irony
might be expected to match
"fail"
but in fact it matches from the first quotation mark right up to the last quotation mark in the string:
"fail" to "see"
This (default) behavior of regular expressions is described as greedy matching. It can be surprising if you aren't aware of it. One effective way to circumvent this greediness is to avoid using the “any character” class (the dot) and instead specify that after the first delimiter you want to match anything except the delimiter, up to the very next occurrence of the delimiter.
To say “anything except” you use the caret character inside a character class, which inverts the class.
So this expression
/"[^"]*"/
can be interpreted as “a quotation mark, followed by any number of characters that are not quotation marks, followed by a closing quotation mark.”
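A short sketch in Python makes the difference between the greedy and the restricted pattern concrete (the lazy .*? variant shown last is an alternative not discussed above):

```python
import re

s = 'I "fail" to "see" the irony'

# The default greedy .* runs all the way to the LAST quotation mark
assert re.search(r'".*"', s).group() == '"fail" to "see"'

# A negated character class stops at the very next quotation mark
assert re.search(r'"[^"]*"', s).group() == '"fail"'

# Appending ? makes a quantifier lazy, which also avoids the over-match
assert re.search(r'".*?"', s).group() == '"fail"'
```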
Perhaps even more problematic is the scenario where a regular expression that works in one or two cases is assumed to work in all cases. Here are some flawed regular expressions you might see in the wild.
This first one is supposed to ensure a calendar date is valid:
The regular expression /\d{1,2}\/\d{1,2}\/\d{4}/ matches as follows
MATCHES
01/01/1999  # ok
31/02/1999  # invalid
10/10/0000  # invalid
0/0/0000    # invalid
99/99/9999  # invalid
This next one tries to validate e-mail addresses, but incorrectly rejects addresses at the 37signals.com domain. It also allows some dodgy e-mails to pass validation:
The regular expression /\w+@[A-Za-z_]+?\.[A-Za-z]{2,6}/i matches as follows
MATCHES
joe@example.com.
a@b@example.com
DOES NOT MATCH
joe@37signals.com
This one tries to match HTML tags, but fails almost instantly:
The regular expression /(<\/?[^>]+>)/ matches as follows
MATCHES
<a title=">>click here!" href="goofy.com">
(The incorrect match is the portion <a title=">, because the character class [^>] stops at the first > inside the attribute value.)
Finally, if you want to see if a number is in the range 1–50, please use the normal comparison operators, and never do this:
/^[1-9]$|^[1-4][0-9]$|^50$/ # don't ever do this
This regular expression works, but it is far less readable than the equivalent code that uses normal comparison operators:
if (number > 0 && number <= 50)
By far the best reference for regular expressions is the book Mastering Regular Expressions by Jeffrey Friedl (O'Reilly Media, 2006). Although I have covered some of the basics of regular expressions you will want to refer to Friedl's book for many more interesting examples, along with an in-depth examination of using regular expressions in Perl, Java, .NET, and PHP.
Some types of problems are really hard. You might think reading a complicated regex is difficult, but that is nothing compared to the hardest problems in the domain of computer science. For instance, finding factors for large integers (of say 200 digits) is widely regarded as very difficult, and finding factors that are also large prime numbers is even harder. The best-known algorithms for finding these large prime factors are still impractically slow (think years) to run on even the fastest computers. Starting with two large prime numbers and multiplying them to produce a large integer is easy, but reversing the process is very hard. In fact, the problem of finding these prime factors for a large integer is presumed to be so difficult that it is a key part of modern cryptographic applications. Quite simply, the kind of technology an attacker would need to crack these cyphers does not exist except in theory and in experimental forms of computing that rely on quantum mechanical phenomena.
Finding large prime factors is just one example of the class of problems that have no known solutions that run in polynomial time, meaning that for practical purposes they are impossible to solve. These are not the kind of problems you would be given as CS101 homework. More to the point, these are not the problems you would be asked to solve at an interview. At the interview you will only need to recognize this kind of problem so you can avoid the trap of trying to solve it while standing at a whiteboard. Interviews normally last an hour or two; brute-forcing a solution to an NP-Complete or NP-Hard problem might take several million years.
What follows is a list of questions related to the topics in this chapter. You can use these questions to identify areas you need to brush up on. Most programmers will not find these questions too difficult, and you will find that you either know the answer or you don't. Avoid over-thinking the answers.
class BinaryTree
{
    public int Value;
    public BinaryTree Left { get; set; }
    public BinaryTree Right { get; set; }
}
bool BreadthFirstSearch(BinaryTree node, int searchFor)
class BinaryTree
{
    public int Value;
    public BinaryTree Left { get; set; }
    public BinaryTree Right { get; set; }
}
bool DepthFirstSearch(BinaryTree node, int searchFor)
public static List<string> Permutations(string str)
var myValue = myHashtable[myKey];
bool keyExists = myHashtable.Contains(myKey); // quick!
for (int i = 0; i < myArray.Length; i++)
    if (myArray[i] == myKey)
        return true;  // Found key
return false;         // Did not find key (and took a long time)
class BinaryTree
{
    public int Value;
    public BinaryTree Left { get; set; }
    public BinaryTree Right { get; set; }
}
bool BreadthFirstSearch(BinaryTree node, int searchFor)
bool BreadthFirstSearch(BinaryTree node, int searchFor)
{
    Queue<BinaryTree> queue = new Queue<BinaryTree>();
    queue.Enqueue(node);
    while (queue.Count > 0)
    {
        BinaryTree current = queue.Dequeue();
        if (current == null)
            continue;
        queue.Enqueue(current.Left);
        queue.Enqueue(current.Right);
        if (current.Value == searchFor)
            return true;
    }
    return false;
}
class BinaryTree
{
    public int Value;
    public BinaryTree Left { get; set; }
    public BinaryTree Right { get; set; }
}
bool DepthFirstSearch(BinaryTree node, int searchFor)
bool DepthFirstSearch(BinaryTree node, int searchFor)
{
    if (node == null)
        return false;
    if (node.Value == searchFor)
        return true;
    return DepthFirstSearch(node.Left, searchFor)
        || DepthFirstSearch(node.Right, searchFor);
}
class BinaryTree
{
    public int Value;
    public BinaryTree Left { get; set; }
    public BinaryTree Right { get; set; }
}
static void DFSPreOrder(BinaryTree node)
{
    if (node == null)
        return;
    visit(node);
    DFSPreOrder(node.Left);
    DFSPreOrder(node.Right);
}
static void DFSInOrder(BinaryTree node)
{
    if (node == null)
        return;
    DFSInOrder(node.Left);
    visit(node);
    DFSInOrder(node.Right);
}
static void DFSPostOrder(BinaryTree node)
{
    if (node == null)
        return;
    DFSPostOrder(node.Left);
    DFSPostOrder(node.Right);
    visit(node);
}
bool DepthFirstSearchIterative(BinaryTree node, int searchFor)
{
    if (node == null)
        return false;
    Stack<BinaryTree> nodeStack = new Stack<BinaryTree>();
    nodeStack.Push(node);
    while (nodeStack.Count > 0)
    {
        BinaryTree current = nodeStack.Pop();
        if (current.Value == searchFor)
            return true;
        // Push Right first so that Left is processed first
        if (current.Right != null)
            nodeStack.Push(current.Right);
        if (current.Left != null)
            nodeStack.Push(current.Left);
    }
    return false;
}
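To make the different visit orders concrete, here is a small sketch in Python, using nested tuples in place of the BinaryTree class (the tree shape is my own example):

```python
from collections import deque

# A tiny binary tree:    1
#                       / \
#                      2   3
#                     / \
#                    4   5
tree = (1, (2, (4, None, None), (5, None, None)), (3, None, None))

def pre_order(node, out):
    if node is None:
        return
    value, left, right = node
    out.append(value)        # visit the node, then its children
    pre_order(left, out)
    pre_order(right, out)

def breadth_first(root):
    out, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            continue
        value, left, right = node
        out.append(value)
        queue.append(left)   # children wait their turn in the queue
        queue.append(right)
    return out

visited = []
pre_order(tree, visited)
assert visited == [1, 2, 4, 5, 3]              # depth-first, pre-order
assert breadth_first(tree) == [1, 2, 3, 4, 5]  # level by level
```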
public static List<string> Permutations(string str)
public static List<string> Permutations(string str)
{
    // Each permutation is stored in a list of strings
    var result = new List<string>();

    // The base case...
    if (str.Length == 1)
        result.Add(str);
    else
        // For each character in the string...
        for (int i = 0; i < str.Length; i++)
            // For each permutation of everything else...
            foreach (var p in Permutations(EverythingElse(str, i)))
                // Add the current char + each permutation
                result.Add(str[i] + p);

    return result;
}

// Return everything in a string except the char at indexToIgnore
private static string EverythingElse(string str, int indexToIgnore)
{
    StringBuilder result = new StringBuilder();
    for (int j = 0; j < str.Length; j++)
        if (indexToIgnore != j)
            result.Append(str[j]);
    return result.ToString();
}
Size of String | Number of Permutations |
1 | 1 |
2 | 2 |
3 | 6 |
4 | 24 |
5 | 120 |
6 | 720 |
7 | 5,040 |
8 | 40,320 |
9 | 362,880 |
10 | 3,628,800 |
11 | 39,916,800 |
12 | 479,001,600 |
13 | 6,227,020,800 |
14 | 87,178,291,200 |
15 | 1,307,674,368,000 |
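The counts in this table are simply n factorial. A quick sanity check, shown here in Python:

```python
import math
from itertools import permutations

# The number of permutations of a string of n distinct characters is n!
for n, expected in [(1, 1), (5, 120), (10, 3_628_800), (15, 1_307_674_368_000)]:
    assert math.factorial(n) == expected

# Equivalently, counting the permutations directly for a small case:
assert sum(1 for _ in permutations("abcde")) == 120
```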
public static List<int> GeneratePrimes(int n)
{
    var primes = new List<int>();
    int nextCandidatePrime = 2;
    while (primes.Count < n)
    {
        if (isPrime(nextCandidatePrime))
            primes.Add(nextCandidatePrime);
        nextCandidatePrime += 1;
    }
    return primes;
}

private static bool isPrime(int n)
{
    for (int i = 2; i < n; i++)
    {
        if (n % i == 0)
            return false;
    }
    return true;
}
private static bool isPrime(int n)
{
    for (int i = 2; i <= Math.Sqrt(n); i++)
    {
        if (n % i == 0)
            return false;
    }
    return true;
}
100000 primes generated in 00:00:01.3663523
private static bool isPrime(int n)
{
    if (n == 2)
        return true;
    if (n % 2 == 0)
        return false;
    for (int i = 3; i <= Math.Sqrt(n); i += 2)
    {
        if (n % i == 0)
            return false;
    }
    return true;
}
100000 primes generated in 00:00:00.7139317
public static List<int> GeneratePrimesOptimized(int n)
{
    var primes = new List<int>();

    // Prime our list of primes
    primes.Add(2);

    // Start from 3, since we already know 2 is a prime
    int nextCandidatePrime = 3;

    // Keep going until we have generated n primes
    while (primes.Count < n)
    {
        // Assume the number is prime
        bool isPrime = true;

        // Test if the candidate is evenly divisible
        // by any of the primes up to sqrt(candidate)
        for (int i = 0; primes[i] <= Math.Sqrt(nextCandidatePrime); i++)
        {
            if (nextCandidatePrime % primes[i] == 0)
            {
                isPrime = false;
                break;
            }
        }
        if (isPrime)
            primes.Add(nextCandidatePrime);

        // We proceed in steps of 2, avoiding the even numbers
        nextCandidatePrime += 2;
    }
    return primes;
}
100000 primes generated in 00:00:00.3538022
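The same trial-division-by-known-primes idea can be sketched in a few lines of Python, which is a convenient way to sanity-check the logic (this mirrors, rather than reproduces, the C# version above):

```python
import math

def generate_primes(n):
    """Generate the first n primes by trial division against known primes."""
    primes = [2]
    candidate = 3
    while len(primes) < n:
        is_prime = True
        for p in primes:
            # No divisor above sqrt(candidate) needs checking
            if p > math.isqrt(candidate):
                break
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
        candidate += 2   # skip the even numbers
    return primes

assert generate_primes(10) == [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```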
nnn.nnn.nnn.nnn
(?:[0-9]{1,3}\.){3}[0-9]{1,3}
10.10.1.22
999.999.999.999
(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
1234.255.255.2550 # Invalid, we don't want this
(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
3.12.120.202
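Putting the pieces together with anchors, here is a sketch in Python (the helper names OCTET and IP are my own):

```python
import re

# One octet: 250-255, 200-249, or 0-199 (with optional leading 0/1)
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"

# Anchor the whole pattern so nothing may precede or follow the address
IP = re.compile(r"^(?:" + OCTET + r"\.){3}" + OCTET + r"$")

assert IP.match("3.12.120.202")
assert IP.match("255.255.255.255")
assert not IP.match("999.999.999.999")    # octets out of range
assert not IP.match("1234.255.255.2550")  # anchors reject the extra digits
```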