Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

Carey ParkerFirewalls Don't Stop Dragonshttps://doi.org/10.1007/978-1-4842-3852-3_2

2. Cybersecurity 101

Carey Parker¹

(1)

North Carolina, USA

Before we can begin to discuss security, we really have to define some key computer terms and concepts. You do not have to memorize this stuff, and it’s okay if you don’t follow everything here. There’s a glossary at the end of this book that you can refer to for quick help, or you can return here if you want to refresh your memory. But for the rest of this book to make sense, I need to get you up to speed on the basics of how computers and the Internet work. I’m trying to cover all the bases here, so if you see a topic you already understand, feel free to skip it or just skim it. I’ve also taken the liberty of throwing in some fun little tidbits that will help to keep things interesting. There will be a wide range of people reading this book, and I just can’t take the time to fully explain everything. But in this chapter, I give you a solid base to work from.

Here Be Dragons

Most people have no frame of reference for computer security—it’s too abstract and too technical. I find that you can explain lots of aspects of computer security using the analogy of a castle. You have people and valuables you would like to protect from various types of attackers. To do this in the old days, people erected physical barriers like walls and moats. It turns out that these concepts work well to describe computer security, too. With that in mind, let’s talk about the fundamentals of computer security using the classic medieval stronghold as an analogy.

I Dub Thee…

Congratulations! Your king, in his infinite wisdom and generosity, has granted you a lordship and has given you huge tracts of land! You have been charged with populating and protecting this new territory, all the better to generate more tax revenue for your liege. To this end, you have been provided a fixed budget of gold and raw materials. Now what? What do you do?

Well, the first thing you need to do is to determine what and whom you need to protect. Obviously, you will need to protect your gold and other valuables. You will also need to protect the natural resources of your realm—farmlands, mines, quarries, forests, and sources of fresh water. You will want to control access to your realm, particularly the highways and waterways. The safety of the people within your realm is also paramount—not just because you’re a good person but because these people are needed to farm the land, work the mines, create the goods, pay the taxes, and populate your army and personal guard.

The next thing you need to determine is what and whom you need to protect your people and resources from. Unfortunately, the threats to your people and assets are many and varied. Not only do you need to consider malicious marauders and adversarial armies, but you must beware of feral fauna, perilous pestilence, and even crafty con men. (Oh my!) It’s also helpful to understand the resources of your opponent. Are they well-funded? What weapons do they possess? What attack techniques do they employ? Last but certainly not least, you need to understand the motivations of your attackers. Are they greedy or just hungry? Are they looking to make some sort of political or social statement? Do they have a reason to target you specifically?

You have a vast realm but only finite resources. You want to keep out the bad guys, but you must still allow for trade and travel. There are many possible attackers, but the risk of attack from each type of adversary is not the same. Likewise, while you have many different types of people and assets to protect, let’s face it—you’re of the strong opinion that some are more important than others (you and your gold, for instance). In short, you will need to make some trade-offs. Security is always about trade-offs. But now that we’ve done your threat analysis, you have a much better idea where to spend your limited resources.

As a consumer with a finite budget, you also need to evaluate your threats and decide where to spend your time and money. Luckily for you, I’ve done most of that work for you in this book!

It should be noted that if your attackers are much better funded than you are or have a lot more resources, then you’re pretty much screwed. For example, if the Russian mob or the National Security Agency (NSA) wants to specifically target you, as an individual, you’d be hard-pressed to stop them. In our analogy, this would be like trying to defend your realm against a dragon. Prior to June 2013, I think many people would have put the likelihood of ubiquitous NSA surveillance on par with that of the existence of dragons—something crazy people wearing tinfoil hats might have raved about but almost surely wasn’t true. But as the saying goes, just because you’re not paranoid doesn’t mean they’re not out to get you.

If our fictional lord were to buy into the existence of dragons and decide that he must therefore defend against them, what could he really do? No wall could be built high enough to stop a flying beast. Regular weapons would be mostly useless against such a powerful creature. You could certainly try to construct some sort of domed fortress and an array of mighty, dragon-sized weaponry, but you would probably go bankrupt in the process. Even if this could be done, how exactly would you test these defenses? As if that weren’t enough, it might well be that the act of attempting to create such a worthy stronghold would actually attract the attention of the dragon! Why else would someone build such a fortress unless they had something extremely valuable to protect (such as hordes of gold and jewels)? No, the only real hope of security in the face of such an adversary would be to hide in a deep, dark cave, strive to escape notice, and remain out of reach. But what sort of a life is that for your family and your subjects?

The point of this book (and the inspiration for the title) is that you shouldn’t be trying to defend yourself against dragons. In our analogy, you’re really not the lord—you’re a commoner. You don’t have a lot to protect compared to rich people or large corporations. And yet, what you have is yours, and you would like to take some reasonable, cost-effective measures to safeguard your family and possessions. Think about your current residence. You probably have tens of thousands of dollars’ worth of stuff inside—clothes, electronics, jewelry, and furniture. But what stands between burglars and all that stuff? Probably a simple key lock that costs $25 at the hardware store. Could it be picked? Yes, without a doubt (you’d be surprised at how easily standard door locks can be picked—I learned how to do it in about 15 minutes). And yet, in most neighborhoods, this simple lock is enough.

By the same token, your goal isn’t to protect your computer and its contents from the Russian mob or the NSA . The only way you could hope to prevent that would be to go “off the grid.” Just in case you’re curious, here’s how that might work. First, buy a new laptop from a big-box store. Before you power it up, open the laptop chassis and completely remove any hardware necessary for wireless communication (for example, the Wi-Fi and Bluetooth chips). Set up your new computer in a windowless basement. Line the walls and ceiling of the basement with wire mesh to block radio signals. Never, ever connect the laptop to the Internet or to any sort of peripheral device (mouse, printer, or anything with a USB plug). Now…how useful would that be? A computer that can’t connect to the Internet is almost useless today. But even if you could live with that, these drastic security measures still wouldn’t stop someone from simply breaking into your house or serving you with a warrant. If you’re a political dissident, corporate whistle-blower, or just uber-paranoid, then you need to read a different book. Oh, and you’d better buy the book using cash.

But the point here is that the Russian mob and the NSA are not likely to target you specifically. As we discussed earlier in the book, your most likely threats are e-mail scammers, hackers trying to access your financial accounts, and corporations or the government engaged in mass surveillance. For these threats, you can actually do quite a bit to protect yourself without spending a lot of time or money, and that is the focus of this book.

Prevention, Detection, and Recovery

Security generally has three phases: before, during, and after—also referred to as prevention, detection, and recovery. Prevention is about trying to keep something bad from happening, detection is about knowing definitively that something bad is happening or has happened, and recovery is about repairing or mitigating the damage after something bad happens. Obviously most people prefer to focus on prevention so that detection and recovery will be unnecessary. But no security plan is complete without all three.

Let’s return to our newly minted lord and the defense of his realm. The most common defense mechanism in medieval times was the venerable castle. It’s not practical to build a 75-foot-high wall around the entire realm, so a lord would build walls around the cities (particularly the one in which he and his family reside). Castles were usually built on high hills to better monitor the surrounding countryside for incoming threats and to put attacking armies at a disadvantage. In times of war, the peoples of the realm could come within the walls for protection. Castles were built mostly to defend against people on foot or on horseback. There were no airplanes, so you didn’t really need to protect yourself from the air, but the walls had to be high enough to stop attackers with ladders, arrows, and catapults. In addition to walls, castles often had moats or trenches in front of the walls to make it difficult for opposing armies to attack or scale the walls. Because people still needed to come and go, the walls required at least a few openings—so you would fortify these openings with large wooden or iron gates, sometimes multiple gates per opening. Armed guards, charged with evaluating who would be allowed to pass through (in either direction), would protect these openings. The walls and grounds were also patrolled by guards, trying to identify bad people who may have gotten through the outer defenses.

Castles had defense in depth . That is, they didn’t have just one mechanism for keeping out attackers; they had many. Each mechanism was different, not just because different attackers had different capabilities but also to diversify the overall fortifications in case the attacker found some clever way to neutralize one of the defenses. Some defenses were passive, like the walls and the moat. Once in place, they required little to no maintenance. Other defenses were active, like the castle guard. The guards could think for themselves, take orders, and evaluate and respond to individual threats on the fly. Once given comprehensive orders and training, though, even the guards could function pretty well on their own.

It turns out that setting up defenses for your computer is similar in many ways to constructing a castle. First, your main gate is your Internet router. This is what separates you from the outside world, from both good people and bad. You need to have access to the outside world to get your food, water, and supplies, as well as to sell your goods, but you have to be careful not to let the really bad guys inside. Your router has a built-in mechanism for this called a firewall . Using the castle analogy, the firewall allows people from the inside to get out and come back, but it prevents all foreigners from entering unless you say it’s okay. That is, the firewall allows your computer to initiate connections to the Internet (like google.com) and knows to allow the thing you contacted to reply (like giving you back your search results). But if something on the Internet tries to initiate contact with your computer, it refuses the connection (or simply ignores it), unless you’ve given it explicit instructions to allow the connection. By default, almost all external connections are refused by firewalls.

Antivirus (AV) software is sort of like your armed guards. They actively watch your system for suspicious behavior , and if they find it, they try to stop it. Often the AV software will then make you aware of the suspicious behavior and ask if you want to allow it. But if it’s sure that it has found something bad, antivirus software will usually just go ahead and neutralize it for you.

But not all threats to your computer come from the Internet. Gaining physical, in-person access to your computer, while not as likely, is still a real threat. In this case, your “castle walls” are made of 2x4s and drywall, and your “moat” is really just a well-lit yard with some thorny shrubbery in front of the windows. As long as your computer is inside your home or some place you trust, you’re counting on the locked doors, walls, and such to keep bad people away. However, if you have a laptop and you take it out of your home, then you probably have a lot less physical protection—maybe a locked car or just a laptop bag. At that point, you are the best protection—keeping the laptop with you, in a bag, probably over your shoulder. For laptops, you really need to set a password for your computer so that if someone does manage to steal or gain physical access to it, they can’t actually access your data. You should also encrypt your hard drive, particularly on laptops. There are sneaky ways to get at the laptop data without having to log in, so encrypting the data on your drive will make that data useless if the attacker manages to circumvent the log in process.

This is defense in depth. You should have multiple layers of security for your computer and the data it contains. The more layers, the more secure you’ll be. This book will help you add those layers.

We won’t spend a lot of time in this book on detection. Actively determining that your computer is being attacked or has recently been attacked is tricky to do. Though you can buy some software that will monitor your personal computer for suspicious activity (usually by watching outgoing network traffic), most intrusion detections systems (IDSs) are typically marketed toward owners of large, public servers. Intrusion detection systems are similar to firewalls, but whereas firewalls are generally used to prevent intrusion in the first place, an IDS will monitor background activities and communications on the computer for signs that the attackers have already gotten in and are trying to do bad things.

In our analogy, castle guards are charged with monitoring the people moving around within the walls, looking for suspicious behavior and illicit communications with outside agents. When intruders are identified, alarms are sounded, and the rest of the guard is put on high alert until the threats have all been identified and neutralized. If regular guards are like antivirus software, then the elite castle guards are more like an intrusion detection system, but the two functions are similar, mainly different in scope and complexity.

While not particularly practical (yet) for home computers, the role of detection in security is quite important in other areas that we see every day. For example, the sealing wrappers and pop-up lids on food products are there to tell you whether the jar or bottle has already been opened. The paper or foil seals covering the openings of over-the-counter medicine were implemented as a direct result of the Tylenol murders in Chicago in the early 1980s. The seal is there not to prevent tampering but to make tampering evident. If you see the seal has been broken, you know someone has been messing with it, and therefore you shouldn’t use it. In the old days, wax seals were used to secure important letters—not to prevent them from being opened but to make it obvious if someone had opened the letter prior to the intended recipient.

Despite your best efforts, you may not be able to prevent all attacks, so you also need a plan for handling the case where something bad does happen. In some cases, if you know your security has been compromised, you can take steps to mitigate the impact. You can run malware removal tools, change your passwords, close your accounts, or remotely wipe your data (in the case of a stolen computer or device). Sometimes your only option is to replace what was lost and move on with your life.

Computer Lingo

Computers are everywhere. I don’t just mean desktop computers and laptops, either. Just about every piece of modern electronics has some sort of computer chip in it, and we’re now entering the era where all of these computers are interconnected: appliances, thermostats, TVs, radios, even toasters (I’m not joking). Despite their ubiquity, computers are not well understood—and for good reason: they’re phenomenal, intricate pieces of engineering!

My first computer was a Texas Instruments TI-99/4A (see Figure 2-1).

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig1_HTML.jpg — Figure 2-1
Texas Instruments TI-99/4A personal computer (circa 1981)

This computer had only 16KB of memory. My daughter’s first iPod had more than 250,000 times as much memory! A guy named Gordon Moore (cofounder of Intel) wrote a paper in 1965 that predicted that computing power would double about every two years. This prediction turned out to be very accurate and was later dubbed Moore’s law . The act of doubling is deceptively powerful. Let me give you a classic example. Let’s say I gave you a penny on January 1, and I offered to double its value every day for the rest of the month. So on January 2 you’d have two cents. A week later you’d have a whopping $0.64. Two weeks later you’d be up to $81.92. Now we’re getting somewhere, but it still doesn’t seem like much. By the third week, however, the value of your stash would be more than $10,000, and by the end of the month, after 31 days of doubling, you would have a whopping $11 million dollars! It’s hard to believe, I know, but that’s the power of doubling. Using Moore’s law , we can estimate that desktop computers today are more than 65,000 times as mighty as my little TI-99/4A.

I could write an entire book explaining how computers work (and maybe I will one day), but for the purposes of this book, I will cover only the essential terms and concepts, and I will discuss them at a very high level. Remember, you do not have to remember all of this; I’m just trying to get you familiar with some terms and concepts. Remember to check the glossary if you need a quick note to jog your memory.

Hardware and Software

You can’t talk about computers without talking about hardware and software . Hardware refers to the physical computer and all its parts—basically the things you can see and touch. This includes not only the computer itself but the things attached to it, as well. We generally refer to these attached devices as peripherals , and they include the mouse, keyboard, monitor, printer, disk drives, and so on.

Software is the computer code that runs on the hardware—the complex set of instructions telling the hardware what to do in every possible situation the programmer could think of, including (and often especially) when bad things happen. The prime piece of software that runs the computer itself is the operating system (OS). Software applications are smaller, more-focused pieces of software that are meant to do specific things, like help you edit a document, listen to music, or surf the Web. The operating system handles things like starting and stopping applications, managing the computer files, connecting you to the network, and directing the peripheral devices like your mouse, keyboard, monitor, and printer. The OS is a special software application that must be there for the computer to run, and the applications are optional add-ons that perform specific tasks.

For most PCs, the operating system is Microsoft Windows; for Apple’s Macintosh computers, the operating system is called macOS (previously “Mac OS X”). Examples of applications include Microsoft Word, Adobe Reader, Mozilla’s Firefox, and Apple’s iTunes.

You can sort of think of hardware and software in terms of a stage play. The hardware is the stage, set, curtains, lights, and sound system. The operating system is like a combination of a script, director, and stage crew—the OS controls the overall flow of the show, making sure the actors follow their lines and setting up the environment for the scene (props, costumes, lighting, sound effects). Finally, the actors are sort of like the applications , performing specific roles based on the script and using and sharing the props (data and peripherals) with other actors.

File Manager

When you’re sitting at your computer and you want to look at the list of files stored on it, you do this with the file manager . In Windows, this is called Windows Explorer or File Explorer; on a Mac, it’s called the Finder. When you double-click a folder on your desktop or open My Computer or Macintosh HD, you are launching the file manager. This may seem trivial, but some people aren’t used to giving it a name.

Bits and Bytes

All the data on your computer—every document, every image, and every movie and song—is stored as bits and bytes. By the way, this includes all of your software applications and even the operating system itself. Bits and bytes are the building blocks of digital things. A bit is the smallest piece of information on a computer. A bit has two possible values—zero or one. A bit, therefore, is binary. That’s just a fancy way of saying that it can have only two possible values (bi means “two”). We abbreviate bit with a lowercase b.

To build bigger things—like larger numbers, alphabet characters, and those fun little emoji icons—we need more than a single bit. The next largest chunk of digital data is the byte , which we abbreviate with a capital B. There are eight bits in a byte. So, since each of those eight bits has two possible values, if you do the math, there are 256 possible combinations of eight bits: 00000000, 00000001, 00000010, 00000011, and so on, till you get to 11111111.

We use various patterns of these bits to represent meaningful pieces of information—this is called encoding . There are a gazillion ways to encode things, mostly because you can do it any way you want as long as you spell it out clearly so that someone else knows how to decode it. One common example of this is ASCII encoding (pronounced “ass-key”). ASCII is used to encode characters—letters, numbers, and other symbols that you would find on a computer keyboard. For example, the letter A is encoded as 01000001 in ASCII. Because we want to tell the difference between uppercase and lowercase, we need an encoding for both. Lowercase a is encoded as 01100001. Basically, anything you can type using a computer keyboard has an ASCII representation. So, using ASCII mapping, “Hello” would look like this:

0100100001100101011011000110110001101111

The first eight bits, 01001000, represent the letter H, and so on. It looks ugly and complicated to you as a human, but this is the language that computers speak, day in and day out.

Now, let’s talk about bigger data. One byte might be enough to encode a single key from the keyboard, but that’s not terribly useful. To encode entire documents, images, movies, etc., we need lots of bytes . And just like we have shortcuts for large decimal numbers (thousand, million, billion, etc.), we have shortcuts to describe lots of bytes. These terms are based on standard metric prefixes: kilo- (thousand), mega- (million), giga- (billion), tera- (trillion), and so on. So, a kilobyte (KB) is a thousand bytes, a megabyte (MB) is a million bytes, and a terabyte (TB) is a trillion bytes.¹

Storage

For your computer to store and process all these bits and bytes, it needs somewhere to keep them. There are all sorts of sophisticated levels of storage space on your computer, but the two most important ones are random access memory (RAM) and the hard drive. When you buy your computer, these are two of the most prominent specs you’ll see on the box.

The best analogy I’ve heard for understanding how computer memory works is comparing your computer to a garage workshop. You garage holds all your tools and materials. You use the tools to work on the materials. You either make new things or modify old things. But you don’t simultaneously use every tool and all your materials—you generally use one at a time, or a few at most. When it’s time for you to work on a project, you fetch the specific tool and materials that you need for the moment and bring them to your workbench. When you’re done, you put the tool and remaining materials back. If you try to work on too many things at once, you’ll find that your workbench is full—you literally cannot use all your tools and materials at once; you have to choose. Maybe you can try to just move some to the floor nearby and swap them with something else when you need it again, but this is not very efficient.

Your computer works in a similar way. Your hard drive is like your garage—it holds all your data and all your apps, ready for you to use. When you launch a particular application or open a file, the data for the file and the application are copied from your hard drive into RAM where the computer can work with them directly and most efficiently. When you close the app, the data is copied back from RAM to the hard drive. That’s why the hard drive is so much larger than your RAM. The hard drive has to hold everything, whether you’re working on it right now or not; the RAM only needs to hold what you’re actively working on. The more RAM you have, the more you can work on at one time—like having a bigger workbench. When you fill up your RAM, your computer actually moves things back to the hard drive to make room, trying to swap things back when you need it. That’s why your computer becomes really sluggish if you try to open too many applications or big files at the same time.

If you have too much crap in your garage, you have two options: build a bigger garage or get some sort of external extra storage space, like a rented storage room or maybe a shed in the backyard. Similarly, if you want to work on more things at once, you need a bigger workbench. When this happens on your computer, you have similar choices. When you want to hold more data, you get a bigger internal hard drive or buy an extra external hard drive. When you want to work on more things at once, you need to buy more RAM .

Networks (Wired and Wireless)

When computers talk to one another, they do it over a network . Put another way, a network is a set of connected computers that are all communicating using a common computer communication language, or protocol. Most computer networks today use a communication language called the Internet Protocol (IP). Networks vary wildly in size but come in two primary forms: public and private. The public network we all know and love is the massive, globe-spanning network called the Internet (originally from the term Internetwork).

However, there are countless small, private (or semiprivate) networks that are attached to the Internet. These are usually referred to as local area networks (LANs). You might think of corporations being the primary owners of LANs, but if you have a router at home with more than one computer or device on it, then you have a LAN, too.

Networks can be wired or wireless or (often) both. Computers that are wired to the LAN use Ethernet cables (they look like phone cables with a wider plastic clip at each end). Computers that are wirelessly connected to the LAN use Wi-Fi.²

Figure 2-2 shows a picture of a standard Ethernet cable (wired connection) and the internationally recognized symbol for Wi-Fi (wireless connection).

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig2_HTML.jpg — Figure 2-2
Ethernet cable (left) and Wi-Fi logos (right)

You can’t talk about cybersecurity without understanding some basic things about how computers communicate. I’ve devoted a whole section to this topic later in this chapter.

Bandwidth

When we move data around, we like to know how fast it can go. We measure this in “bits per second,” usually megabits per second (Mbps) and sometimes gigabits per second (Gbps). This measurement is referred to as bandwidth . Most people run into this value in two places: when they’re buying Internet service from their Internet service provider (ISP) and when they’re looking at networking products like Wi-Fi routers.

The best way to think about bandwidth is moving water through a pipe. You can move more water, faster, through a bigger pipe. The Internet is often referred to as a series of pipes (or tubes), and that’s because of the water analogy. If you think about how water gets to your house from your town’s main water source and back through the sewer system, that’s similar to data moving on the Internet, at least in terms of speed (see the next section for the routing part). There are big pipes at the source that take lots of water to your neighborhood, then there are smaller (slower) pipes that come to your home, and finally there are much smaller pipes that take the water to your sink, shower, and so on.

It’s much the same with your Internet service. The connections within your house are the slowest (the smallest pipes). The connection from your house to the neighborhood junction box is a good bit faster. The connection from the junction box to your cable provider is faster yet, and the connection between your service provider and the next service provider is the fastest—the biggest “pipe.”

In the next chapter, I’ll talk about how the Internet actually moves these bits and bytes around.

Bluetooth

Bluetooth is the funny name of a handy wireless connection technology. Its main purpose is to remove the need for pesky wires. Originally this was used mainly to connect peripheral devices to your computer over short distances like mice and keyboards, but it’s gained a lot of popularity with sending and receiving audio; examples include hands-free car connections to your mobile phone, portable speakers, and, of course, wireless headphones. It is also used in some modern remote controls because it doesn’t require line-of-sight like the more common infrared (IR) remotes. Figure 2-3 shows the official Bluetooth symbol so you’ll recognize it when you see it.

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig3_HTML.jpg — Figure 2-3
Bluetooth logo

Clients and Servers

When two computers talk to each other, one of them is usually asking the other one to do something. The requester is called the client , and the responder is called the server . While it’s possible for one computer to be a client in some cases and a server in other cases, in today’s Internet age, almost all requests originate on your computer (client) and terminate at some big computer out on the public Internet (server). Example requests might include “show me the video with the cat playing piano” or “find all the web sites with pictures of Scarlett Johansson” or “post this totally cool photo of me for all my friends to see.”

The Cloud

This has to be one of the most overused (and abused) terms in modern computer lingo. It’s the latest industry buzzword, and it’s being thrown around like crazy. So, what is it? When someone refers to the cloud, all they really mean is that it exists or is happening somewhere out in the Internet.

When engineers draw diagrams of computer networks, they often use a picture of a cloud to represent some vague grouping of things “out there”... we can’t really see what’s in there and don’t really care. Things enter one part of this nebulous fog, and they emerge somewhere else. Sometimes they change in there, and sometimes they just come out the same. Sorta like the Internet, eh?

Today all of our computers and devices are connected via the Internet, and the connections have become so fast that we can actually send our data out there for processing instead of doing it all locally on your computer. When you send your data to the Internet for something out there to work on it, we call it cloud computing . An example of this would be Google Docs: instead of running Microsoft Word locally on your computer (modifying a file on your hard drive), you edit your document in a web browser, and all of this is happening on a faraway computer owned by Google. When you store copies of your data using a service like Dropbox or Google Drive, this is cloud storage —while the files exist on your local hard drive , they also exist on the Dropbox or Google server.

Net Neutrality

Unless you’ve been under a rock, you’ve heard the term net neutrality. You’ve also probably seen a lot of politicians and corporate spokespeople screaming about why it’s going to destroy the Internet and e-commerce…or why it will save it. So, which is it?

Net neutrality, as a concept, means that no person or company on the Internet should be allowed to have preferential treatment. The Internet should be a level playing field. When you define it like that, it sounds pretty good, right? So, let’s look at a real-world example and see how this definition holds.

Netflix is a popular movie-streaming service—so popular that by some estimates it accounts for over a third of all Internet traffic during peak hours! As you might expect, slinging all those bits around can seriously tax the servers on the Internet. If an ISP has to buy more equipment to handle all that demand, shouldn’t Netflix have to pay them some more money to help cover the costs?

It’s tempting to say yes, they should. But it’s a slippery slope. The bottom line is that Internet service providers are selling a service, and if they can’t deliver what they’ve promised, then they need to invest some money in their infrastructure. If that requires raising costs for the service, then that’s what they need to do, perhaps creating tiers of service so that only the people using the most bandwidth pay the most money. But we can’t be asking the companies providing popular services to foot the bill or it will turn into a money grab, and the only companies that will be allowed to play are the ones who can pay. The reason Netflix and other startups flourished in the first place was because they could compete on an even playing field. The little guys always want regulation to keep things fair because otherwise they could never compete with the deep-pocketed incumbent powerhouse corporations. But once you become a powerhouse corporation, you want to get rid of these regulations so you can tilt things in your favor and keep out the competition.

Net neutrality is not new. It’s what we’ve had all along, at least to this point. We need to preserve it so that the next Google, Netflix, or Facebook can come into being. Unfortunately, as of the writing of this book, the U.S. government has rolled back regulations that would have ensured a fair playing field.

The Internet of Things

The term Internet of Things (IoT) refers to the current push to add Internet connectivity to everyday devices: toys, appliances, cars, water meters, even light bulbs, and thermostats. This allows you to control these devices from your smart phone or computer, even from outside the house. Of course, this means that others may be able to control them, as well, if they can find a bug in the software to exploit.

These devices are almost always tied to some sort of cloud service. These devices are constantly “phoning home” to the manufacturer or third parties to provide those cool Internet-connected features. The Amazon Echo and Google Home products are often brought up when discussing IoT—small Internet-connected devices that respond when you say the phrase “Alexa” (this is referred to as the wake word). You can ask for the weather forecast, get the latest news, order products from Amazon, and even control other IoT devices in your house. However, this also raises some obvious questions about privacy. I’ll cover that later in the book.

Know Your Enemy

I’m willing to bet that you’ve heard most of those computer terms already, even if you didn’t quite know exactly what they meant. But most people are a lot less familiar with the terminology of security. I will cover many of these terms in greater depth in future chapters, but I want to quickly define a handful of terms up front so I can at least make some passing references to them without losing you.

Malware

Malware is short for “malicious software,” and it’s a catchall term for software behaving badly. Much of this book is dedicated to helping you avoid malware. Protecting your computer and avoiding infections protects not just you but every other person you know (or at least all the people in your computer’s address book).

As you might imagine, there are many different varieties of malware out there, and it’s actually common for a single piece of malware to have multiple components/behaviors. The following are the most common types of malware out there today.

Virus / Worm : This is the term that most people have heard, and it’s probably the first thing that popped into your mind when you read the definition of malware. A virus is a piece of malware is spread through infected files, usually requiring a user to open it to become infected. A worm is a special kind of virus that can replicate itself and travel autonomously to other computers over the network by exploiting bugs in the services that computers often make available remotely. Like a human virus, computer viruses tend to spread to the people you know and come in contact with. The virus can use your e-mail address book, browsing history, and other information on your computer to figure out who to infect next.

Trojan : A trojan is malware disguised or hidden inside “legitimate” software. The name comes, of course, from the classic tale of the Trojan horse that the Greeks used to sack the city of Troy. The Greeks hid a select group of troops inside a large, hollow wooden horse statue (Figure 2-4). They left the horse at the gates of the city as a trophy and then sailed away in apparent defeat. The people of Troy brought the horse into the city gates, and when night fell, the troops came out of the statue and opened the gates from the inside. The Greek army, which had returned under cover of night, was then able to enter the city and take it.

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig4_HTML.jpg — Figure 2-4
Classic image of the famed Trojan horse3

A trojan application or document works much the same way. You’re presented with what appears to be a legitimate application or document, but the malware is hidden inside it. When you launch the app or open the document, the malware program runs and infects your computer.

Spyware : This is malware that is designed to spy on you—either to watch what you do or to look for sensitive information on your computer—and then relay what it learns to the bad guys. This might include things like account passwords, your contacts, or personal information that would allow the watcher to impersonate you (identity theft).

Scareware : Sometimes the bad guys take the direct approach to getting your money. This type of malware might surreptitiously do something bad to your computer and then offer to fix it (for a fee). Sometimes this type of malware disguises itself as a free antivirus scanner or something, and in reality this software is causing the problem and then offering to undo the damage. In other cases, the malware watches your online habits, waiting for you to do something potentially embarrassing, such as visiting porn sites or illegal download sites. At this point they pop up some sort of fake law enforcement warning, threatening to expose you unless you pay a fine.

Ransomware : With the rise of digital currencies like Bitcoin, we’ve seen an explosion in a specific type of malware called ransomware. Crooks infect your machine with software that effectively steals your data and holds it for ransom. The malware surreptitiously encrypts your hard drives, making the data completely unreadable—but also completely recoverable with a simple decryption key. It would be like crooks sneaking into your house, putting all your valuables in a vault (in your house), and then leaving you a note that says “If you want the combination, you must deposit $10,000 in my Swiss bank account.” You technically still have all of your stuff... you just can’t use it. Clever, eh? If you pay the fee, they give you the keys to decrypt and recover your data.

Note

I’ll talk in Chapter 5 about how you can encrypt your hard drive to prevent someone from accessing your files if they were to steal your laptop or hard drive. However, encrypting your hard drive will not prevent ransomware... there’s nothing preventing a file from being encrypted twice!

Potentially Unwanted Program (PUP) : This is probably the most innocuous form of malware, though it can still be quite annoying and frustrating. This sort of malware commonly accompanies “free” software you download from the Web—you download the free app, but one or more PUPs come along for the ride—and the vendor makes money from the third-party vendors of the unwanted programs for installing their stuff on your computer. Most of these installers will warn you about the hitchhiker apps, if you dig in deep enough, but they don’t make it easy to find. Assuming you find the installer configuration that shows you the extra software, you can usually deselect the additional, unwanted crap.

A more insidious type of PUP is called adware. Once installed, adware will pop up advertisements in some form or another. For every ad shown or clicked, a referral is paid, which is how they make their money. Sometimes adware will also collect information from your computer, as well, to be sold to marketers.

Bot : A bot (short for “robot”) is a piece of software that automates something that a human would normally do. Bots are not always bad. For example, bots are used by search engines like Google and Bing to “crawl the Web” looking for new and interesting web sites to provide in their search results. Other bots might be used for slightly less benevolent things like beating real humans at online games, registering for online contests, or creating bogus e-mail accounts from which to send junk e-mail (or spam).

That’s why many web sites force you to look at a seriously distorted picture of some letters and numbers and then ask you to enter what you see. (Those annoying tests are called CAPTCHAs.⁴ See Figure 2-5.)

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig5_HTML.jpg — Figure 2-5
Example of a web CAPTCHA form

They’re trying to prevent bots from automatically doing whatever it is that you’re trying to do. They specifically create problems that they believe are easy for humans to solve but very hard for computer programs.

However, in the context of malware, bots can be used to do things on your behalf that you would not normally do—like send e-mails to everyone you know advertising pills for male genital enlargement. Some bad guys actually try to build up large networks of infected computers to form a botnet . A botnet is essentially an army of remote-controlled computers, all pretending to be the owner of the computer but actually working for someone else. Worse yet, you may never even know that computer is doing this. One common use for botnets is to simultaneously slam some specific web site with requests in an attempt to bring their servers down (called a distributed denial of service, or DDoS, attack ). They do this because either they don’t like the owner of the web site or maybe they’re trying to extort protection money from them.

Even worse, all of our wonderful new Internet of Things devices are also computers. Any computer connected to the Web is subject to hacker attacks—and because these devices are usually very cheaply made, their security is bad or nonexistent. IoT devices are favorites for creating botnets.

Cryptomining : With the meteoric rise of “cryptocurrencies” like Bitcoin, there’s been a sort of digital gold rush to “mine” it. Bitcoins are mined by performing some very hard, very taxing computer operations—and as time goes on, these digital currencies are designed to be harder and harder to mine to control the supply. It didn’t take long for the bad guys to turn their botnets into massive, networked mining rigs. Thankfully, the only real impact on you and your devices is a higher electric bill. Cranking up your computer’s brain to work on these complex operations requires significant power.

Rootkit : This is a particularly nasty bit of malware that often attacks the operating system. It’s specifically designed to hide from you and from malware detection software like antivirus programs. Sometimes the rootkit actually disables or breaks your security software, preventing it from working. To make matters worse, rootkits often find ways to hide copies of their code so that even if you do find the running copy and remove it, it just replaces itself and runs again. Rootkits are often used as a sort of beachhead on your computer, a safe landing point for other invaders. The techniques in this book will help to greatly reduce your risk for this (and all) malware.

Hardware Bugs

In recent years, a new class of threat has become popular: hardware hacking. Instead of finding bugs in the operating system, web browser, or other applications, hackers have managed to cleverly exploit vulnerabilities in the computer’s processing chips, called the central processing unit (CPU) .

Two bugs that garnered a lot of media attention (and rightly so) were dubbed Spectre and Meltdown. In an effort to eke every last ounce of processing power out of our computer chips, CPU manufacturers AMD and Intel devised techniques that allowed the CPU to precalculate various possible outcomes ahead of time and, when the particular choice of course was made, have the answer ready and waiting. This involved prefetching all the necessary data to make these decisions. It turns out that this mechanism wasn’t properly locked down and a rogue application could actually peek at this data without proper permission, so this information leaked. And if that information contained private data, it could be harvested and used for nefarious purposes.

If that sounds complicated, that’s because it is. These vulnerabilities and the hacks required to exploit them were devilishly crafty. But because AMD and Intel make the brains for just about every computer on the planet, it meant that just about everyone was vulnerable—Mac, PC, and even Linux (which is the dominant operating system for most big-time computer servers on the Internet). In this case, it was actually possible to patch the bugs with low-level software changes. But the day will surely come when a software fix won’t cut it, and in that case, your only choice will be to throw your computer out and get a new one. Computer chips take many months to design, manufacture, and integrate into computer systems, so if this type of bug were to be found, it could be a long time before new computers would be available for purchase.

Exploit

Exploit is the general term for some sort of chink in your cybersecurity armor—a vulnerability that will allow bad things in. Exploits are usually software bugs in your operating system or an application like a web browser, PDF reader, or browser plugin. A bug is just a mistake in the code—some software engineer (like me) screwed up and missed something. It’s actually easy to do, even for seasoned software writers, which is part of the reason we have so many problems with malware these days.

A zero-day exploit is one that was previously unknown to the general public, in particular the creator of the software containing the vulnerability. Essentially it means that the bad guys found it before the good guys did. It’s important to note that many bugs go unnoticed for years before they’re discovered. The vulnerability dubbed Shellshock, announced in September 2014, is one of the worst zero-day bugs ever found—and it was lurking in the software since 1989!

How the Internet Works

The Internet is a fascinating piece of technology (actually many different technologies, playing together more or less nicely). Entire books are devoted to explaining this stuff, but I want to focus here on how the data moves—because it’s important to understand these concepts before talking about the things we do to protect the data as it moves.

When you surf the Web, send an e-mail, upload a video to YouTube, or stream a movie on Netflix, you are transferring data between your computer and some other entity on the Internet—often a big server owned by the likes of Amazon, Google, or Netflix, but sometimes another computer like yours (if you’re doing a Skype session with a friend, for example). Data can travel in two directions, and we like to label these directions so we can talk about it. When data goes toward the Internet, we call that direction up; when data moves toward your computer, we call that direction down. So, when you are pushing a video file to YouTube, we call that uploading; when you are buying a song from iTunes to play on your computer, you are downloading . If you think of the Internet as “the cloud,” then you can remember this by thinking that clouds are up.

But how does that really work? Glad you asked! It’s actually really cool. As I said earlier, data is made up of bits and bytes . If you make a perfect copy of the bits, then you make a perfect copy of the data. To get data from one computer to another computer, you just have to find some way to reproduce the same bits, in the same order, on the other person’s computer. Let’s say you want to send someone a Microsoft Word file. You don’t physically send your bits to someone else—you send copies of the bits. When you’re done, the file exists on both your computer and on their computer, and the two files are 100 percent identical.

So, how do we do that exactly? Again, it would take an entire book to tell you the real deal here, so we’re going to simplify things considerably. The Internet works a lot like the U.S. Postal Service’s first-class mail system. The USPS is in charge of getting letters from point A to point B, and it does so using a vast array of methods and technologies—most of which you never see or know about. All you know is that if you put your letter in the mailbox, with the recipient’s address on it and a stamp, they’ll take care of the rest! If you want that other person to send something back, then you need to also include a valid return address. As far as you know, it’s just magic—it somehow leaves your mailbox and ends up in the other person’s mailbox, right?

Well, let’s take a look at what really happens behind the scenes after you put that letter in the mailbox. Every so often, usually once a day, a mail carrier comes by to check your mailbox. If the mail carrier finds a letter in the mailbox, they take it and throw it in their pile of outgoing mail. At the end of the day, the mail carrier takes the pile of outgoing mail to a sorting facility that looks at all the “to” addresses and starts separating the mail into bins based on destination. Local mail may go right back out the next day with the same mail carrier, whereas mail to another state will likely be put on a truck or plane and sent to another post office for local delivery at the far end. They may split that pile up and send it different ways, even to the same destination. How they split this up might depend on how full the truck or plane is, on whether a particular truck or plane is out of commission, or even on the fluctuating costs of fuel. The route taken and the method of transport will also vary based on what type of service the user paid for (media mail, first-class mail, or special overnight delivery). There may be multiple stops along the way and multiple methods of transport. Again, as the sender, you don’t know or care how it gets there—only that it does get there, and in a reasonable amount of time.

Internet routing works very much the same way. Every computer on the Internet has a unique address. It’s called an Internet Protocol address, or IP address for short. An IP address is set of four numbers separated by periods like 74.125.228.46 or 72.21.211.176.

Fun Fact

Each one of those four numbers in an IP address ranges from 0 to 256. So if you add up all the possibilities, that’s more than 4 billion possible unique addresses…and, believe it or not, we’re already running out! There’s an expanded addressing scheme that is slowly being rolled out that will save the day called IPv6 (which stands for Internet Protocol version 6, which will replace the current IPv4) . IPv6 has a truly stupefying number of unique addresses.

340,282,366,920,938,463,463,374,607,431,768,211,456

Yeah, that’s one number. I’ve heard it said that every single atom on the planet Earth could have its own IPv6 address... and there would still be enough left over for 100 more Earths!

To make those numbers easier for humans to remember, we give them aliases like “google.com” or “amazon.com”—we call those aliases domain names or hostnames . When your web browser or e-mail application is told to use one of these aliases, your computer invokes a service called the Domain Name Service (DNS) to convert the alias to an IP address.

So, when you tell your computer to send some data across the Internet, it packages that data up, slaps on the destination IP address, and puts the data in a sort of digital mailbox, waiting to be picked up. Your return IP address is automatically added so that the computer at the other end can send you a reply. Once the data is picked up for delivery, it’s dispatched through a series of special computers called routers that figure out how to get your info from here to there. The exact route it takes may be different each time, based on how busy the routers are, the costs charged by the various carriers, and any special handling that your computer may have requested. More on that in a minute.

Now, as fancy as the Internet is, it doesn’t have the notion of shipping large things in a single package—like shipping a big box full of stuff at the post office. Basically, everything is letter-sized. Why is that? Well, when you’re dealing with bits and bytes , you can deconstruct anything into small pieces and reconstruct them perfectly at the other end! (It’s sorta like that scene in Willy Wonka where Mike Teavee is transported across the room.) Since the individual pieces may take different routes, they may actually arrive out of order at the far end. So, how do we fix that problem? As you chop up the big thing into little pieces, you give them a number, counting from zero to whatever. At the other end, they can use the numbers to put everything in the proper order when reassembling the data.

To think about this, let’s go back to our analogy of the post office. Let’s say you want to ship a copy of the Oxford Dictionary to someone across the country, but you can ship them only one page at a time. How would you do that? Well, first you’d have to copy all the pages and stuff each one into its own envelope. Every one of those envelopes would need to have the recipient’s address on it. Instead of putting a stamp on each one, you just pay the USPS some sort of bulk mail rate and they take care of that for you. The pages are already numbered, so if they arrive out of order, your friend will be able to put them back in the proper order.

However, now you realize that your mailbox isn’t actually big enough to hold all of these letters at once. Now what? Well, you just have to do this in batches, maybe 100 letters at a time. Once the mail carrier empties your mailbox, you can put in 100 more. The person at the other end opens all the envelopes and reassembles them in order (using the page numbers). When the last letter arrives, they will have the complete book!

Now, the U.S. mail service (like the Internet) isn’t perfect…sometimes letters get lost. So, how do you handle that? Well, you ask the person at the other end to send a letter back every so often telling you what they’ve received so far. If they find that some pages are missing, they can ask you to resend them.

That—in a nutshell—is exactly how data is sent across the Internet—but instead of taking weeks, it takes fractions of a second! Your computer breaks the file or e-mail or whatever into small chunks called packets . Each packet is labeled with a “from” address and a “to” IP address. Those packets are then numbered so that the computer on the receiving end will a) know in what order to reassemble them and b) know if any packets are missing and need to be re-sent. See, it’s not that hard!

Why do you need to know all of that? I’ll explain in later chapters. We’ll revisit this analogy when we talk about privacy, authentication, and encryption.

Tools of the Trade

Now that we’ve reviewed the basics of computers and we’ve learned the fundamental operation of the Internet, it’s time to turn to the really cool stuff! Okay, I will stipulate here and now that normal people may not share my deep-seated fascination for the mechanics of cybersecurity. But I’m going to do my best to win you over. One of the important takeaways from this book is that you can trust the math. The media and pop culture like to make it look easy to hack into secure systems, cracking passwords and breaking encryption with ease. In reality, modern computer security algorithms are extremely robust and nearly impossible to crack, if implemented properly (which turns out to be a crucial “if”). Despite the billions of dollars spent on supercomputers at places like the NSA and GCHQ,⁵ the tools that cryptographers have created hold up surprisingly (and thankfully) well. In this chapter, we’re going to discuss the processes and tools that allow us to shop and bank online, as well as communicate privately and keep our computer data from being read by others.

Encryption and Cryptanalysis

What problems are we trying to solve here? Why do we need cryptographic tools in the first place? Let’s consider the classic case of a personal diary. I would like to be able to write some thoughts down and save them in such a way that only I will ever be able to read them. You could just hide this journal somewhere and hope no one finds it. This is a form of security through obscurity , something that is secure only because its existence isn’t known. In the realm of cryptography, this is never sufficient, at least not as the primary means of security. No, what you really want is some sort of mechanism by which we can render the words completely incomprehensible by a third party, even if they steal the journal—and yet somehow be fully readable by you.

Security Through Obscurity Fail

Security through obscurity is used way more often in the real world than it should be. For anything important, it should simply never be used. Here’s a real-life example of how the implementation of security is just as important as the underlying technical aspects. A tech startup company debuted a new Wi-Fi-controlled “smart” light bulb. This would be part of the Internet of Things: connecting everything to the Internet so that users can control them remotely using their smartphone or computer. The idea was that once you set up your first bulb (where you must give it your Wi-Fi credentials so it can connect to your home network), any added bulbs would talk to the “master” light bulb and automatically connect to your Wi-Fi network. How convenient! These amazing little devices had a private “mesh” network that allowed them to talk to one another. Since wireless communications are broadcast over the airwaves, hackers can easily eavesdrop on the signal. Well, the engineers at this company understood this, so they used “military-grade” encryption to prevent eavesdroppers from tapping into these communications. Unfortunately, the encryption key they used—the lynchpin for any successful encryption—was a fixed, hard-coded value! That is, there was only one key for all of their products. But they hid this key away inside the memory on their light bulbs…who would ever figure out how to extract it? Hackers, of course. They were able to take one of these light bulbs, hook up some wires to the computer chip inside, and pull out the key. All a hacker needs to do is come close enough to your house to “hear” the light bulbs talking to each other, and they can decrypt the transmissions and see your household Wi-Fi login ID and password. The company counted on no one bothering to poke around in their devices to find the encryption key—security by obscurity. Once this key is made public (i.e., no longer obscure), the jig is up for all customers who bought these bulbs. The company updated its software with a fix, but it still requires the customers to get around to updating their light bulbs to remove the vulnerability... which is not very likely.

As you may have guessed, what we need here is encryption . We need to somehow transform the words we wrote into some other representation that appears to be pure gibberish to someone else. In cryptography, the original text is referred to as the plain text , and the garbled output is called the cipher text . We call it that because to convert the plain text to cipher text, we run the plain text through a cipher . A cipher is a secret code or algorithm that maps letters in the plain text to letters, numbers, or symbols in the cipher text. These types of cyphers are referred to as substitution cyphers because you substitute each letter of the regular alphabet with one character from a cypher “alphabet.”

This is the classic “secret decoder ring” from bygone years. If you’re not familiar with the secret decoder ring concept, it’s basically two wheels, side by side, connected together on a common hub. Each wheel has a sequence inscribed on its edge so that if you rotate the wheels, you can align the letters or numbers in multiple ways. When the letter sequence is the same on both wheels, we call this a rotational cipher . Let’s say that each wheel just has the alphabet in the proper order so that if you align the letter A on one wheel with the letter A on the other wheel, then they are in perfect alignment all the way around... B on the first wheel maps to B on the second wheel, all the way to Z mapping to Z. That’s a crappy cipher, so let’s give the wheel a spin... let’s say now that A maps to N. That would mean that B would map to O, C would map to P, and so on. Because the wheels are round, when you get to Z, the next letter is A, and it continues. This is actually a special rotational cipher referred to as ROT13 , which is short for “rotate 13 places”—because the letter N is 13 places away from the letter A in the alphabet. Since the English alphabet contains 26 characters, if you run ROT13 encoding on something twice, you end up back where you started (because 13 + 13 = 26).

Figure 2-6 shows an example of this. You can see the entire alphabet mapping and also see how the word Hello would be encoded using this scheme.

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig6_HTML.jpg — Figure 2-6
Alphabet character mappings for the ROT13 rotational cipher

My favorite pop culture example of the secret decoder ring comes from the classic movie A Christmas Story. The primary character, Ralphie Parker, has just received his Little Orphan Annie secret decoder pin in the mail, and now he’s part of the Little Orphan Annie “Secret Circle.” At the end of each radio broadcast of the classic adventure series, announcer Pierre Andre would read off a secret, ciphered message that only kids with the decoder could interpret. In this case, the cipher mapped letters to numbers, so one wheel had the alphabet while the other wheel had numbers. (See Figure 2-7.⁶)

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig7_HTML.jpg — Figure 2-7
Little Orphan Annie decoder pin from A Christmas Story

To decode the message, you needed to know how to align the two wheels—and Pierre informed his listeners to set the dials to B-2. In other words, the letter B mapped to the number 2. This is the simplest conversion possible because B is the second letter of the alphabet. Therefore, A is 1 and Z is 26. Ralphie’s first message as part of the Secret Circle was as follows⁷:

2-5-19-21-18-5-20-15-4-18-9-14-11-25-15-21-18-15-22-1-12-20-9-14-5

Using his decoder pin, Ralphie painstakingly decoded the message. We know that 2 maps to B, so the first letter is B. The fifth letter of the alphabet is E, so that’s the second letter of the message. The 19th letter is S, and so on, until we ultimately get this:

B-E-S-U-R-E-T-O-D-R-I-N-K-Y-O-U-R-O-V-A-L-T-I-N-E

At this point, Ralphie slowly reads the message, figuring out where the word boundaries are... “Be sure… to drink… your… Ovaltine. Ovaltine? A crummy commercial?”

Now let’s flip this whole process on its head. Let’s play the bad guy now. You have intercepted this secret message, and you want to know what it says. How would you go about trying to break the cipher? The process of trying to break a cipher is called cryptanalysis. Modern cryptography is performed by computers and is essentially beyond the capability of a mere mortal to break unassisted, but in the olden days, ciphers were performed by hand and broken by hand.

So, let’s begin our cryptanalysis by taking a close look at the cipher text. First, we need to make some assumptions. We will assume that the plain text is in English, and we will also assume that each symbol in the cipher text maps to a single letter in the English alphabet. Those assumptions may be wrong, but one of the most powerful tools in a code breaker’s arsenal is to know as much about the input as possible. Having prior knowledge, even if it’s just a good guess, can significantly reduce the effort required to decrypt a piece of cipher text.

Next, we need to study the cipher text and look for patterns or clues. We see numbers ranging from 1 to 25. Given that the English language has 26 characters, we will assume that this cipher is a simple substitution cipher that maps letters (A to Z) to numbers (1 to 26).

Here’s where things get interesting. Since we’ve assumed that the plain text is in English, then we can use our knowledge of the English language to help us crack the cipher. It turns out that the most used character in the English language (by a healthy margin) is the letter E. How do we know that? Well, most people would have no reason to know this, but it’s a code breaker’s job to know these sorts of statistics. So if you were to take a rather large block of cipher text that came from an English plain text, you could be pretty sure that the most common symbol would map to the letter E. The next three most common letters in the English language, in order, are T, A, and O. Counting the use of symbols in cipher text and trying to map them to common letters is called frequency analysis —using the frequency of the symbol in the cipher text (that is, how often that symbol appears, relative to other symbols) to guess the associated English character in the plain text.

Our cipher text is pretty short here, but let’s give frequency analysis a shot. If you count the number of times each number is used here, you will see that three numbers are used the most: 5, 15, and 18. Each one of these numbers appears three times. So, let’s assume that one of these symbols maps to the letter E. We will try each one to see whether it works. Let’s first guess that the number 15 maps to E. Now we’re going to make another assumption, just to see if we can get lucky: let’s hope that whoever created this cipher wasn’t trying very hard and they just mapped the letters to numbers in order. So, if E=15, then F=16, G=17, and so on. Using that mapping, the code breaker substitutes all the letters and gets the following guess at the original plain text:

R-U-I-K-H-U-J-E-T-H-Y-D-A-O-E-K-H-E-L-Q-B-J-Y-D-U

Hmm. That doesn’t look right. So, at least one of our assumptions was wrong. But since we’re lazy, let’s stick with the assumptions we have and just try another mapping. After all, there were three symbols that all showed up three times in our cipher text, so just because 15 didn’t map to E, maybe 5 or 18 will do the trick.

If we try E=5, keeping our other assumptions, we get the following:

B-E-S-U-R-E-T-O-D-R-I-N-K-Y-O-U-R-O-V-A-L-T-I-N-E

Bingo! We’ve just decrypted the message! Woo-hoo! If the cipher text was helpful enough to contain spaces, marking the boundaries of words, we could have used other statistics to help us decipher the message. The most common single-letter words in the English language are I and a. The most common three-letter word is the. The most common first letter of a word in English is the letter S. So, just knowing where the word boundaries are would have given us a lot more information that we could exploit to guess the mapping and break the code.

Obviously, cryptanalysis in the modern world is a lot more complicated—today we use computers and much more complicated algorithms to encrypt our data. But the previous exercise gives a glimpse into how an attacker might try to break a code. One of the most famous acts of code breaking occurred during World War II where Englishman Alan Turing , with his colleagues at Bletchley Park, cracked the Enigma Machine ciphers (Figure 2-8) used by Nazi Germany (improving on techniques originally pioneered by the Polish). However, code makers and code breakers have played significant roles throughout history, dating back to at least 1500BC in Mesopotamia.⁸

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig8_HTML.jpg — Figure 2-8
German WWII Enigma machine (photo by Karsten Sperling)

Modern Cryptography

Modern ciphers revolve around the notion of a secret key. You take your plain text, plus some sort of text key (much like a password), and run it through an encryption algorithm. The result looks like complete gibberish. Unlike our simple substitution cipher previously, the output may not have a direct character-for-character mapping—it might well not even be representable as English characters at all. But, if you take the same secret key and apply the algorithm in reverse (the decryption algorithm), you get the original plain text back. This sort of cipher is called a symmetric cipher because the same key is used to both encrypt and decrypt (see Figure 2-9).

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig9_HTML.png — Figure 2-9
Symmetric encryption flow diagram

It’s crucial to note here that the encryption and decryption algorithms themselves are public and well known. That is, the process for mangling and unmangling (demangling?) the text is not secret. In the past, people have argued that we would be more secure if the algorithms themselves were secret. That sort of sounds intuitively correct—like a magician not telling how his tricks are done. But that’s exactly the point—it’s not really magic if there’s a trick to it. If I could break your encryption just by knowing how you did it, then the entire process is dependent on that secret never being leaked or discovered—not just for one person but for everyone who ever used that algorithm—and therefore, every message or file ever encoded with that algorithm. In today’s world, that’s just not good enough. Any encryption scheme worth using must be secure even if the detailed mechanics of the scheme are well known. This also allows the algorithm to be exhaustively tested and scrutinized by crypto experts.

If you’re just using a cipher to encrypt personal data, like your diary, the symmetric cipher works quite well. You choose a password, commit it to memory, and use it as the secret key to encrypt the text of your diary. No one knows the key but you, and as long as you picked a strong password, that diary is completely unreadable by anyone. But again, you have to notice the caveat I slipped in there…you must choose a strong password for this to work. We’ll discuss this in depth in a later chapter, but for now just realize that computers are so powerful that they can guess millions or even billions of passwords per second in an attempt to crack your diary open—so choosing your granddaughter’s name or favorite football team is not going to cut it.

You can think of symmetric encryption as a trunk with a strong lock—you have the key, and only you can open the lock. It doesn’t matter if people have the trunk in their possession—unless they have the key, they can’t access the diary kept inside. It shouldn’t even matter if they know how the locking mechanism was made. For this analogy to compare well to modern cryptography, we must assert that the trunk would be completely impenetrable, and the lock would be effectively impossible to pick. The crucial element then is the key for the lock. As long as you have the key in your possession and there are no other copies of the key, then only you can open the lock, and therefore only you can access the contents.

Having the ability to lock up your diary is a good thing, but there are other situations in the real world that require security. In cryptographic parlance, we’ve covered the case of data at rest , but what about data in motion ? That is, what about communications? In this new scenario, we’ve crucially increased the number of people who need to be able to access the information from one to more than one. We’ve also added the requirement that we need to be able to send this data across some distance, in the hands of third parties that we can’t necessarily trust. To use a symmetric cipher for this purpose, you must now share your secret key with the intended recipient. But how will you do that securely? You can’t just send it to them—someone might intercept it along the way. You might tell them over the phone, but what if your phone or their phone is being tapped? You would have to meet them in person and whisper it in their ear. And yet, that’s not always practical or even possible.

Thankfully, some really sharp researchers devised a radical new approach to encryption in the mid-1970s that revolutionized secure communications and encryption in general. They essentially invented a lock with two keys: a public key that is used to close the lock and a private key that opens the lock. This is referred to as asymmetric encryption because different keys are used to encrypt and decrypt. In this scheme, copies of the public key are given away freely and indiscriminately, while only the owner holds the private key. Anyone wanting to send a message will use the recipient’s public key to lock the message, and the recipient will use their private key to unlock it. What a brilliant and elegant solution! With this technique, it is now possible for two people to communicate securely without ever having to physically exchange a single, shared key.

You can think of it like this. Let’s say Alice wants to send some confidential documents to a colleague across the country, Bob.⁹ While Alice could get on a plane and hand them off in person to Bob, that’s way too expensive and inconvenient—she wants to just ship them via U.S. mail. Alice has a nice little box that she can secure with a lock, but she can’t just send Bob a copy of the key in the mail—after all, anyone along the way could make a copy. Being a smart woman, she knows that she doesn’t have to rely on a symmetric locking mechanism—she can use an asymmetric method. To accomplish this, Alice tells Bob to send her an open padlock. She then takes Bob’s padlock and fastens it on the box. She didn’t need Bob’s key to close it; anyone can close the lock simply by clicking it shut. However, only Bob can open the padlock because only he has the key. If Alice was really smart, she could even include one of her own padlocks with the box when she ships it so that Bob could return the documents to her using the same technique.

So, how does all of this apply to you and your computer? As we discussed in an earlier part of this chapter, whenever we send an e-mail, watch a video on YouTube, shop at Amazon.com, pay bills using our online bank account, or just surf the Web, we’re sending data to and receiving data from the Internet. That data is chopped up into bite-size pieces called packets , and those packets make their way from your computer to some other computer (or vice versa). Along the way, those packets go through potentially dozens of other computers, routers, switches, and servers. The key thing to remember here is that those digital missives are a lot more like postcards than letters. That is, any person (or computer) along the way who takes the time to look at them can see whatever they contain.

For some of what we do on the Internet, that may be acceptable, but obviously there are situations—like banking and shopping—that will require a secure connection. The communication protocol used by the Internet is HyperText Transfer Protocol (HTTP) . You’ve seen this many times when looking at web addresses like http://yahoo.com /. If you are observant, you may have noticed that sometimes it’s not http but https. That extra S stands for “secure.” When your web browser is communicating via HTTPS, you should see some sort of indicator like a padlock icon that indicates that your communications with that web site are secure. That means that no matter how all your packets are bouncing around the Internet, only the computer at the far end can actually decipher them.

Authentication and Message Integrity

Hurray! We have asymmetric encryption! Problem solved, right? Not quite. It’s not sufficient just to prevent someone from seeing the contents of your communication. Let’s go back to our document shipping analogy with Alice and Bob to see why. How did Alice know that she was actually shipping the documents to Bob? If I were a crafty spy, say Mallory, I could try to insert myself between Alice and Bob—pretending to each of them to be the other party. This is called a man-in-the-middle attack . The crucial step here is to intercept communications between Alice and Bob. Let’s say that Mallory infiltrates the U.S. mail service so that she has access to all the correspondence between Alice and Bob. When Bob sends one of his open padlocks to Alice, Mallory intercepts the package and replaces Bob’s open lock with her own open lock. She doesn’t have Bob’s key, but she also doesn’t need it. When Alice sends the classified documents to Bob, she has unwittingly locked the box with Mallory’s padlock, not Bob’s. Mallory intercepts the package and opens the box using her key. If she wants to be really sneaky, she can simply copy the contents and relock the box with Bob’s lock (which she saved). Bob receives the package, unlocks the box using his key, and wrongly assumes that the contents have not been viewed by anyone else.

As a man (or in this case, woman) in the middle, Mallory can create all sorts of mischief. For example, not only could she copy the contents of the messages and packages, she could change them. Let’s say Bob asks Alice to send him $10,000 in cash. Since Mallory has intercepted this communication, she could alter the message to say $20,000 instead. When Alice sends the $20,000, Mallory can remove half the money and send the remaining $10,000 to Bob, who will receive exactly what he asked for and no one would be the wiser. Better yet, Mallory could replace the cash with counterfeit bills and pocket everything!

So, how do we solve this problem? We now need to come up with some way for Alice to convince Bob, beyond a shadow of a doubt, that not only was she the only person who could have sent this package but also that the contents of the package have in no way been altered. To solve this problem, we need to not only use our public and private keys, but we need to introduce a new tool called a cryptographic hashing function . (I know... it sounds super technical... just hang with me on this.)

First of all, for simplicity, let’s just talk about sending a document from Alice to Bob. Alice has a set of pages that she needs to send to Bob securely, with complete confidence that no one has altered or intercepted them along the way. What sort of tool would we need here?

Let’s say there exists a document X-ray machine that can effectively merge multiple pages into one. It stacks up all the pages and shoots a magic ray through them all, capturing what is essentially a shadow of all the text on the pages, merged into a single composite “X-ray” image. We’ll call this machine a Dox-Ray machine !

To see how this marvelous Dox-Ray machine works, let’s use a simple example. Figure 2-10 shows five sentences of about the same length.

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig10_HTML.png — Figure 2-10
Five phrases

Now let’s stack these five phrases on top of each other (Figure 2-11).

../images/466102_3_En_2_Chapter/466102_3_En_2_Fig11_HTML.png — Figure 2-11
Five phrases stacked on top of one another

This is the Dox-Ray image of all five phrases, the merging of all the lines, one on top of the other.

This Dox-Ray image has some very interesting properties. First, it’s completely unique and predictable, based on the input. The one and only way to get this exact X-ray is to use these exact same words, in the exact same order. If you changed, added, or deleted even a single character, the X-ray image would also change.

Second, there is no way for you to get the original five phrases back from this Dox-Ray image. This is a “lossy” conversion: information is lost in creating it. It’s therefore a one-way conversion. Given the input (the five phrases), you can always reproduce the output (the X-ray image); but given the output, you cannot get back the original input.

Finally, this process produces something small and a fixed size. If you imagine that we run this X-ray process on a set of pages of a book, the output will be the size of a single page, regardless of how many pages are in the original book. You’re basically squashing the entire tome into a single sheet of paper.

So, our Dox-Ray machine can take a set of document pages and produce a unique image of all those pages—a single page that represents dozens or even hundreds of input pages.

Now let’s give this Dox-Ray machine to Alice and Bob! With this nifty device, Alice could take an X-ray of her document and include that image with the package. When Bob receives the package, he could use his own Dox-Ray machine to create an image of the document he received. He can then compare his image with the one sent by Alice. If they’re identical, then Bob can be assured that the document hasn’t been altered in any way. If Mallory attempted to alter the document in any way—removed a page, used Wite-Out correction fluid, replaced a page with different text, or tried to add a new page—the Dox-Ray machine would produce a different image. So, even if Mallory could intercept and inspect the package, she can’t alter the document in any way without being detected.

This is essentially how a cryptographic hashing function works. You put in a bunch of text and out comes a short value. If you alter just one letter of that input text, the value changes completely. It’s also effectively impossible to find another set of words that will end up with the same hashing result.

Let’s test that theory, just to prove the point. Using a popular crypto hash function called SHA-256, here is the hash output using this entire 300+ page book as input:

4a939d318741a70f67af9e3fbb28d5f2ebeccc799c0ee6bf0de59628bbcc7178

Now here’s the hash of this book after altering just a single character:

1fd4094a7780cdc18c54c339bc4011669656455e7a64a32b4769aafa6516e6b7

Do you see any similarities in those two numbers? Nope. They’re completely and utterly different. So, this is even better than our Dox-Ray machine —one tiny change in the input creates a massive change to the output.

If I’ve managed to make you properly paranoid, you will realize that there’s still a problem here. Why wouldn’t Mallory, who obviously has access to the contents of every package, just swap out the X-ray image that Alice included with the package? She could alter the contents, re-create the X-ray with the new document, and include that image in the package. This is where the public key crypto comes back into play, and this is where the padlock analogy breaks down.

Now we need to imagine a different sort of lock, one that doesn’t exist in the real world—let’s call it a SuperLock . This lock has two key slots: one to lock it and one to unlock it. The keys to work this special lock always come in pairs. One is a public key, which is copied and given out to anyone who wants to have one; the other is a private key, which is never copied and held only by the owner. Either key can be used to close the lock, but once locked, only the other matching key can open it. The same key cannot do both things—one key locks it, and the other must unlock it. Any two keys will work, which is the beauty of the SuperLock, but the keys have to be a matched pair. With me so far?

SuperLocks are available in fine stores everywhere, but to have a pair of matched keys made you must go to a very special locksmith—a SuperLocksmith, naturally. This locksmith will create a pair of matched keys, one public and one private. The locksmith will also keep a copy of your public key and make a duplicate for anyone who asks, free of charge. Finally, this locksmith will also vouch for the fact that he made the lock for you, specifically, and can prove that he gave the single, private key directly to you. Basically, the SuperLocksmith is verifying and vouching for your identity.

Okay... assuming you followed all of that and bought into the concept, we can now say two powerful things. First, if I lock a SuperLock with someone’s public key, then that someone is the only person on the entire planet who can open that lock (because they are the only person who has the private key). That is to say that I know for a fact that if I were to lock this special lock with one of Alice’s public keys (which I could pick up for free from the locksmith), then I can be sure that Alice, and only Alice, could unlock it. Second, if I were to receive something with a lock on it and Alice’s public key could open that lock, then it must have been Alice who locked it—and therefore I can guarantee that whatever is in the locked box did in fact come from Alice.¹⁰

Now we have all the tools we need to resolve our dilemma. We’re in the home stretch here! Hang with me just a bit longer...

Let’s start from the top again and see how this works with our new tools. Alice wants to send Bob some confidential papers. She wants to make sure that no one else will be able to read these documents, even if they were to intercept them in transit. She also wants to make sure that Bob will know beyond a shadow of a doubt that Alice was the one who sent them. Finally, just to be safe, Alice wants to also give Bob the means to verify that the documents have not been altered in any way.

So, Alice and Bob both buy a Dox-Ray machine and a SuperLock . Alice and Bob take home their private keys and order a copy of the other person’s public key online or pick it up from a local locksmith. Alice puts the confidential papers in the box. Then she takes out her Dox-Ray machine and scans the documents. She puts the Dox-Ray image into another smaller box. She then locks this smaller box with one of her own SuperLocks using her private key. Since the box was locked with Alice’s private key, anyone can open it using one of Alice’s freely available public keys. The point in this case is not to protect the contents (the X-ray image)—the point is to declare that Alice was the one who X-rayed the documents and therefore the one that sent the package. Now Alice puts the little box inside the big box, along with the documents, and locks the big box with another SuperLock using Bob’s public key. At this point, the only person who can unlock that big box is Bob. Alice confidently hands her package to her new mail carrier, Mallory.

When Bob receives the box, he unlocks it with his private key. He then removes the smaller box that contains the X-ray image. Using Alice’s public key, he opens the smaller box and removes Alice’s X-ray image. Because the SuperLock opened using Alice’s public key, he knows that Alice was the one who locked the SuperLock on the smaller box. He then whips out his Dox-Ray machine and takes his own image of the documents he received. The image exactly matches the one Alice sent! Now he knows that even if the box was somehow compromised, the contents have not been altered.

And there you have it! At a base level, these are the same mechanisms used to secure just about all communication on the Internet today. There are many different algorithms and complex combinations of these mechanisms, but the fundamental principles are the same. The public and private key pairs are negotiated behind the scenes by your computer (the client) and the far-end computer (the server). While you could actually create a public/private key pair and register them with a formal key authority (the locksmith in our analogy), most average people have no need for this. You prove your identity when you first set up your online account, and from then on you use a combination of user ID or e-mail address and password.

Triple-A

We’re almost done with the basics of security, but we need to take a minute to explain AAA , or Triple-A. No, not the American Auto Association. Not Anti-Aircraft Artillery, either. In the world of cybersecurity, AAA stands for “authentication, authorization, and accounting.” We’ve already discussed authentication, so let’s quickly talk about authorization and accounting.

Once you have a mechanism to definitively identify someone (authentication), you may also want to restrict what that person is allowed to do. For example, if you’ve ever worked for a large company, you may have been issued an ID badge that can be swiped to give you access to different areas within the company’s buildings. The ID badge provides authentication under the assumption that only the person to whom the badge was issued would be in possession of the badge. At this point, you can allow or deny that person access to certain areas by either allowing or disallowing their badge to unlock certain doors. This allows you to compartmentalize access to sensitive regions within a building. Knowing that only a handful of people can get through a given door means that you have a limited set of suspects if something is stolen from that area, for example.

You can also use these badge swipes to keep a record of who has accessed the restricted region and when they were present. This is the third A of Triple-A : accounting. If a new piece of lab equipment is reported missing from a restricted lab, the security team can see who was in that room around the time it went missing.

Newer Isn’t Always Better

In the realms of security, there is a constant struggle between using the latest and greatest technology while at the same time not trusting anything until it has been proven using the test of time. Let’s say you were in a castle with an angry horde outside your gates and I came to you with a new-fangled magical force field that I guarantee will be stronger than any stone wall you could build. Would you tear down your walls and install the force field based on my word? Probably not. What if I showed you reams of laboratory data where I tested my force field against a stone wall with simulated attacks? If you were properly worried about your safety, this would still not be sufficient. Yet, if this product were truly better than what you have now, you’d be wrong to ignore it. So, what would you do?

Well, you’d probably deploy the new force field in addition to your existing walls and see how it held up during some actual assaults by actual adversaries. And you would probably do this for some time before you decided it was safe to rely solely on this new technology. Stone walls are something you know and trust. You know how they’re built; you know how to build them well because you’ve gone through many iterations of building them. You know their weaknesses, and you’ve figured out how to shore them up. But this new force field thing... it’s never been used. You really have no idea how it will perform in the real world. What if the designer missed something crucial? What if it works great in the lab where you can control everything but works poorly in the real world where there are many variables outside your control: having unfailing and unlimited access to magical energy, for example.

The same is true for security. New encryption schemes, new authentication systems, and new security processes are generally shunned until they’ve been around for at least a few years. This gives independent security teams a chance to kick the tires, take it for a spin, and see how it performs in the real world. It’s all well and good to come up with a new encryption algorithm that’s mathematically sound on paper, but it’s quite another to actually implement this algorithm in software and hardware.

While there is certainly a lot more to know about cybersecurity tools and processes, this chapter has introduced you to the primary conceptual building blocks that underlie information security.

Privacy and Tracking

We can’t discuss computer security without discussing privacy. As we’ve discussed in earlier parts of this chapter, bad guys having access to your private data can greatly increase their ability to crack your passwords or convince someone that they’re you (identity theft). For these reasons alone, you need to be very careful about how much information you give away about yourself online to Google, Facebook, Instagram, Twitter, LinkedIn, and so on.

But there’s much more to privacy than what you knowingly and willingly give away. In fact, the truly scary part is what you’re giving away every day without really knowing it. Your credit cards and debit cards are convenient for making purchases, both online and at brick-and-mortar stores. It won’t surprise you that MasterCard, Visa, and American Express know all about where you shop, what you buy, and how much you spend. It also shouldn’t surprise you that major retailers also keep similar records.

After the Cambridge Analytica scandal, no one should be surprised at the amount of information social medial companies like Facebook and LinkedIn have on you. You should probably also realize that Google has more than any of them. Google is not a search engine, an e-mail provider, or a document collaboration service... Google is an advertising company, period, end of story. More than 90 percent of its revenue is from selling ads, and it can charge a premium for its ads because they’re highly targeted. And they’re highly targeted because they know unbelievable amounts of things about all of us.

What you, as a consumer, need to realize is this:

If the product is free, then you are probably the product.

That is, all of these “free” online services have to make money somewhere. If they’re giving you some service for free, then they have to make money somewhere else.

What you may not realize is that many of these companies sell your data to other companies, as well. These are companies we call data brokers . I’ve seen estimates that there are 2,500 to 4,000 data brokers in the United States alone. These companies are essentially unregulated (as of the writing of this book…changes may be coming soon). You don’t own your data, and in most cases, you can’t even look at your data. Have you considered why grocery stores have “loyalty cards”? Whenever you present your card, you are allowing them to associate all your purchases with you.¹¹ While they will almost certainly use this information to better target you with store promotions, they are probably also selling this information to third parties.

Want to know what info they have on you? Acxiom, one of the largest data brokers , has created a web site that lets people view the profile they have on you. If you want to see what your profile looks like, go to aboutthedata.com . You’ll have to create an account by giving them your e-mail address. (Don’t worry about giving away that information... trust me, they already have it.) If you go to this web site, you will see that they have collected a treasure trove of information about you: income range, house and car value, marital status, number of children, education level, political party affiliation, types of products purchased, debt levels, household interests, and so on. While you may be surprised at the information you find there, what you need to realize is that this is just the tip of the iceberg. This site only shows you the “core” information they collect. It doesn’t show things like sexual orientation, propensity for gambling, and known health issues. It also doesn’t show you the correlations and associations they’ve made about you…whether you have a sick mother, a rich uncle, or a shopaholic spouse.

One of the most lucrative consumers is a newly pregnant one. New parents spend tons of money, and locking in brand loyalty at this stage is paramount to product sellers. In February 2012, Forbes ran a story about how Target correctly predicted that a woman in a household was pregnant based on the types of products she was buying: unscented lotion and soap, large quantities of cotton balls, washcloths, and particular vitamin supplements. Based on a high “pregnancy prediction score,” according to the story, Target took it upon themselves to send some unsolicited baby product coupons to the woman in the mail. Unfortunately, this “woman” was a teenage high school girl living with her parents, and she had not yet told her father the news. Some of the specifics of this story have been called into question, but this is precisely the sort of situation that can arise from broad data collection and correlation. This is the world in which we are currently living, and it’s just getting worse.

Though the stated purpose of gathering this information and creating these detailed personal profiles is to provide supposedly anonymous demographic information to advertisers, you have to realize that you have no real control over the buying and selling of this data. You can attempt to “opt out,” but you’re relying on these unregulated companies to voluntarily comply with your wishes—not just now but forever. As if this weren’t bad enough, you must also realize that because someone is maintaining this information and saving all of this data, that means the government and law enforcement agencies will also have access to this information—and they probably don’t even need a warrant to get it. In fact, you can access some of this information yourself. Your IP address, because of the way these ranges of numbers are administered, can tell a lot about you—who your ISP is and where you are physically located (at least city and state¹²).

And just to completely demoralize you, I must state one more fact: even if you do pay for a service, that doesn’t guarantee that the company providing the service still won’t turn around and try to make extra money off your personal information. It was revealed a couple years ago that both AT&T and Verizon were adding tracking information on all of their users’ Internet usage. Basically, any time you browse the Web from your smart phone, AT&T and Verizon were tagging all of your requests with a marker (sometimes referred to as a supercookie ), which can be seen by any web site you visit. This marker changed periodically, allowing AT&T and Verizon to basically sell subscriptions to your online activity. But even if web sites don’t pay for this information, they can still use that ID to track you. If they can find some other way to independently figure out who you are, then it has the same effect. The only good news here is that the tracking happens only when using the cellular network for Internet access, and it works only on unencrypted web surfing. When you go to web sites that support HTTPS, the communication is protected, and no tracking IDs can be added.

After a consumer backlash, AT&T and Verizon eventually gave their users a way to opt out of this tracking. In fact, Verizon was fined $1.3 million by the FCC over this (which is nothing to Verizon). But this just goes to show that even paying a lot of money for a service doesn’t guarantee that they won’t sell your information to make more. Sadly, as of March 2017, the rules that were put in place to stop these practices were gutted. ISPs are now totally free to collect as much info on you as they can to use and/or sell as they please.

Here’s another story that will make you cringe. A story in Salon by Michael Price¹³ tells about a marvelous new “smart TV” that he purchased to replace a failing, old “dumb” TV. This TV came with all sorts of wonderful, built-in features like web browsing, e-mail, social media, streaming TV services, apps, and games. Then he made the mistake of reading the fine print.

… I’m now afraid to use it. You would be too—if you read through the 46-page privacy policy. The amount of data this thing collects is staggering. It logs where, when, how and for how long you use the TV. It sets tracking cookies and beacons designed to detect “when you have viewed particular content or a particular e-mail message.” It records “the apps you use, the web sites you visit, and how you interact with content.” It ignores “do-not-track” requests as a considered matter of policy.

It gets worse. This TV has a built-in microphone and camera that are used to provide face and voice recognition. The privacy policy actually states the following: “Please be aware that if your spoken words include personal or other sensitive information, that information will be among the data captured and transmitted to a third party.”

Of course, we now live in the age of virtual assistants like Apple’s Siri and Amazon’s Alexa. We purposely buy products that listen to us all the time so that we can ask them for the news or the weather, buy stuff, and even turn things off and on.

This is the world we live in now, filled with smart devices that track our every move, physically and virtually, and report that information to some nebulous “third party.” As the Internet of Things becomes a reality and many more dumb devices become smart devices, the potential for tracking and breach of privacy will grow by leaps and bounds.

The real issue today with monetizing your data is that most people are simply not aware of what is actually happening and they have little to no control over the data once it’s collected. We need more transparency, and we need the option to “opt out”¹⁴ of data collection , even if taking that option costs money. People should have the choice, and it should be an informed choice. As consumers and as citizens, we need to compel our government representatives to stand up for our interests. These companies must provide free and easy access to the profiles they keep on us (much like credit companies must provide free copies of our credit reports) and allow you to edit this information or completely remove it, if you ask. I would go further and say that you must “opt in” for any data collection, but it’s frankly too easy to get this approval. How often have you actual read the end user license agreement (or EULA) that comes with all the software you install or the online services you use?¹⁵ Buried in there somewhere is language that allows the software maker to pretty much do as they please. The only way we’re going to change this is to elect representatives that will enact regulations to keep these companies in line.

While the European Union has taken some big steps in the right direction with the General Data Protection Regulation (GDPR) —addressing most if not all the aspects I listed earlier—the United States is actually deregulating the data collection industry. It remains to be seen whether the Equifax, Facebook, and Cambridge Analytica scandals will finally reverse this trend and foster some commonsense regulations in the United States.

Who Can You Trust ?

So, who can you trust these days? That’s a critical, fundamental question, and as a society, we need to be asking it a lot more than we are. Unfortunately, there’s no easy answer. If you think about this too hard, it quickly becomes deeply philosophical. But my job here is to try to simplify things for you, so let’s look at this from a practical standpoint.

First, we need to decide what we mean by trust. Violating trust is not just about intentionally trying to do you harm. Online service providers and marketing firms are certainly not trying to harm their customers by gathering and selling mass quantities of “anonymized” personal data. In fact, they would (and do) argue that they’re helping you by trying to tailor the ads you see on the Web to your specific interests. By better targeting the advertisements presented to you, they increase the likelihood that you will actually want to know about the products and services for sale, while avoiding showing you ads for things you don’t want to see. Because targeted ads are more valuable to marketers, they can also make more money per ad, meaning they should be able to (theoretically) reduce the overall number of ads to make the same money. They view this as a win-win situation (or at least this is what they claim). You may very well decide that you’re willing to sacrifice some privacy to get free web content and services. But it’s important that you realize how you’re paying for these services and to realize that your data is probably getting sold to other third parties, as well, who may have entirely different purposes.

At the end of the day, you need to try to determine the motivations of the person, corporation, or entity behind the advice or service being offered. In many cases you just need to ask, how do they make their money? Are you paying a fair price for what you’re getting? If you’re paying nothing or if the price is too good to be true, then you should think carefully about trusting the source. Many online services that are “free” make their money by selling advertising and by getting kickbacks when you click links that take you to other sites. They also make money off selling your information to interested third parties—which you probably agreed to when you signed up for the service, in cryptic language buried deep in the license agreement. Sadly, as we discussed in the previous chapter, we also know that paying a fair value for a service does not mean you can trust the provider, but the odds are at least better.

Besides “following the money,” you also need to think about other motivations and ask deeper questions. Is it in this service provider’s own best interest to be open, honest, and transparent with you? Or do they stand to gain more if they confuse, mislead, or scare you? Let’s think about that one for a minute. Before the advent of the Internet and before CNN came into being, most people got their news from newspapers and the nightly TV news. In the “old days” (like 30 to 40 years ago), TV news was not about making money—it was about providing a public service. (This wasn’t out of the goodness of their hearts; it was actually mandated by the Federal Communications Commission under various “public interest” requirements for broadcast licensees.¹⁶) However, in the intervening years, as Nielsen ratings and advertising dollars have become the driving force in television, our “news” programs have become “infotainment.” The focus is not educational or informational; it’s about getting (and keeping) eyeballs. Boring facts don’t rivet people in their seats. “Will watching the nightly news actually kill you? The answer will shock you! Tune in at 11!” The more sensational, the better.

Summary

Cybersecurity is analogous to physical security. To be truly secure, you need understand your adversary’s capabilities and motives. You need to know where your weaknesses are. And you want to have several layers of defense, not just one.
Security is never absolute, and there are always compromises. You’d go broke trying to be completely secure, so you have to make intelligent trade-offs.
We reviewed all the basic computer and cybersecurity terms. If you forget something, remember to turn to the glossary for quick help.
Encryption is a marvelous tool that allows us to keep our data safe and secure—both on our hard drives and as it traverses the public Internet.
The real threat today is to our privacy. In the United States, corporations and third-party data brokers are harvesting vast amounts of personal data with little or no regulation. This data is ripe for abuse and begging to be stolen by hackers or misappropriated by overzealous agencies.

Checklist

It’s time to start doing stuff!

Tip 2-1. Know Thyself

We need to take a moment to jot down some key information that you will need to have handy. This information will be needed for you to choose which sets of directions to follow in the checklists. You’ll want to make note of the following information:

What sort of computer do you have—is it a desktop computer or a laptop? A desktop computer is always plugged into the wall for power; a laptop has a built-in battery so that the computer is portable and does not need to be plugged in all the time.
What operating system (OS) is running on your computer? This book covers modern versions of the two most popular OSs: Microsoft Windows and Apple macOS (formerly called OS X). One quick way to check the type and version of your OS is to simply go to this web site:
http://whatsmyos.com /
At the top of the page, it will tell you its best guess at your OS type and version. The page will also explain how to “manually” determine your operating system. This is crucial information, so you might take the time to verify it manually. Examples of Windows versions are 7, 8, 8.1, and 10. Mac OS versions have numeric versions as well as code names. Examples of macOS are Sierra (10.12) and High Sierra (10.13). For older Macs, you might find Mac OS X version 10.11 (El Capitan) or 10.10 (Yosemite). If you do not have any of these versions, then your OS is pretty old, and you will have some trouble with the OS-specific recommendations in this book. However, other recommendations will still be useful.

Tip 2-2. Know What They Know

I’ve tried to explain to you how much data is being collected on you, but I doubt the magnitude of this has really sunk in. Did you know many popular services give you the option to download all the information they’ve collected on you? Now, realize that in many cases, they only give you the raw data, not all the things they’ve derived from this data. They also don’t share the information they may have from other sources. Nevertheless, seeing this data for yourself is quite eye-opening, and I strongly encourage you to download as much of this data as you can and take a long hard look at it.

Note that in just about every case, they will arrange to send you a big “zip” file to download. A zip file is a compressed archive of a bunch of files and folders. Once you download this file, you can double-click the .zip file, and it should expand to a folder that is full of other files and folders.

Google:

1.
Go to https://takeout.google.com/settings/takeout .
2.
You’ll see dozens of Google services there. I would leave all the boxes checked, except perhaps Gmail; you already have all your e-mails, and there’s no need to download them all again.
3.
Scroll to the bottom and click Next.
4.
The default settings here should be fine, so just click Create Archive.

Facebook:

1.
Go to Settings on the Facebook web site.
2.
Click “Download a copy of your Facebook data” at the bottom of the General Settings page.
3.
Click Start My Archive.

Twitter:

1.
Go to your Account Settings on the Twitter web page.
2.
Click the Request Your Archive button.

LinkedIn:

1.
Go to your Account settings for Privacy.
2.
Click How LinkedIn Users Your Data at the left.
3.
Click Download Your Data.
4.
Click The Works.
5.
Click Request Your Archive.

Apple: ¹⁷

1.
Log in to Apple’s Privacy web site: https://privacy.apple.com/ .
2.
Under “Obtain a copy of your data,” click Getting Started.
3.
Select the data you want to download. You may want to skip things you already have like e-mails.
4.
Select the chunk size for the data, that is, what the maximum size of each file download should be.
5.
You’ll get an e-mail when your files are ready for download.

Footnotes

Almost. Because computers are so tied to binary counting, they count things based on the powers of two. You frankly will probably never need to know this, but if some smart-ass tells you that 1KB is not really a thousand bytes, they’re right…it’s technically 1,024 bytes. Why? Because. Just trust me. For most purposes, you can just call it a thousand and be done with it. The same is true for the others (MB, GB, TB)—just go with thousand, million, billion, and trillion. It’s close enough.

The term Wi-Fi is just a marketing term someone made up. It was meant to sound like Hi-Fi but doesn’t really stand for “wireless fidelity.” It’s just a lot catchier than 802.11, which is the technical name.

Image source: Histoire des jouets by Henry René d’Allemagne (1902)

An acronym for Completely Automated Public Turing test to tell Computers and Humans Apart. A Turing test, named for Alan Turing, is a test that attempts to verify that you are communicating with a real human and not a computer.

Britain’s version of the NSA, which is called Government Communications Headquarters

This is my personal decoder pin, obtained from the A Christmas Story House and Museum in Cleveland, Ohio. If you’re a fan of the movie, it’s a must-see ( https://www.achristmasstoryhouse.com/ )!

As you can see in the figure, the actual decoder pin wasn’t a true rotational cipher. But I’ve simplified it here for the purposes of our example.

Turing’s work has been wonderfully captured in the Academy Award–winning movie The Imitation Game. And if you find the history of cryptanalysis as fascinating as I do, I highly recommend you read The Code Book by Simon Singh.

Alice and Bob are well-known in the cryptographic world. These are the names used when describing communication scenarios in lieu of saying “Party A” and “Party B.”

This provides something called nonrepudiation. That’s a fancy legal term that basically means Alice can’t plausibly deny that something digitally “signed” with her private key came from someone else.

Here’s a fun workaround to giving your personal info. If your store allows you to find your loyalty card using your phone number, try your local area code plus 867-5309. If you were a 1980s teenager, you’ll recognize this as the telephone number for “Jenny” from the one-hit-wonder song by Tommy Tutone.

You can try this yourself. Go to ip2location.com .

https://www.salon.com/2014/10/30/im_terrified_of_my_new_tv_why_im_scared_to_turn_this_thing_on_and_youd_be_too

The term opt out refers to a situation where a company signs you up for something but gives you the option to back out if you make some effort. For example, they might automatically include you in some data-gathering program but allow you to “opt out” if you change some preference on your account or send them a signed form. An opt in program is the opposite: you have to explicitly ask to be in.

There’s a great documentary on the topic of these EULAs and the data being collected on all of us called Terms and Conditions May Apply.

http://benton.org/initiatives/obligations/charting_the_digital_broadcasting_future/sec2 , reference 32. Amendments to Delegations of Authority, 59 FCC 2d 491, 493 (1976)

Note that this feature will initially be available only in the European Union but should roll out to other countries soon.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.

Table of Contents for 2. Cybersecurity 101

Create new playlist

Sign In

Sign Up