© Sreejith Keeriyattil 2019
S. Keeriyattil, Zero Trust Networks with VMware NSX, https://doi.org/10.1007/978-1-4842-5431-8_1

1. Network Defense Architecture

Sreejith Keeriyattil, Bengaluru, Karnataka, India

You’ve probably heard the saying, “security is the next big thing.” Security has been an important industry buzzword for many years now. What most analysts fail to convey is that security is not optional. Network and application security have to be built into the design; security shouldn’t be an afterthought.

This chapter covers important incidents that shook various industries by revealing loopholes in their network architectures.

Malware that Shocked the World

The world’s largest shipping conglomerate, Maersk, was in for a shock on the morning of June 27, 2017 (see Figure 1-1). The shipping industry is a 24/7 business. With innovations in IT, complex software applications and business logic have helped make the world’s biggest and oldest business efficient and agile.
Figure 1-1. Notification that Maersk's IT systems were down

Every 15 minutes of every day, a dock somewhere in the world is unloading between 10,000 and 20,000 containers. Maersk has more than 600 sites in 130 countries. You can imagine the complex logic Maersk's software system must use to make this process run smoothly across the world. Given the reliability this kind of business demands, a considerable number of engineers must look after its IT systems and ensure that they run smoothly around the clock.

Considering the sheer amount of data that's generated and the updates that happen every day, the infrastructure required to achieve such a feat is enormous. Along those lines, you can imagine that the attack surface also grows in these kinds of enterprise setups. They run multiple applications with different requirements in multiple data centers across the world. Keeping everything in sync and ensuring that only trusted clients can enter these systems is a complex task. It requires months of fine-tuning and, more importantly, monthly drills and security audits.

As I don’t have in-depth information on the specific IT systems used at Maersk, I can assume that they followed standard processes and architectures commonly used in IT operations.

If that is true, what went wrong? One by one, Maersk's systems across the globe were affected by the NotPetya ransomware (some consider the attack an act of cyberwarfare, but that is a matter for the investigators).

NotPetya is a comparatively complex piece of code that uses multiple ways to spread its chaos. One way it spreads is by exploiting a Microsoft vulnerability called EternalBlue. In general, the chain of attack can be listed as follows. (Note that NotPetya is complex and attacks and spreads in multiple ways. What follows is the most common.)
  1. Through email or some other means, the user is tempted to click on a link.

  2. Windows User Account Control requests permission to run the program.

  3. If the user makes the ill-fated decision to grant permission, a backdoor is installed. The remaining code required to start the targeted attack is then downloaded.

  4. From this launchpad system, NotPetya starts scanning the network for vulnerable open ports, specifically the SMB v1 ports (139/445) targeted by the EternalBlue/EternalRomance exploits.

  5. Once it identifies vulnerable systems, it starts spreading and infecting all the vulnerable computers on the network.

  6. It then encrypts the files and the Master Boot Record (MBR) and asks the user to reboot. Once users reboot, they are greeted with a boot screen demanding a ransom.
This particular method of attack is hard to stop. In a big corporation like Maersk, with thousands of servers and desktops, stopping such attacks requires well-patched systems and a wide variety of access rules and restrictions. But these restrictions are unlikely to help once you are already infected.

Maersk ended up with close to 50,000 affected endpoints and more than 4,000 affected servers, resulting in a $300 million loss.

SamSam Ransomware

The Colorado Department of Transportation administers the state's 9,144-mile highway system and its 3,429 bridges, which carry millions of vehicles every year. Its systems were hit by the ransomware called SamSam, whose modus operandi is similar to NotPetya's: find a vulnerable port or application and use the affected system as a launchpad to infect other systems.

This specific incident caused millions of dollars in damages to several organizations. In another reported incident, ransomware hit hospital networks, taking down critical IT systems and forcing employees to resort to manual recordkeeping to continue working.

Here the target was RDP ports that are open to the public. A third-party research institute identified over 10 million Internet-facing computers with RDP port 3389 exposed. Attackers simply scan for vulnerable systems online (multiple free tools are available for this). Once they find a vulnerable system, they run a brute-force password attack with a tool like John the Ripper or Cain and Abel. Once they are in and have privileges to install software, they can install ransomware, or they can enlist the system in a botnet for a DDoS attack elsewhere.

The point is that you don't need high-level knowledge to carry out these kinds of attacks; one of the most common ways in is through open ports and vulnerabilities. Exposed, Internet-facing systems can be found with search engines such as https://www.shodan.io/.

Common Themes of Attack

Figure 1-2 shows a typical attack. Note that a pattern emerges in these types of coordinated attacks: the attacker specifically targets vulnerabilities in the software or operating system. Figure 1-2 shows only one specific type of cybersecurity attack among the plethora happening in the current IT space.
Figure 1-2. Common attack process

For a large organization like Maersk, the infrastructure, applications, and servers will have hundreds of open ports and external connectivity links. Blocking all ports with external access is not an option. Fine-grained access policies and firewall rules, together with IPS and IDS, do the job of filtering the unwanted traffic from the desirable traffic.

This is one of the most used and well-known attack methods. The following sections cover other types of attacks.

Reconnaissance

The primary objective of reconnaissance is to identify the attack target. This can be an entire corporation or a specific company. Reconnaissance reveals the point of entry into the system.

Port Scanning and Access

Once the target is identified, the attacker needs to enter the network. This can be done with any of the multiple toolsets available in the public domain, including port scanners that check for open and vulnerable ports.
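As a rough illustration, the port-scanning step can be sketched in a few lines of Python using only the standard library. This is not taken from any real attack toolkit; the target host and port list are placeholders you would swap for hosts you are authorized to audit:

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection on `host`.

    A successful connect means the port is open and worth a closer look;
    this is the same basic probe that attack toolkits automate at scale.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 when the TCP handshake succeeds
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Audit a host of your own: SMB (139/445) and RDP (3389) are the ports
# abused in the NotPetya and SamSam incidents, respectively.
print(scan_ports("127.0.0.1", [139, 445, 3389]))
```

Defenders can run the same probe against their own perimeter to verify that what the firewall rule set claims to block is actually blocked.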

Before that, attackers need to make the victim install the payload, which contains the code necessary to do the port scanning. This can happen in multiple ways: through social engineering or via email, where victims are tricked into clicking a link that, in turn, installs the malware. An organization with a culture of "security first" will have multiple threat-detection tools and processes to prevent these issues. Yet these types of attacks still happen around the world, because it is very difficult to educate all employees about security threats.

Once the software is inside the system, it can download whatever other feature sets it needs to perform further attacks. All it has to do is attack the vulnerable ports, gain root access to the system, and carry out the intended task. An intelligent hacker will also make their steps difficult to trace by deleting the logs and the software they used. There are cases where attackers have used the same process again and again to gain access and then delete the trail of logs.

The Castle Wall Analogy

A castle wall can be a helpful analogy to explain one network defense method. As humans tend to reuse time-tested systems in new ways, the castle wall example can be used in the digital space as well.

A castle wall, as you know, is a tall wall built around a large city or castle to defend the inhabitants and their precious resources. Its purpose is to block external threats. During medieval times, cities were under constant threat of raids and attacks. The first line of defense was the city gate, and most cities were surrounded by well-built castle walls as well.

If you are a Game of Thrones fan, you have seen this multiple times. Consider the scene of Daenerys' army surrounding the capital, King's Landing. The wall was surrounded by an open area, which made it easier for the bowmen to detect threats looming miles away. The castle wall stopped the army's march, and they were easily detected as a threat. This extends to the Trojan Horse story, where intruders hid inside a large horse that was presented as a gift and was therefore rolled into the city without concern.

The hidden paths in the tunnels leading into the city can be regarded as one vulnerability. The point is to make you understand how this scenario matches the perimeter-based firewall approach.

In some castle models, a moat filled with deadly alligators surrounds the castle, making it even tougher for invading armies. In those cases, there must be a drawbridge (a movable bridge) leading to the castle gate. This drawbridge is like the ports you open in the firewall to enable application connectivity.

The Perimeter Model Defense Architecture

The perimeter model network security defense architecture has been in production for a long time. The concept is straightforward and is based on the castle wall approach.

You make a line of defense using firewall appliances. Each packet entering the data center has to go through the firewall first. The firewall has security rules—firewall rules—that filter the packets. These rules can be based on layer 4 filtering or more in-depth layer 7 filtering. Both have their advantages and disadvantages.

Zone Defense

In a traditional security system, devices are separated into multiple security zones (see Figure 1-3). This limits the spread of an attack, though it requires more fine-grained control over the system, which in turn demands more security effort.
Figure 1-3. Security zones

Multiple security zones are created based on the threat level and the sensitivity of the content.

Centralized Security Control

You need a bird's-eye view of your entire IT security landscape to make better decisions, as well as to learn, analyze, and quickly respond to live threats. With current attack methodologies, the time it takes to isolate and respond to an attack matters more than ever.

The perimeter firewall, in most common scenarios, will be a hardware-based appliance. There are some virtual implementations on x86 commodity servers, but at larger scale it will be a Palo Alto, Check Point, or Cisco firewall deployed with firewall policies. Most of these appliances are controlled with proprietary CLI commands, and some newer firewall designs integrate IDS/IPS into the firewall, thereby providing a unified threat-management system.

In this case, blocking a vulnerable port for an entire infrastructure is as easy as blocking a bridge. In the castle wall analogy, it is similar to bringing the drawbridge up so there is no way to enter the castle directly.

IP-Based Traffic Filtering

The basic filtering mechanism in the perimeter firewall system is based on the IP address. This is a fast, simple approach, as you don't need to inspect the packets too deeply. However, for the latest threats, you need L4-L7 firewall filtering for deeper packet inspection. This is typically handled by an IDS/IPS system. (An IDS is a detection system; an IPS is a prevention system that can also block traffic.) Different policies and rules are applied based on the threat level. Overall, the IP-based filtering mechanism gets the job done without too much hassle.

These advantages are easy to overlook, because most of the time you won't notice the positive aspects. You can change policies and rules independently of the underlying applications. You decide everything at the perimeter level: which packets enter and which are filtered out. Considering the complexity of an application residing in a standard enterprise data center, the filtering mechanism you deploy has to be simple and efficient. The perimeter firewall is both: it stands right in the data path and filters packets based on easy-to-use firewall rules. You therefore don't need to make any changes inside the applications.
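To make the idea concrete, here is a minimal sketch of first-match rule evaluation, the logic at the heart of IP-based layer 4 filtering. The addresses, ports, and rule set are invented for illustration and do not come from any particular vendor:

```python
from ipaddress import ip_address, ip_network

# A tiny perimeter rule set, evaluated top-down with first match winning
# and an implicit default deny, mirroring typical appliance behavior.
RULES = [
    {"action": "deny",  "src": "0.0.0.0/0",  "dst_port": 445},   # block SMB at the edge
    {"action": "allow", "src": "0.0.0.0/0",  "dst_port": 443},   # HTTPS from anywhere
    {"action": "allow", "src": "10.0.0.0/8", "dst_port": 3306},  # DB from internal only
]

def filter_packet(src_ip, dst_port):
    """Return the action of the first rule matching this (source, port) pair."""
    for rule in RULES:
        if ip_address(src_ip) in ip_network(rule["src"]) and dst_port == rule["dst_port"]:
            return rule["action"]
    return "deny"  # nothing matched: default deny
```

Note that the rules say nothing about which application generated the traffic, which is exactly why this style of filtering is fast but shallow.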

Centralized Source for All Traffic Flow Logs

Logs are one of the most crucial parts of any security system. You need logs to trace the attack chain back and identify how the attack happened. You’ll also use logs for audit purposes, to track the changes made in the system. Without a centralized logging mechanism, it would be a daunting task to identify and make sense of all the security logs. Perimeter firewall logs can be the first place to look in order to trace suspected attacks. Given that most new advanced firewall systems and architectures have very sophisticated logging mechanisms, these firewall logs remain very crucial in today’s world.

IDS/IPS Use Cases

As mentioned, IDS/IPS were used as separate appliances in earlier days. Now most vendors integrate both into the firewall appliance, which means that enabling IPS/IDS is as easy as checking a box in the firewall configuration software. The advantage is that you can always scan the traffic flow against an updated database of known attack types and vulnerabilities. This helps you identify a particular traffic pattern, match it against a known attack chain, and quarantine the traffic before it becomes a full-fledged attack on the servers.
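At its simplest, a signature-based IDS check is a pattern search over the payload. The signature database below is entirely made up for illustration; real engines such as Snort or Suricata use far richer rule languages than substring matching, but the core idea is the same:

```python
# A made-up signature database mapping names to byte patterns.
SIGNATURES = {
    "smb-header-probe": b"\xffSMB",        # illustrative, not a real signature
    "sql-injection":    b"' OR '1'='1",
}

def inspect_payload(payload):
    """Return the names of all known signatures found in a packet payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]
```

An IPS would act on a non-empty result (drop or quarantine the flow), whereas an IDS would only raise an alert, which is the detection-vs-prevention distinction noted earlier.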

Problems with the Perimeter Model

To be clear, this model has some real advantages. But like any other system, a truly advanced threat-management system has to identify and stop all forms of attack, including the new ones that emerge daily.

This section discusses some problems the perimeter-based defense system faces. Note that most of these disadvantages were not issues until recently. IT infrastructure changes rapidly, and new software architectures, along with the new vulnerabilities they bring, have made it hard for the current system to keep up.

Single Point of Attack/Failure

Going back to the castle wall analogy, it is clear that once the wall is broken, the city defense mechanism will collapse. There is not much defense inside the city/castle to prevent the attackers from creating mayhem. All the residents’ resources were gathered and used to defend the fort and to weaponize the wall. This is the first and last line of defense in almost all traditional models.

Using the same logic here, what happens when the perimeter firewall is somehow compromised? Say an intruder was able to successfully take control of the perimeter firewall appliance. Now he can change any firewall rules and even create new rules to allow and deny certain types of traffic. In the wrong hands, this will result in widespread destruction.

Even though you might use host-based firewalls and antivirus scanners inside the application infrastructure, all of these are built on top of the perimeter firewall. For example, when the SMB port is blocked or opened in the firewall, all the additional security tools added to aid the security system take this action for granted. These tools won't go back and ask the firewall whether the port is really blocked in its rule set.

Multiple audits and specialized care go into securing the security infrastructure itself. Yet the fact remains that it can act as a single point of failure (SPOF).

Protects North-South Traffic

North-South traffic is the traffic flowing into and out of the data center. As with any defense mechanism, there are different security zones in a perimeter-based firewall. Utmost importance is given to North-South traffic, as it is the critical traffic coming from unknown sources. This was fine, at least until recently.

Some studies and surveys indicate that around 77% of traffic stays inside the data center. Since the advent of microservices architectures and CI/CD design flows, you must ask whether you are doing enough to track this traffic inside the data center. A perimeter-based security system does very little to control it. This means an insider, or anyone who has access inside your network, can easily navigate through various systems without encountering filters or blocks.

In the current scenario, East-West traffic should be given the same importance as North-South traffic.

Firewall Rule Management Is Not Automated

If you have been involved in setting up a greenfield firewall deployment, you might have realized how time-consuming this task is. Brownfield additions are even worse. Say a company decides to migrate from one firewall vendor to another. Very few tools can successfully export and import the firewall policies. There is very little standardization in how firewall policies are defined, which can mean being locked into a vendor forever.

Gradually creating automated policies as you add virtual machines to the data center is the need of the hour.
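One hedged sketch of what such automation could look like: deriving firewall rules from role tags attached to each VM at creation time, so policy keeps pace with the inventory instead of waiting for a manual change request. The tags, ports, and rule format here are hypothetical, not any vendor's API:

```python
# Hypothetical role-to-port profiles; in a real system these would come
# from a policy catalog, not a hardcoded dict.
PROFILE_PORTS = {
    "web": [80, 443],
    "db":  [3306],
}

def rules_for_vm(vm_name, tags):
    """Generate allow rules for a newly provisioned VM from its role tags.

    Called as part of VM provisioning, this keeps the firewall policy in
    lockstep with the inventory as machines are added to the data center.
    """
    return [
        {"vm": vm_name, "action": "allow", "dst_port": port}
        for tag in tags
        for port in PROFILE_PORTS.get(tag, [])
    ]
```

Because the rules are generated from metadata rather than hand-written, they can also be exported, diffed, and re-applied, which addresses the vendor lock-in problem described above.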

Expensive and Appliance Based

The appliance model may have its own merits, but at larger scale, in a data center with hardware from multiple vendors, it can end up being an isolated island. The firmware needs to be updated and all ROM settings kept intact per the vendor's best practices. In addition, any security vulnerabilities in that particular firmware and vendor software have to be frequently tracked and constantly patched. This may seem easy at first, but as the infrastructure grows and more and more hardware is added to meet capacity and demand, it will likely end up being a full-time job, one most engineers hate to do.

Why Now and What Has Changed?

This section discusses what has changed.

The increase in East-West traffic can be mainly attributed to the architecture changes that happened in the latest software development models. Companies are taking a more aggressive approach toward releasing software and features. Given the current competitive market, web-based applications have to roll out useful and innovative features to their customers constantly. This was not the case before, as most web applications incorporated fewer changes and features were just added yearly. This gave the IT department enough time to change the security architecture accordingly.

New features and software versions mean additional servers and new applications. All these changes have to be reflected in the security systems as well. If you are aggressively adding new features and are not updating the security policies at the same time, this can end badly.

As with every new feature, new ports need to be opened. There is a chance of using new software tools as well. Both of those mean new vulnerabilities. You need to treat the entire IT stack as a well-coordinated system and understand how a change in one component might affect the system as a whole. When you add a new server, you have to add security/network policies at the same time. It’s the same principle.

The microservices architecture is the latest way of doing things faster and better. Netflix is often considered the frontrunner of this type of architecture, and many companies soon followed in implementing it.

So, what are microservices? Traditionally, engineers built everything as a monolithic software system and tied the processes to an application, mostly running on the same server. As demand grew, you could scale up the server with more CPU, more RAM, more NICs, and so on. There is a big issue with this design: it makes the server a critical point in the entire data center, so you have to make sure the server is up and running all the time. If the server reboots or crashes, your application is not available.

As Murphy's Law states, everything that can go wrong will go wrong. So one day, this server is going to fail. In traditional architectures, the common way of preventing this is to add a passive server to take over from the primary one in case of failure. Previously, you needed only one server; now your critical server list has grown to two. You also need monitoring systems and regular DR drills to make sure everything is working. This means additional effort and resources, but the underlying problem remains.

The flaw with this kind of design is that it doesn’t take failures into account. Hardware will fail, disks will crash, networks will get disconnected. These things happen on a weekly, if not on a daily, basis in any data center. If the design has not taken this into account, you are in for many nasty surprises.

The competing architecture that emerged as an alternative is based on the principle of treating failure as normal. The microservices architecture splits the monolithic software structure into small services, and these services communicate with each other through REST/RPC interfaces. This can result in an explosion of traffic inside the DC, which means that filtering unwanted traffic inside the DC becomes a priority. This is often called the pets vs. cattle approach to IT: a microservices-based architecture treats your servers like cattle, where a certain controlled amount of loss won't affect the overall functioning of the system, whereas pet servers must be given utmost care to keep them running.

Virtual Machines and Containers

Modern applications reside inside virtual machines or containers. Containers are gaining in popularity, as they are the easiest way to deploy software based on the microservices architecture. A container can be started and killed much faster than a full operating system, which has a longer boot time and loads additional, often unneeded, software. Containers are also very specific: typically one feature or function is implemented per container, and it can run on any available computer. Software security built around the operating system needs to adapt to these new requirements and make sure the system as a whole effectively filters out unwanted traffic and threats.

Changes in the Cloud and the Need for End-to-End Automation

Another factor adding to the latest trend is the exponential adoption of the cloud. The cloud changes the way we do things and, along with that, the way we do security. Public and private cloud systems provide security groups that help filter traffic at the VNIC level. I discuss this in more detail later, including how VMware implements Zero Trust security.

Infrastructure automation tools like Terraform and CloudFormation are gaining traction. Immutable infrastructure, as people call this model, is the process of bringing up new load balancers, security groups, networks, and volumes with every new deployment. This lets you automate the entire infrastructure creation, which would have taken months with a traditional build. As you might have noted, creating security groups and policies is part of the automation process, so any security tools used inside the DC have to play nicely with the infrastructure automation tools. Otherwise, dangerous scenarios can arise where the security policies no longer match the deployed infrastructure. Consider VM migration, even across locations: this can be achieved with an L2 extension, and there are multiple ways of implementing this type of architecture using VMware. The main issue to keep in mind is that your security policies have to move along with the workload when it changes location.

Isolating threats and creating better quarantine policies are parts of the new generation of security systems.

Summary

This chapter covered the perimeter security model in practice and discussed its advantages and disadvantages. It explained how a network attack can progress and included a discussion of some real-life incidents. The intent here is to get a feel for the current defense architecture. The next chapter explains in more detail how this perimeter security system is not sufficient and how a Zero Trust network can provide much better security.
