It uses crowd-sourced data to collect IP information across the globe, which is shared with the community.
It offers code designs that are useful to look at and learn from.
The GeoIP database is interesting on its own.
The chapter is broken down into an installation part and a learning part. In the installation part, you will install CrowdSec to understand how it works. In the learning part, you will look closely at how CrowdSec implements several features and study simpler sample code that shows the same techniques.
Source Code
The source code for this chapter is available from the https://github.com/Apress/Software-Development-Go repository.
CrowdSec Project
The documentation at https://doc.crowdsec.net/docs/intro explains it nicely:
CrowdSec is an open-source and lightweight software that allows you to detect peers with malevolent behaviors and block them from accessing your systems at various levels (infrastructural, system, applicative).
CrowdSec, as an open source security tool, provides quite a number of features that sit nicely in a cloud environment. The thing that is intriguing about the tool is the data that is collected by the community. This crowd-sourced data allows CrowdSec to determine whether a certain IP address has to be banned or should be allowed into your infrastructure.
There are many architectures and code designs that you are going to learn from the project, which you will explore more in the “Learning From CrowdSec” section.
Using CrowdSec
I will not go through the complete installation process of CrowdSec. Rather, I will cover the steps of a bare minimum installation that will allow you to understand what you need for the section “Learning From CrowdSec.” The objective of this installation is to reach the point where you can see the community data, which is collected by a central server, replicated to a local database.
- Download the release from GitHub. For this section, use v1.4.1 for Linux, downloading it using the following command:
  wget https://github.com/crowdsecurity/crowdsec/releases/download/v1.4.1/crowdsec-release.tgz
- Once downloaded, use gunzip and tar to unzip as follows:
  gunzip ./crowdsec-release.tgz && tar -xvf crowdsec-release.tar
- A new directory named crowdsec-v1.4.1 will be created, as shown:
  └── crowdsec-v1.4.1
      ├── cmd
      ├── config
      ├── plugins
      ├── test_env.ps1
      ├── test_env.sh
      └── wizard.sh
Change your directory to crowdsec-v1.4.1 and run the test_env.sh command.
The directory contains a variety of files, including the CrowdSec command line tools crowdsec and cscli, along with a folder called data that you will look at in more detail in the next section. The file with the .mmdb extension is the database that you will examine in the “GeoIP Database” section.
crowdsec.db
Notice the last log message that says added 8761 entries, which means that it has added 8761 entries into your database. If you are not getting this message, rerun the crowdsec command.
IP addresses that are banned
Date until when a particular IP is banned
Scenarios when an IP address is detected
You have learned briefly how to set up CrowdSec and you have seen the data it uses. In the next section, you will look at parts of CrowdSec that are interesting. You will look at how certain things are implemented inside CrowdSec and then look at a simpler code sample of how to do it.
Learning From CrowdSec
CrowdSec is a complex project with many parts that are worth learning from. In this section, you will pick up a few of the techniques used inside CrowdSec. These techniques can also be applied when designing your own software with Go.
System Signal Handling
As a system, CrowdSec provides an extensive list of features that are broken down into several different modules. The reason for features to be broken down into modules is to make it easy for development, maintenance, and testing. When building a system, one of the key things to remember is to make sure all the different modules can be gracefully terminated and all resources such as memory, network connections, and disk space are released. To make sure that different parts of the system shut down properly, you need some sort of coordinated communication to understand when modules need to prepare for the shutdown process.
Imagine a scenario where you are designing an application and it is terminated by the operating system because of some resource constraint. The application must be aware of this and be able to shut down each of its modules independently before shutting itself down permanently. You will look at an example of how this is done using the code sample in the chapter14/signalhandler folder.
SIGHUP: The operating system sends this signal when the terminal used to execute the application is disconnected, closed, or broken.
SIGTERM: This is a generic signal that is used by the operating system to signal terminating a process or application.
SIGINT: This is also referred to as a program interrupt and this signal occurs when the Ctrl+C combination is detected.
The code listens to all these signals to ensure that if any of them are detected, it will do its job to shut itself down properly.
The signalChan variable is a channel that accepts os.Signal, and it is passed as a parameter when calling signal.Notify(). The goroutine takes care of handling the signal received from the library in a for{} loop (step 2). Receiving a signal (step 6) means that there is an interruption, so the code must take the necessary steps to start the shutdown process (step 7).
The loop100Times function runs inside a for{} loop where it checks the channel condition inside the select{} statement. Put simply, the for{ select {} } block of code translates to the following:
Is there any value to read from the stop channel? If there is, processing must stop.
Otherwise, just print to the console and increment the counter.
The same logic is used inside the loop1000Times function, so it works exactly the same. Both functions will stop processing and will print the counter value to the terminal once the stop channel is closed. The application has achieved the state of shutting down itself gracefully by informing the different parts of the code that it is shutting down.
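The pattern described above can be sketched as follows. This is a minimal illustration rather than the chapter's listing: the helper name countUntilStopped, the one-millisecond sleep, and the use of a sync.WaitGroup are assumptions made for this example, standing in for the loop100Times and loop1000Times functions mentioned in the text.

```go
package main

import (
	"fmt"
	"os"
	"os/signal"
	"sync"
	"syscall"
	"time"
)

// countUntilStopped is a hypothetical stand-in for loop100Times and
// loop1000Times. It increments a counter until either the limit is
// reached or the stop channel is closed, then returns the counter.
func countUntilStopped(name string, limit int, stop <-chan struct{}) int {
	counter := 0
	for counter < limit {
		select {
		case <-stop:
			// A closed channel is always ready to receive, so every
			// worker sees the shutdown request.
			fmt.Printf("%s: stop requested at counter=%d\n", name, counter)
			return counter
		default:
			// Nothing to read from stop: keep working.
			counter++
			time.Sleep(time.Millisecond)
		}
	}
	fmt.Printf("%s: finished at counter=%d\n", name, counter)
	return counter
}

func main() {
	// Buffered so a signal is not dropped if it arrives before we receive.
	signalChan := make(chan os.Signal, 1)
	signal.Notify(signalChan, syscall.SIGHUP, syscall.SIGTERM, syscall.SIGINT)

	stop := make(chan struct{})
	go func() {
		sig := <-signalChan
		fmt.Println("received signal:", sig)
		close(stop) // broadcast the shutdown to every worker
	}()

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		countUntilStopped("loop100Times", 100, stop)
	}()
	go func() {
		defer wg.Done()
		countUntilStopped("loop1000Times", 1000, stop)
	}()
	wg.Wait()
}
```

Closing the stop channel, rather than sending a value on it, is what lets a single signal reach any number of workers: a send would be consumed by only one receiver.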
Handling Service Dependencies
Complex applications like CrowdSec have multiple services that run at the same time or at scheduled times. In order for services to run properly, there needs to be service coordination that takes care of the dependencies between services.
In Figure 14-4, the apiReady channel is the central part of the service coordination when CrowdSec starts up. The diagram shows that the apiServer.Run function sends a signal to the apiReady channel, which allows the other service, servePrometheus, to run the server listening on port 6060.
serviceBDone: This channel is used to inform that serviceB has done its job.
alldone: This channel is used to inform that serviceA has done its job so the application can exit.
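A rough sketch of how two such channels can order the work is shown below. The channel names serviceBDone and alldone come from the text; the runServices function, the struct{} element type, and the event strings are assumptions introduced only for this illustration.

```go
package main

import "fmt"

// runServices is a hypothetical helper: it starts serviceB and serviceA,
// makes serviceA wait for serviceB, and returns the order of completion.
func runServices() []string {
	serviceBDone := make(chan struct{}) // closed when serviceB has done its job
	alldone := make(chan struct{})      // closed when serviceA has done its job
	var events []string

	// serviceB has no dependencies, so it runs right away.
	go func() {
		events = append(events, "serviceB finished")
		close(serviceBDone)
	}()

	// serviceA depends on serviceB: it blocks until serviceBDone is closed.
	go func() {
		<-serviceBDone
		events = append(events, "serviceA finished")
		close(alldone)
	}()

	// The application exits only after serviceA reports completion.
	<-alldone
	return events
}

func main() {
	for _, e := range runServices() {
		fmt.Println(e)
	}
}
```

Each close happens before the matching receive returns, so the appends to events are ordered by the channels themselves and no extra locking is needed in this small example.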
GeoIP Database
CrowdSec uses a GeoIP database that contains geographical information of an IP address. This database is downloaded as part of setting up the test environment discussed in the “Using CrowdSec” section.
In this section, you will look into this database and learn how to read data from it. One use case for this database is building a security tool that labels each incoming IP address, which is useful for monitoring and understanding the traffic coming into your infrastructure. The GeoIP database comes from the following website: https://dev.maxmind.com/geoip/geolite2-free-geolocation-data?lang=en#databases. Have a read through the website to get an understanding of the licensing.
The code reads the database to get all IP addresses in the 2.0.0.0 IP range and prints all the IP addresses found in that range along with other country- and continent-related information. Let’s go through the code and understand how it uses the database.
The data is stored in a single, efficiently packed file, so in order to read the database, you must use another library. Use the github.com/oschwald/maxminddb-golang library. The documentation of the library can be found at https://pkg.go.dev/github.com/oschwald/maxminddb-golang.
networks.Next() loops through the records found and reads all geographical information from the database by calling the networks.Network(..) function, which populates the rec variable.
Once the JSON has been unmarshalled back to the r variable, the code prints out the information into the console.
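The final decoding and printing step might look something like the sketch below. It uses only the standard library and does not read the .mmdb file. The record struct, the describe helper, the network string, and the sample JSON are all assumptions made for illustration; the sample values are invented and were not taken from the database.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// record holds the country and continent fields used in this example.
// The field layout is an assumption made for this sketch.
type record struct {
	Continent struct {
		Code  string            `json:"code"`
		Names map[string]string `json:"names"`
	} `json:"continent"`
	Country struct {
		ISOCode string            `json:"iso_code"`
		Names   map[string]string `json:"names"`
	} `json:"country"`
}

// describe is a hypothetical helper that unmarshals one JSON record and
// formats the country and continent information for a network.
func describe(network string, raw []byte) (string, error) {
	var r record
	if err := json.Unmarshal(raw, &r); err != nil {
		return "", fmt.Errorf("decoding record for %s: %w", network, err)
	}
	return fmt.Sprintf("%s country=%s (%s) continent=%s (%s)",
		network,
		r.Country.ISOCode, r.Country.Names["en"],
		r.Continent.Code, r.Continent.Names["en"]), nil
}

func main() {
	// Invented sample record, for illustration only.
	sample := []byte(`{"continent":{"code":"EU","names":{"en":"Europe"}},` +
		`"country":{"iso_code":"FR","names":{"en":"France"}}}`)

	line, err := describe("2.0.0.0/16", sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(line)
}
```

Decoding into a typed struct instead of a generic map keeps the printing code short and makes a missing field show up as an empty string rather than a runtime panic.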
Summary
In this chapter, you looked at the crowd-sourced nature of the data collection used by CrowdSec and how the community benefits from it, and you learned how to use it in your application.
You learned how to use channels to inform applications when system signals are sent by the operating system. You also looked at using channels to handle service dependencies during startup. Lastly, you looked at how to read a GeoIP database, which is useful to know when you want to use the information in your infrastructure for logging or monitoring IP traffic purposes.