Threat intelligence

As part of a threat hunting architecture and best practice, enriching the ingested data can add insightful information (such as the geolocation of IP addresses), but it can also show at a glance whether an IP address is on a known threat list. This is where threat intelligence comes in. Threat enrichment is a process that consists of comparing data, either at ingestion time or once it is already indexed, against a threat database and tagging it as a threat if a match is found.

There are many threat intelligence databases and data feeds out there. Whichever you use, make sure that the threat information is embedded in each document you index into Elasticsearch, by doing either of the following:

  • Using your own enrichment process to tag the data once in Elasticsearch
  • Using Logstash at ingestion time to tag the data

The second strategy can easily be built using either the translate plugin or the lookup features in Logstash. The translate filter suits use cases where the enrichment data is fairly static and can be stored in a file. This method is explained, as an example, in a blog post (https://www.elastic.co/blog/bro-ids-elastic-stack) and consists of creating a dictionary against which the ingested data is compared over time. The following configuration gives an idea of what it looks like in Logstash:

filter {
  translate {
    field => "evt_dstip"
    destination => "tor_exit_ip"
    dictionary_path => "/path/to/yaml"
  }
}

The preceding configuration looks up the evt_dstip field against a dictionary. If a match is found, the tor_exit_ip field is populated with the corresponding value from the dictionary. You can put whatever content you need in the dictionary, as long as you keep in mind that this is the content that will be used for reporting and leveraged by ML.
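
For reference, the file referenced by dictionary_path is a simple set of key/value pairs. In YAML form, a threat dictionary for the preceding lookup might look like the following sketch; the IP addresses (documentation ranges) and the tor_exit_node label are purely illustrative:

# /path/to/yaml -- keys are the evt_dstip values to match,
# values are what gets written into tor_exit_ip on a match
"198.51.100.7": "tor_exit_node"
"203.0.113.42": "tor_exit_node"
"192.0.2.15": "tor_exit_node"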

While this is a very handy feature, its scalability is limited by the fact that it relies on a dictionary file. This is where the memcached plugin brings more value, as it relies on a scalable cache that can hold millions of records that can change over time. For the purposes of this book, we will not dig too much into the details here, but you can find a comprehensive description of how to set this up in the Elastic blog at https://www.elastic.co/blog/elasticsearch-data-enrichment-with-logstash-a-few-security-examples.
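
The blog also covers how the cache gets populated in the first place. Purely as a sketch, and not the blog's exact configuration, a separate loader pipeline could read a threat feed and write its indicators into memcached using the filter's set option; the feed path, its single-column CSV layout, and the threat_name value below are assumptions made for illustration:

input {
  # Hypothetical threat feed: a CSV file with one indicator IP per line
  file {
    path => "/path/to/threat_feed.csv"
    mode => "read"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    columns => ["ip"]
  }
  # Value that the later get lookup will return into threat_src
  mutate {
    add_field => { "threat_name" => "known_threat_feed" }
  }
  # Store the value of threat_name in memcached under the key %{ip}
  memcached {
    hosts => ["localhost:11211"]
    set => {
      "threat_name" => "%{ip}"
    }
  }
}
output {
  # We only care about the side effect of populating the cache
  stdout { codec => dots }
}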

The important bit for us is to understand how the lookup itself is done in Logstash, as the following example shows:

input {
  stdin {
    codec => json
  }
}
filter {
  memcached {
    hosts => ["localhost:11211"]
    get => {
      "%{ip}" => "threat_src"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

As explained in the blog, for any input message, the ip field is looked up in memcached and, if the IP exists in the cache, the threat_src field is populated accordingly. So, even if Beats uses an out-of-the-box data model, we could still route the data through a Logstash instance to enrich it and add more value to our ingestion architecture, ending up with the following type of architecture:

[Figure: ingestion architecture with Beats data enriched by a Logstash threat intelligence lookup before being indexed into Elasticsearch]
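
To make that architecture concrete, a minimal Logstash pipeline sitting between Beats and Elasticsearch could combine a beats input, the memcached lookup shown earlier, and an elasticsearch output. This is only a sketch under assumptions: the Beats port, the memcached host, the Elasticsearch address, and the index name are all placeholders:

input {
  # Receive events shipped by Beats (Filebeat, Packetbeat, and so on)
  beats {
    port => 5044
  }
}
filter {
  # Same lookup as before: tag the event if the IP is a known threat
  memcached {
    hosts => ["localhost:11211"]
    get => {
      "%{ip}" => "threat_src"
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "threat-hunting-%{+YYYY.MM.dd}"
  }
}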