Library Inventory

We’ll start by taking an inventory of the third-party dependencies in the software your organization builds. An accurate inventory is the foundation of a worthwhile patching process. You can’t patch it if you don’t know you’re using it.

One day in the future, and I can’t tell you when, you’re going to come to work and find out that there’s a terrible vulnerability in some widely used piece of software. We’ve seen this happen many times in the past, and even though we don’t know which software or when, we know it will happen again. How will you respond when this happens?

If you wait for the announcement to inventory your third-party software, you’ll have to learn as you go. All the while, customers will flood your support channels and you’ll race to find and patch impacted systems. This is error-prone and stressful.

On the other hand, if you have an accurate inventory of the software you use (or at least a well-established process for finding it), you’ll be able to start your response much sooner. If you’re lucky, you’ll know right away that you’re not impacted. If you’re less lucky, you’ll be able to jump into remediation right away rather than wasting valuable time figuring out that you’re impacted.

If you’re a developer, you’ll know how library dependencies are managed in your organization. If you’re not a developer, you’ll need to work with the developers in your organization to carry this out. The specifics of how to do this vary significantly based on what languages and build tools you use, but the idea is the same regardless. We need to do the following:

  1. Find the supported versions of your code using source control.
  2. Find the direct library dependencies your code has.
  3. Find the transitive library dependencies your code has—that is, find the dependencies of your dependencies.

Getting your third-party libraries under control is likely to be a lengthy process. You may need to address it in stages. You may also have to put in a fair amount of effort to find all of the codebases your organization has. It’s not unusual for people to forget small projects from the past that haven’t needed active development for a while. Don’t be surprised if it takes a few iterations to even get a full list of all of the applications developed by your organization.

A progression like the following isn’t unusual:

  1. Find all the codebases.
  2. For each codebase, manually put together a list of the direct and transitive dependencies.
  3. Find another codebase you overlooked and manually put together a list of its direct and transitive dependencies. Now you’ve found all the codebases.
  4. Write a script to automate dependency detection.
  5. Find another codebase that no one told you about.
  6. Take the automated script and integrate it into each project’s continuous integration system.
  7. Find another codebase. No, really. This is all of them this time.
  8. Document the script and train the build team in its use.
  9. The next time a project is started, have the build team add the dependency-finding script to the build process.
  10. Find one more codebase.
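The “write a script” step in this progression could start as small as the following sketch. It walks a directory where you’ve checked out your codebases and lists the dependency manifests it finds. The set of manifest names, the skipped directories, and the function name find_manifests are assumptions made for illustration; adjust them to the languages and build tools you actually use.

```python
import os

# Manifest file names are an assumption for illustration; extend the set
# for the languages and build tools your organization actually uses.
MANIFEST_NAMES = {"requirements.txt", "package.json", "pom.xml"}

# Directories that hold installed packages or version control data
# rather than your own code.
SKIP_DIRS = {"node_modules", ".git", "site-packages"}


def find_manifests(root):
    """Return sorted paths of dependency manifests found under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if name in MANIFEST_NAMES:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

A script like this won’t find the codebase nobody told you about, but it does make it cheap to rerun the inventory each time one turns up.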

Source Control

You’ll have to get comfortable with your source control system. At the very least, you’ll need to be able to look at the latest production version of your code. If you support multiple, older versions of your code, you’ll need to be able to look at each of the older supported versions as well.

The specifics of how to look at the relevant version of your code via source control are beyond the scope of this book. If you need more information on how to work with your source control system, excellent guides are available for CVS, Subversion, and Git:

  • Pragmatic Guide to Subversion[12]
  • Pragmatic Guide to Git[13]
  • Pragmatic Version Control Using Git[14]
  • Pragmatic Version Control Using CVS[15]

We’ve discussed how dependency management works in theory. How would this look in practice? It depends on which programming languages you’re using. Let’s take a look at how dependency management looks in two popular programming languages: Python and JavaScript.

Python

Python has multiple ways of managing third-party dependencies. We’re going to take a look at managing dependencies using requirements.txt. A requirements.txt file may spell out transitive dependencies, but it doesn’t have to. So we’ll start with a discussion of requirements.txt and what it can tell us. We’ll finish up with two approaches we can use if we don’t have all our transitive dependencies spelled out for us.

Finding Dependencies in requirements.txt

The requirements.txt file has a couple of advantages from our point of view. First, it’s very straightforward to read. It’s a simple text file with one dependency per line. A second advantage is that a requirements.txt file can be generated automatically from a Python environment. When a requirements.txt file is generated this way, it specifies exact version numbers and includes all the transitive dependencies. This combination is exactly what we’re looking for when we’re hunting for dependencies with vulnerabilities.

An autogenerated requirements.txt file might look like this:

 certifi==2017.11.5
 chardet==3.0.4
 idna==2.6
 pipdeptree==0.10.1
 requests==2.18.4
 urllib3==1.22

Every library has an exact version number, and the transitive dependencies have all been pulled in. Perfect!

The wrinkle with requirements.txt files is that they don’t have to be generated this way. They can be written by hand, and they can specify ranges of version numbers, not just exact version numbers. So a requirements.txt could specify a dependency on a library with version >= 1.2.3. Installing with a requirements.txt file like this installs the newest available version greater than or equal to 1.2.3. At the time of deployment, that might have meant version 1.2.4. At the time of an investigation, it could mean 1.2.8. What are the differences between 1.2.4 and 1.2.8? Who knows? 1.2.8 could have fixed old vulnerabilities, introduced new vulnerabilities, or both.

A hand-edited requirements.txt file might look like this:

 certifi==2017.11.5
 chardet==3.0.4
 idna==2.6
 pipdeptree==0.10.1
 requests>=2
 urllib3==1.22

Note the version specified for the requests library. In this case, we don’t know what version of the requests library is installed on any given install of our program. It would depend on the latest version available at deploy time.
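A quick way to spot entries like this is to flag every line that doesn’t pin an exact version. The following is a minimal sketch under simplifying assumptions: it treats only name==version lines as pinned, skips pip option lines such as -r, and ignores environment markers. The function name unpinned_requirements is made up for this example, and the sketch is not a full parser for the requirements file format.

```python
import re

# Matches "name==version" with an optional extras group and an exact
# version (no wildcard, no second version clause).
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*==\s*[^\s*,;]+$")


def unpinned_requirements(text):
    """Return requirement lines that do not pin an exact version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if not PINNED.match(requirement):
            loose.append(requirement)
    return loose
```

Run against the hand-edited file above, this sketch would report only the requests>=2 line. Every line it reports is a version you’ll have to confirm on a deployed instance.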

Finding Dependencies in an Installed Instance

If we investigate a Python project that doesn’t give us exact version numbers for each dependency, we don’t have a way to find out the version numbers that are used in practice just by looking at the files checked into source control. We’ll have to look at a deployed instance instead. The specifics of this will depend on the deployment environment and the install process used.

One option for investigating the deployed libraries is to use pip. If pip is installed, running the command pip freeze will generate output like this:

 certifi==2017.11.5
 chardet==3.0.4
 idna==2.6
 pipdeptree==0.10.1
 requests==2.18.4
 urllib3==1.22

This may look familiar. It’s the same as the ideal requirements.txt file we looked at in the previous section. Just be sure to use the pip executable that corresponds to the Python executable actually used in production.
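One way to make sure pip matches the interpreter is to invoke it as a module of that interpreter (python -m pip freeze) rather than relying on whichever pip comes first on the path. Here is a small sketch of that idea; the function names parse_freeze and freeze are our own, and the default of sys.executable is only a stand-in for the path to your production interpreter.

```python
import subprocess
import sys


def parse_freeze(output):
    """Turn 'name==version' lines from pip freeze into a dict."""
    versions = {}
    for line in output.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # ignore lines that are not plain name==version entries
            versions[name.lower()] = version
    return versions


def freeze(python=sys.executable):
    """Run pip freeze under a specific interpreter and parse the result."""
    result = subprocess.run(
        [python, "-m", "pip", "freeze"],
        capture_output=True, text=True, check=True,
    )
    return parse_freeze(result.stdout)
```

Calling freeze with the path to the production interpreter returns a dictionary you can compare against a vulnerability announcement, or against the requirements.txt in source control.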

A second option for investigating deployed libraries is to look into the site-packages directory of the Python install that’s used in production. There will be an entry for each dependency, both direct and transitive, usually with a metadata directory alongside it (with a name like requests-2.18.4.dist-info) that records the installed version. As with using pip, it’s important to find the Python install that’s used in practice.
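If you’d rather not read the directory by hand, Python’s standard library can read the same installed-package metadata for you. This sketch uses importlib.metadata, which is available in Python 3.8 and later; the function name installed_versions is our own. It reports on whichever interpreter runs it, so it has to be run with the production Python install.

```python
from importlib import metadata


def installed_versions():
    """Map each installed distribution name to its version."""
    versions = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with missing metadata
            versions[name.lower()] = dist.version
    return versions
```

The result covers everything visible on that interpreter’s path, direct and transitive dependencies alike, because by install time they are all just installed packages.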

JavaScript

There are many different ways to track JavaScript dependencies. We’ll cover npm because it’s one of the most popular package managers. As was the case with Python dependency management, you’ll need to talk to your developers to find out how they’re managing dependencies if you’re not a JavaScript developer yourself.

The package.json file is npm’s configuration file. It specifies dependencies in addition to many other facets of a package. It’s similar to a hand-written requirements.txt in that it lists direct dependencies but does not list transitive dependencies. Those aren’t listed in your package’s package.json; npm calculates them by looking at the package.json for each package listed in your package.json. So, as was the case with Python, you’ll need to install your software in order to find them. Your best bet will be to work with your developers to install your package and then use npm list to show a tree of the dependencies. You can expect output like the following from npm list:

[Figure: output of npm list, showing the dependency tree]

We can see from this example that npm shows the tree structure of the transitive dependencies.
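For an inventory, a flat list is often easier to work with than a tree. npm can emit the same tree as JSON (npm ls --json, with --all on newer npm versions to include the deeper levels). The sketch below flattens that tree into name and version pairs. It assumes each node keeps its children under a "dependencies" key and its version under "version"; the function name flatten_npm_tree is made up for this example.

```python
import json


def flatten_npm_tree(node, found=None):
    """Collect (name, version) pairs from every level of an npm ls --json tree."""
    if found is None:
        found = set()
    for name, child in node.get("dependencies", {}).items():
        version = child.get("version")
        if version:  # entries for missing packages may carry no version
            found.add((name, version))
        flatten_npm_tree(child, found)
    return found


def load_npm_tree(path):
    """Read a file produced by: npm ls --json > tree.json"""
    with open(path, encoding="utf-8") as handle:
        return flatten_npm_tree(json.load(handle))
```

Because the result is a set, a package that appears at several places in the tree at the same version is listed once, while two different versions of the same package show up as two entries.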

That’s useful, but we still need to go find which of those libraries have known vulnerabilities. Fortunately, npm gives us a way to do that—npm audit. Running npm audit produces output like this:

  === npm audit security report ===
 
 # Run npm install [email protected] to resolve 2 vulnerabilities
 SEMVER WARNING: Recommended action is a potentially breaking change
 
 │ Moderate │ No Charset in Content-Type Header │
 │ Package │ express │
 │ Dependency of │ express │
 │ Path │ express │
 │ More info │ https://nodesecurity.io/advisories/8 │
 
 
 │ Low │ methodOverride Middleware Reflected Cross-Site Scripting │
 │ Package │ connect │
 │ Dependency of │ express │
 │ Path │ express > connect │
 │ More info │ https://nodesecurity.io/advisories/3 │
 
 
 found 2 vulnerabilities (1 low, 1 moderate) in 4 scanned packages
  2 vulnerabilities require semver-major dependency updates.

Pretty nice. It runs almost instantly, and it shows each vulnerability along with helpful context like a description of the vulnerability, the severity of the vulnerability, and a URL for more information.
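If you want a build to act on these results, npm audit --json produces a machine-readable report. The following sketch decides whether to fail a build based on the severity counts. It assumes the parsed report carries per-severity counts under metadata.vulnerabilities, which is our reading of the JSON layout rather than something to rely on without checking your npm version. The function name should_fail_build and the default threshold are hypothetical.

```python
# Severity names ordered from least to most serious.
SEVERITIES = ["info", "low", "moderate", "high", "critical"]


def should_fail_build(report, threshold="moderate"):
    """Return True if the report counts any vulnerability at or above threshold.

    Assumes report["metadata"]["vulnerabilities"] maps severity names to
    counts; verify this against the output of your npm version.
    """
    counts = report.get("metadata", {}).get("vulnerabilities", {})
    blocking = SEVERITIES[SEVERITIES.index(threshold):]
    return any(counts.get(level, 0) > 0 for level in blocking)
```

With counts matching the report above (one low, one moderate), a threshold of "moderate" would fail the build and a threshold of "high" would let it through. Where to set the threshold is a policy decision for your organization.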

Now that we’ve seen the basics of manually scanning for third-party library vulnerabilities, let’s take a quick look at a couple of language-agnostic tools that can help automate this work.

OWASP Dependency-Check

OWASP has a free, open source tool called Dependency-Check that can help automate the detection of vulnerable third-party libraries.[16] This tool supports Java and .NET, with experimental support for Ruby, Node.js, Python, and C/C++ codebases. One of the nice features of this tool is that it can parse project files that you probably already use for managing your builds, such as pom.xml files in Java codebases and .nuspec files in .NET codebases. So it leverages the work you have already done in order to figure out your dependencies: you do not have to map out your dependencies specifically for the tool. Once it has parsed out the dependencies, it queries the CVE database (which we discuss in What Is a CVE?) to see if any of the libraries you use have published vulnerabilities. This tool is meant to run during your build process. That way, you can fail builds that use vulnerable libraries and stop vulnerable libraries from even making it into your test environments.

Detecting Vulnerable Libraries in Your Source Repository

There are also commercial solutions that integrate more closely with your source control and build artifact repositories. For some organizations, this may be an easier point at which to automate library vulnerability detection. Two examples are JFrog’s Xray[17] and GitLab’s Auto Dependency Scanning.[18] These tools work similarly: during your build process, they look for vulnerabilities in the libraries you depend on. If they find any, they can fail your build, so you never get the opportunity to ship with vulnerable libraries.

The important thing is not really which tool you pick, but that somewhere in your development process you have a step that catches use of vulnerable libraries. Run this step automatically if you can, manually if you must.
