In this chapter, you will look at an open source security tool called Scorecard. Scorecard provides security metrics for the projects you are interested in. These metrics give you visibility into the security concerns you need to be aware of for those projects.
You will learn how to create GitHub tokens using your GitHub account. The tokens are needed by the tool to extract public GitHub repository information. You will walk through the steps of installing and using the tool. To understand the tool better, you will look at the high-level flow of how the tool works and also at how it uses the GitHub API.
One of the key takeaways of this chapter is how to use the GitHub API and the information that can be extracted from repositories hosted on GitHub. You will learn how to use GraphQL to query repository data from GitHub using an open source library.
Source Code
The source code for this chapter is available from the https://github.com/Apress/Software-Development-Go repository.
What Is Scorecard?
Scorecard is an open source project that analyzes your project’s dependencies and rates them. The tool performs several checks, which can be configured depending on your needs. Each check is associated with software security and is assigned a score from 0 to 10. The tool shows whether the dependencies in your project are safe and also covers other security-related areas, such as your GitHub configuration and license checking.
In the next section, you will look at setting up the GitHub token key so that you can use it to scan the GitHub repository of your choice.
Setting Up Scorecard
1. Go to your GitHub account page (in my case, https://github.com/nanikjava), click the icon at the top right, as shown in Figure 8-3, and choose the Settings menu to open the profile page.
2. Once you are on the Profile page, shown in Figure 8-4, click Developer settings.
3. You will be brought to the apps page, as shown in Figure 8-5. Click the Personal access tokens link.
4. Once you are on the tokens page, shown in Figure 8-6, click Generate new token.
5. You will see the new personal token page, shown in Figure 8-7. Fill in the Note textbox with a description of what the token is used for and set the expiration to whatever you want. Finally, in the Select scopes section, select the repo tickbox; this automatically selects the rest of the repo permissions that fall under it. Once done, scroll down and click the Generate token button.
6. Once the token has been generated, you will see a screen like Figure 8-8 showing the new token. Copy the token and paste it somewhere in your editor so you can use it in the next section.
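As a minimal sketch of how a program might pick up the token you just generated, the following Go code reads it from an environment variable and attaches it to a GitHub API request. The variable name GITHUB_AUTH_TOKEN is the one the Scorecard documentation describes; the helper newGitHubRequest and the target URL are hypothetical and are used only for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// newGitHubRequest builds a GET request for the given URL and, when a token
// is supplied, attaches it as a bearer token in the Authorization header.
// The function name is hypothetical; it is not part of Scorecard.
func newGitHubRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", "application/vnd.github+json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	// Read the token from the environment so it never appears in source code.
	token := os.Getenv("GITHUB_AUTH_TOKEN")
	if token == "" {
		fmt.Println("GITHUB_AUTH_TOKEN is not set; requests would be unauthenticated")
	}
	req, err := newGitHubRequest("https://api.github.com/repos/golang/go", token)
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request:", err)
		return
	}
	// Report only whether a token is attached, never the token itself.
	fmt.Println("request URL:", req.URL)
	fmt.Println("authenticated:", req.Header.Get("Authorization") != "")
}
```

Keeping the token in an environment variable rather than in a file under version control reduces the chance of leaking it by accident.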
In the next section, you will use the token you generated to build and run Scorecard.
Running Scorecard
Download the tool from the project GitHub repository. For this chapter, you’ll use v4.4.0; the binary can be downloaded from https://github.com/ossf/scorecard/releases/tag/v4.4.0. Once you download the archive file, unzip it to a directory on your local machine.
You have successfully run the tool to scan a GitHub repository and received an output with a high score of 8.0. A higher score indicates that the repository follows more of the practices that the tool's predefined checks look for.
In the next section, you will further explore the tool to understand how it works and go through code for different parts of the tool.
High-Level Flow
One thing you can learn from the tool is how to use the GitHub API. The tool uses the GitHub API extensively: it downloads information about the repository and then evaluates that information against the predefined security checks. You are now going to take a look at how to use the GitHub API to do some GitHub exploration.
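The flow described above (collect repository information, run each predefined check against it, then report scores) can be sketched roughly as follows. Everything in this sketch is hypothetical: the RepoInfo type, the two checks, and the weights are illustrative assumptions, not Scorecard's actual types, checks, or scoring rules.

```go
package main

import "fmt"

// RepoInfo is a hypothetical snapshot of repository data that a tool
// would download through the GitHub API before running any checks.
type RepoInfo struct {
	HasLicense      bool
	BranchProtected bool
}

// Check is a hypothetical security check: it inspects the repository
// data and returns a score from 0 (worst) to 10 (best).
type Check struct {
	Name   string
	Weight float64 // illustrative weight, not Scorecard's real weighting
	Run    func(RepoInfo) int
}

// boolScore maps a pass/fail condition onto the 0 to 10 scale.
func boolScore(ok bool) int {
	if ok {
		return 10
	}
	return 0
}

// aggregate runs every check and returns the weighted average score.
func aggregate(info RepoInfo, checks []Check) float64 {
	var total, weights float64
	for _, c := range checks {
		total += c.Weight * float64(c.Run(info))
		weights += c.Weight
	}
	if weights == 0 {
		return 0
	}
	return total / weights
}

func main() {
	checks := []Check{
		{Name: "License", Weight: 2.5, Run: func(r RepoInfo) int { return boolScore(r.HasLicense) }},
		{Name: "Branch-Protection", Weight: 7.5, Run: func(r RepoInfo) int { return boolScore(r.BranchProtected) }},
	}
	// In a real tool this data would come from the GitHub API.
	info := RepoInfo{HasLicense: true, BranchProtected: false}
	for _, c := range checks {
		fmt.Printf("%-18s %d/10\n", c.Name, c.Run(info))
	}
	fmt.Printf("aggregate score: %.1f\n", aggregate(info, checks))
}
```

Separating data collection (RepoInfo) from evaluation (Check) means each check can be tested against canned data without calling GitHub.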
GitHub
Anyone who works with software knows about GitHub and has used it one way or another. You can find most kinds of open source software on GitHub, where it is hosted free of charge. It has become the go-to destination for anyone who dabbles in software.
GitHub provides an API that allows external tools to interact with its services. The API lets developers build tools on top of GitHub that provide value for their organizations, and it has led to a proliferation of third-party solutions (free and paid) that are available to the general public. The Scorecard project in this chapter is one of the tools made possible by the GitHub API.
GitHub API
There are two kinds of GitHub APIs: REST (https://docs.github.com/en/rest) and GraphQL (https://docs.github.com/en/graphql). Different open source projects implement clients for each API, and you will look at them a bit later.
The response is in JSON format. The information in the response is the same information you see when you visit the Golang project page at https://github.com/golang. The GitHub documentation at https://docs.github.com/en/rest provides a complete list of the available REST endpoints.
The application starts by initializing the library: it calls github.NewClient(..) and passes in an http.Client, which is used to make the HTTP calls to GitHub. The library package github.com/google/go-github/v38/github provides all the functions required. In the example, you use Repositories.Get(..) to obtain information about the go repository, which is owned by the golang organization.
You get the same response using https://api.github.com/repos/golang/go in your browser.
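The same request can also be made without a client library. The sketch below swaps in Go's standard library for the go-github package, purely to show what happens on the wire: it fetches https://api.github.com/repos/golang/go and decodes a few fields of the JSON response. The Repo struct is a hand-picked subset of the response, and the helper names repoURL and parseRepo are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// Repo holds a few of the fields GitHub returns for a repository.
// The JSON names follow the REST response; the selection is arbitrary.
type Repo struct {
	FullName  string `json:"full_name"`
	Stars     int    `json:"stargazers_count"`
	Forks     int    `json:"forks_count"`
	CreatedAt string `json:"created_at"`
}

// repoURL builds the REST endpoint for one repository.
func repoURL(owner, name string) string {
	return "https://api.github.com/repos/" + owner + "/" + name
}

// parseRepo decodes the JSON body of a repository response.
func parseRepo(body []byte) (Repo, error) {
	var r Repo
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	client := &http.Client{Timeout: 10 * time.Second}
	resp, err := client.Get(repoURL("golang", "go"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		fmt.Fprintln(os.Stderr, "unexpected status:", resp.Status)
		return
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "reading response:", err)
		return
	}
	repo, err := parseRepo(body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decoding response:", err)
		return
	}
	fmt.Printf("%s created %s: %d stars, %d forks\n",
		repo.FullName, repo.CreatedAt, repo.Stars, repo.Forks)
}
```

Unauthenticated requests are rate limited by GitHub, so a real tool would normally attach a token to each request.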
The other API that is provided by GitHub is called the GraphQL API (https://docs.github.com/en/graphql) and it is very different from the REST API. It is based on GraphQL (https://graphql.org/), which the website describes as follows:
GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.
Normally, when you use a REST API to get different kinds of data, you need to call a different endpoint for each kind. Once all of the data is collected, you need to combine it into one structure yourself. GraphQL makes this simpler: you define which repository data you want, and the API returns everything you requested in a single response.
The output shows information obtained from the http://github.com/golang/go repository: the first 10 issues, the first 10 commit comments, and the first 10 labels. As you will see when you walk through the code, this kind of information is easy to retrieve with the GraphQL API.
createdAt
forkCount
labels (the first 10 labels)
issues (the first 10 issues)
commitComments (the first 10 comments)
The struct definition uses data types that are defined in the library (e.g., githubv4.String, githubv4.Int, etc.).
The code initializes the graphqlData struct, which the library populates with the information received from GitHub. It then makes the call to GitHub using the graphClient.Query(..) function, passing in the newly created struct and the defined variables. The variables in vars contain the values that are passed to GitHub as the parameters of the GraphQL query.
Once the .Query(..) function returns successfully, you can use the returned data populated inside the data variable and print it out to the console.
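To make the request itself more concrete, here is a minimal sketch that sends an equivalent query with only the standard library instead of the githubv4 package. It is a swapped-in technique for illustration, not the chapter's implementation. The helper names buildBody and parseResponse are hypothetical, and reading the token from GITHUB_AUTH_TOKEN is an assumption; the GraphQL API requires an authenticated request.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// repoQuery requests the same fields listed earlier in this section.
const repoQuery = `query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    createdAt
    forkCount
    labels(first: 10) { nodes { name } }
    issues(first: 10) { nodes { title } }
    commitComments(first: 10) { nodes { body } }
  }
}`

// graphQLResponse mirrors the shape of repoQuery.
type graphQLResponse struct {
	Data struct {
		Repository struct {
			CreatedAt string `json:"createdAt"`
			ForkCount int    `json:"forkCount"`
			Labels    struct {
				Nodes []struct {
					Name string `json:"name"`
				} `json:"nodes"`
			} `json:"labels"`
			Issues struct {
				Nodes []struct {
					Title string `json:"title"`
				} `json:"nodes"`
			} `json:"issues"`
			CommitComments struct {
				Nodes []struct {
					Body string `json:"body"`
				} `json:"nodes"`
			} `json:"commitComments"`
		} `json:"repository"`
	} `json:"data"`
	Errors []struct {
		Message string `json:"message"`
	} `json:"errors"`
}

// buildBody wraps the query and its variables in the JSON envelope
// that the GraphQL endpoint expects.
func buildBody(owner, name string) ([]byte, error) {
	return json.Marshal(map[string]interface{}{
		"query":     repoQuery,
		"variables": map[string]string{"owner": owner, "name": name},
	})
}

// parseResponse decodes a GraphQL response body.
func parseResponse(body []byte) (graphQLResponse, error) {
	var out graphQLResponse
	err := json.Unmarshal(body, &out)
	return out, err
}

// run sends the query and prints a short summary of the result.
func run(token string) error {
	payload, err := buildBody("golang", "go")
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPost, "https://api.github.com/graphql", bytes.NewReader(payload))
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	req.Header.Set("Content-Type", "application/json")
	resp, err := (&http.Client{Timeout: 10 * time.Second}).Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	raw, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	out, err := parseResponse(raw)
	if err != nil {
		return err
	}
	if len(out.Errors) > 0 {
		return fmt.Errorf("GraphQL error: %s", out.Errors[0].Message)
	}
	repo := out.Data.Repository
	fmt.Println("created:", repo.CreatedAt, "forks:", repo.ForkCount)
	fmt.Println("labels:", len(repo.Labels.Nodes), "issues:", len(repo.Issues.Nodes),
		"commit comments:", len(repo.CommitComments.Nodes))
	return nil
}

func main() {
	token := os.Getenv("GITHUB_AUTH_TOKEN") // assumed variable name
	if token == "" {
		fmt.Fprintln(os.Stderr, "GITHUB_AUTH_TOKEN is not set; skipping the request")
		return
	}
	if err := run(token); err != nil {
		fmt.Fprintln(os.Stderr, "query failed:", err)
	}
}
```

Note that a single POST returns the repository metadata, labels, issues, and commit comments together, which would take several separate calls with the REST API.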
In the next section, you will look at how to use GitHub Explorer to work with GraphQL.
GitHub Explorer
For more reading on the different data that can be extracted using GraphQL, refer to the queries documentation at https://docs.github.com/en/graphql/reference/queries.
Summary
In this chapter, you looked at an open source project called Scorecard that provides security metrics for projects hosted on GitHub. The project measures the security of a project on a scale of 0-10 and this can also be used for projects stored locally. The major benefit of the tool is the public availability of data for projects that have been scanned by the tool. This data is useful for developers because it gives them information and insights on the security metrics of projects they are planning to use.
You also looked at how the tool works and learned how to use the GitHub API to extract repository information to perform predefined security checks.
You learned in detail about the two kinds of GitHub APIs, REST and GraphQL. You looked at sample code to understand how to use each of these APIs to extract information from a GitHub repository. Finally, you explored the GitHub Explorer to understand how to construct GraphQL queries for performing query operations on GitHub.