© The Author(s), under exclusive license to APress Media, LLC, part of Springer Nature 2023
N. Tolaram, Software Development with Go, https://doi.org/10.1007/978-1-4842-8731-6_7

7. Gosec and AST

Nanik Tolaram
Sydney, NSW, Australia

In this chapter, you will look at the abstract syntax tree (AST) and learn what it is and why it is useful. You will work through different examples to understand how Go source code is converted to an AST. You will also learn about an open source security code analysis tool called gosec. This tool uses the AST to perform static code analysis, and you will see how the tool does this.

Source Code

The source code for this chapter is available from the https://github.com/Apress/Software-Development-Go repository.

Abstract Syntax Tree

An abstract syntax tree (also known as a syntax tree) is a tree representation of the structure of source code written in a programming language. When you compile Go code, the compiler first converts the source code into an internal data structure representing the code. The compiler uses this data structure as an intermediate representation, which goes through several stages before machine code is produced. Figure 7-1 shows at a high level the steps that the compiler performs when compiling code.

A flow diagram represents stages of source code including scanning, parsing, generating AST, and generating machine code.

Figure 7-1

Stages of compiling source code
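
The scanning stage in Figure 7-1 can be observed with the go/scanner package from the standard library. The following is a minimal sketch and not what the compiler itself runs; the tokenize helper and the input expression are made up for illustration.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
	"strings"
)

// tokenize runs the scanner over src and returns one entry per token,
// formatted as the token name followed by its literal text (if any).
// tokenize is a hypothetical helper written for this example.
func tokenize(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, []byte(src), nil, 0) // no error handler, default mode

	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			break
		}
		out = append(out, strings.TrimSpace(tok.String()+" "+lit))
	}
	return out
}

func main() {
	for _, t := range tokenize("x := 1 + 2") {
		fmt.Println(t)
	}
}
```

Each printed line is one token. The parser consumes this stream of tokens to build the tree discussed in the rest of the chapter.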

Let’s take a quick peek at what AST looks like in comparison to the original code. Figure 7-2 shows the comparison between the original Go code and when it is converted into AST during the compilation process.

A screenshot of the conversion of the original code representing package main including the main function and print convert with the help of an Abstract syntax tree to the final code.

Figure 7-2

Original code vs. AST

To the normal eye, the AST looks like a bunch of text, but for the compiler it is very helpful because the data structure allows it to go through different parts of the code to check for errors, warnings, and many other things.

Go provides built-in packages in its standard library that make it easy for applications to convert source code into an AST, and these packages are used by tools like golangci-lint (github.com/golangci/golangci-lint) for reading and linting Go source code.

What does the AST data structure look like? Figure 7-3 shows a brief view of the AST structure.

A classification chart of Ast dot node into 3 categories including ast dot declaration, import spec, and identification. ast dot declaration is further classified into Gen Decl and Fun Decl.

Figure 7-3

AST structure

Table 7-1 briefly explains the different structures.
Table 7-1

Different Structures

ast.Node

The main interface that all AST node types implement

ast.FuncDecl

The structure representing a function declaration, such as

func myfunc(){

}

ast.GenDecl

The structure representing a generic declaration, such as

var x = "a string"

ast.ImportSpec

The structure representing an import declaration, such as

import "go/token"
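
To make Table 7-1 concrete, the following sketch parses a small file and reports which structure each top-level declaration maps to. The source text and the describeDecls helper are assumptions made for this illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src reuses the snippets from Table 7-1 in a single file.
const src = `package demo

import "go/token"

var x = "a string"

func myfunc() {}
`

// describeDecls (a hypothetical helper) names the AST structure used
// for every top-level declaration in source.
func describeDecls(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, d := range f.Decls {
		switch decl := d.(type) {
		case *ast.FuncDecl:
			out = append(out, "FuncDecl "+decl.Name.Name)
		case *ast.GenDecl:
			// A GenDecl groups import, const, type, or var specs.
			kind := "GenDecl(" + decl.Tok.String() + ")"
			for _, spec := range decl.Specs {
				switch s := spec.(type) {
				case *ast.ImportSpec:
					out = append(out, kind+" ImportSpec "+s.Path.Value)
				case *ast.ValueSpec:
					for _, name := range s.Names {
						out = append(out, kind+" ValueSpec "+name.Name)
					}
				}
			}
		}
	}
	return out, nil
}

func main() {
	decls, err := describeDecls(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, d := range decls {
		fmt.Println(d)
	}
}
```

Note that an import is itself wrapped in an ast.GenDecl; the ast.ImportSpec is one of the specs inside it.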

There are many real world use cases that benefit from using AST:
  • Code generators: This kind of application requires the use of AST to generate source code.

  • Static code analyzer: Tools such as gosec, which will be discussed in this chapter, use AST extensively to read source code and identify security issues.

  • Code coverage: Coverage tools use the AST to instrument source code and measure how much of an application is exercised by its tests.

Packages

The packages that you will be using in this chapter are go/parser and go/ast. Their documentation can be found at https://pkg.go.dev/go/parser and https://pkg.go.dev/go/ast, respectively. Each package provides different functions, as explained here:
  • go/parser: This package provides parsing capability for Go source files. The input can be provided as a string or as a filename. The result of the parsing is an AST structure of the source file.

  • go/ast: Parsing a source file returns a value of type *ast.File, which is declared in this package. The package provides the AST data structure that the application works with and allows applications to traverse the different AST structures of the source files.
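
As a minimal sketch of how the two packages fit together, the following code parses source text held in a string and reads two fields from the resulting *ast.File. The summarize helper and the file name example.go are hypothetical; the file name is only used when reporting positions.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// summarize parses source and returns its package name and the number
// of top-level declarations. It is a hypothetical helper for this example.
func summarize(source string) (string, int, error) {
	fset := token.NewFileSet()
	// Because source is passed as the third argument, nothing is read from disk.
	f, err := parser.ParseFile(fset, "example.go", source, 0)
	if err != nil {
		return "", 0, err
	}
	return f.Name.Name, len(f.Decls), nil
}

func main() {
	pkg, n, err := summarize("package p\nconst c = 1.0\nvar X = f(3.14)*2 + c\n")
	fmt.Println(pkg, n, err)

	// Invalid source produces an error from the parser.
	_, _, err = summarize("package p\nfunc {")
	fmt.Println("syntax error reported:", err != nil)
}
```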

In the next section, it will be clearer how the AST works when you look at different examples.

Sample Code

You will explore different samples in this section using the Go AST packages. The examples will give you a good idea of how to use these packages and what can be done with the AST results.

Inspecting

Run the code inside the chapter7/samplecode/inspecting folder as follows:
go run main.go
You will get the following output:
2:9:    id: p
3:7:    id: c
3:11:   bl: 1.0
4:5:    id: X
4:9:    id: f
4:11:   bl: 3.14
4:17:   bl: 2
4:21:   id: c

The code builds an AST data structure for the source code passed to the parser and prints the identifiers (id) and basic literals (bl) it finds, each prefixed with its line and column position. Let’s go through the sample code to understand what each part of the code does.

The code declares a variable named src that contains the source code. It’s simple Go code containing const and var declarations. Successfully parsing the source code returns a value of type *ast.File, which contains the AST data structure that the code will traverse.
package main
import (
  ...
)
func main() {
  src := `
package p
const c = 1.0
var X = f(3.14)*2 + c
`
  fset := token.NewFileSet()
  f, err := parser.ParseFile(fset, "", src, 0)
  ...
}
The ast.File type is declared inside the go/ast package as follows:
type File struct {
  Doc        *CommentGroup
  Package    token.Pos
  Name       *Ident
  Decls      []Decl
  Scope      *Scope
  Imports    []*ImportSpec
  Unresolved []*Ident
  Comments   []*CommentGroup
}
The code then uses the ast.Inspect(..) function, which traverses the AST data structure and calls the supplied function for each node. The simple function passed as a parameter to ast.Inspect(..) checks what kind of ast.Node it receives, selecting only *ast.BasicLit and *ast.Ident. Returning true tells ast.Inspect(..) to continue into the node's children. The ast.Node referred to here is the same one shown in Figure 7-3.
package main
import (
  ...
)
func main() {
 ...
  ast.Inspect(f, func(n ast.Node) bool {
     var s string
     switch x := n.(type) {
     case *ast.BasicLit:
        s = "bl: " + x.Value
     case *ast.Ident:
        s = "id: " + x.Name
     }
     if s != "" {
        fmt.Printf("%s:\t%s\n", fset.Position(n.Pos()), s)
     }
     return true
  })
}
The ast.Inspect(..) function is the primary function provided by the go/ast package for traversing an AST in Go. Table 7-2 explains ast.BasicLit and ast.Ident.
Table 7-2

ast.BasicLit and ast.Ident

ast.BasicLit

Represents a literal of a basic type (integer, floating-point, imaginary, character, or string), such as the value assigned to a declared variable or constant

ast.Ident

Represents an identifier. This is defined clearly in the Go language specification (https://go.dev/ref/spec#Identifiers)
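
Each ast.BasicLit also records which kind of literal it is in its Kind field. The following sketch extends the inspecting example to print that kind; the literalKinds helper and the extra string variable in the source are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// literalKinds (a hypothetical helper) returns the kind and value of
// every basic literal in source, in the order ast.Inspect visits them.
func literalKinds(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "", source, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(f, func(n ast.Node) bool {
		if lit, ok := n.(*ast.BasicLit); ok {
			// Kind is a token.Token such as token.INT or token.STRING.
			out = append(out, lit.Kind.String()+" "+lit.Value)
		}
		return true // keep descending into child nodes
	})
	return out, nil
}

func main() {
	src := "package p\nconst c = 1.0\nvar X = f(3.14)*2 + c\nvar s = \"hi\"\n"
	kinds, err := literalKinds(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, k := range kinds {
		fmt.Println(k)
	}
}
```

The Value field holds the literal exactly as written in the source, so a string literal keeps its surrounding quotes.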

Parsing a File

The sample code in this section creates an AST data structure from its own main.go and prints the import paths of the imported packages, the names of the functions declared in the code, and the line numbers of the return statements. The code can be found inside the chapter7/samplecode/parsing directory. Run the sample in a terminal as follows:
go run main.go
You will see the following output:
2022/07/02 16:28:05 Imports:
2022/07/02 16:28:05   "fmt"
2022/07/02 16:28:05   "go/ast"
2022/07/02 16:28:05   "go/parser"
2022/07/02 16:28:05   "go/token"
2022/07/02 16:28:05   "log"
2022/07/02 16:28:05 Functions:
2022/07/02 16:28:05   main
2022/07/02 16:28:05 return statement found in line 36:
2022/07/02 16:28:05 return statement found in line 39:
The sample uses the same parser.ParseFile(..) and ast.Inspect(..) functions as shown here:
package main
import (
  ...
)
func main() {
  ...
  f, err := parser.ParseFile(fset, "./main.go", nil, 0)
  ...
  ast.Inspect(f, func(n ast.Node) bool {
     ret, ok := n.(*ast.ReturnStmt)
     if ok {
        ...
     }
     return true
  })
}
The function inside ast.Inspect(..) only prints nodes that are of type ast.ReturnStmt that represent return statements; anything else is ignored. The other functions that it uses to print out import information are shown here:
package main
import (
  ...
)
func main() {
  ...
  f, err := parser.ParseFile(fset, "./main.go", nil, 0)
  ...
  log.Println("Imports:")
  for _, i := range f.Imports {
     log.Println(" ", i.Path.Value)
  }
  ...
}
The value returned from ParseFile is an *ast.File, and one of the fields in that structure is Imports, which contains all the imports declared in the source code. The code loops through the Imports field with range and prints each import path to the console. The code also prints the declared function names, which is done by the following code:
func main() {
  ...
  for _, f := range f.Decls {
    fn, ok := f.(*ast.FuncDecl)
    ...
    log.Println(" ", fn.Name.Name)
  }
}

The Decls field contains all the top-level declarations found in the source code, and the code selects only those of type *ast.FuncDecl, which represent function declarations.
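
The two techniques from this section can be combined. The following sketch, which works on a source string and not on main.go, groups the line numbers of return statements by the function that contains them. The sample source and the returnLines helper are made up for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

const src = `package demo

func add(a, b int) int {
	return a + b
}

func sign(n int) int {
	if n < 0 {
		return -1
	}
	return 1
}
`

// returnLines (a hypothetical helper) maps each function name to the
// line numbers of the return statements found in its body.
func returnLines(source string) (map[string][]int, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	result := make(map[string][]int)
	for _, d := range f.Decls {
		fn, ok := d.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue // not a function, or a declaration without a body
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if ret, ok := n.(*ast.ReturnStmt); ok {
				line := fset.Position(ret.Pos()).Line
				result[fn.Name.Name] = append(result[fn.Name.Name], line)
			}
			return true
		})
	}
	return result, nil
}

func main() {
	lines, err := returnLines(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, name := range []string{"add", "sign"} {
		fmt.Println(name, lines[name])
	}
}
```

Checking the ok value of the type assertion before using fn matters here, because Decls also contains declarations that are not functions.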

You have looked at different AST example code and should now have a better understanding of how to use the AST and what information you can get out of it. In the next section, you will look at how the AST is used in an open source security project.

gosec

The gosec project is an open source tool (https://github.com/securego/gosec) that provides security static code analysis. The tool provides a set of secure code best practices for the Go language, and it scans your source code to check if there is any code that breaks those rules.

Use the following command to install it if you are using Go 1.16 or later:
go install github.com/securego/gosec/v2/cmd/gosec@latest
Once installed, open your terminal and change the directory to chapter7/samplecode and execute the following command:
gosec  ./...
The tool will scan your sample code recursively and print its findings to the console.
[gosec] 2022/07/02 17:00:11 Including rules: default
...
Results:
...
 - G104 (CWE-703): Errors unhandled. (Confidence: HIGH, Severity: LOW)
    22:
  > 23:         ast.Print(fset, f)
    24: }
Summary:
  Gosec  : dev
  Files  : 3
  Lines  : 105
  Nosec  : 0
  Issues : 1

The tool scans through all the .go files inside the directory recursively and, after completing the parsing and scanning process, prints out the final result. In my directory, it found one issue, which is labeled as G104. The tool is able to perform the code analysis by using the go/ast package, similar to the earlier examples.
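
To give a feel for what a check like G104 looks for, here is a rough sketch that reports calls used as standalone statements, so their results are discarded. It is not gosec's implementation: the real rule is more thorough, while this sketch assumes a small hand-written list of functions (riskyCalls) that are known to return an error. The unhandled helper and the sample source are also hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// riskyCalls is an assumed, hand-maintained list of package-qualified
// functions that return an error.
var riskyCalls = map[string]bool{"ast.Print": true, "os.Remove": true}

const src = `package demo

import "os"

func cleanup() {
	os.Remove("a.tmp")
	_ = os.Remove("b.tmp")
	if err := os.Remove("c.tmp"); err != nil {
		panic(err)
	}
}
`

// unhandled returns the lines where a listed function is called as a
// bare statement, which means its error result is dropped.
func unhandled(source string) ([]int, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(f, func(n ast.Node) bool {
		stmt, ok := n.(*ast.ExprStmt)
		if !ok {
			return true
		}
		call, ok := stmt.X.(*ast.CallExpr)
		if !ok {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok {
			return true
		}
		if pkg, ok := sel.X.(*ast.Ident); ok && riskyCalls[pkg.Name+"."+sel.Sel.Name] {
			lines = append(lines, fset.Position(call.Pos()).Line)
		}
		return true
	})
	return lines, nil
}

func main() {
	lines, err := unhandled(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("unhandled error results on lines:", lines)
}
```

Only the first os.Remove call is reported, because the other two assign or inspect the returned value.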

Inside gosec

Figure 7-4 shows at a high level how gosec works.

A flowchart of high-level process represents load configuration, load rules, new analyzers, package parse, process package, report, and loop through the package.

Figure 7-4

Gosec high-level process

The tool loads rules (step 2) that are defined internally. These rules define functions that are called to check the code being processed. This is discussed in detail in the Rules section.

Once the rules have been loaded, it proceeds to process the directory given as parameter and recursively gets all the .go files that are found (step 4). This is performed by the following code (helpers.go):
func PackagePaths(root string, excludes []*regexp.Regexp) ([]string, error) {
  ...
  err := filepath.Walk(root, func(path string, f os.FileInfo, err error) error {
     if filepath.Ext(path) == ".go" {
        path = filepath.Dir(path)
        if isExcluded(filepath.ToSlash(path), excludes) {
           return nil
        }
        paths[path] = true
     }
     return nil
  })
  ...
  result := []string{}
  for path := range paths {
     result = append(result, path)
  }
  return result, nil
}
The PackagePaths(..) function uses the path/filepath package from the Go standard library to traverse the directory tree and collect all the directories that contain .go source files. After successfully collecting all the directory names, gosec calls the Process(..) function (analyzer.go) shown here:
func (gosec *Analyzer) Process(buildTags []string, packagePaths ...string) error {
  ...
  for _, pkgPath := range packagePaths {
     pkgs, err := gosec.load(pkgPath, config)
     if err != nil {
        gosec.AppendError(pkgPath, err)
     }
     for _, pkg := range pkgs {
        if pkg.Name != "" {
           err := gosec.ParseErrors(pkg)
           if err != nil {
              return fmt.Errorf("parsing errors in pkg %q: %w", pkg.Name, err)
           }
           gosec.Check(pkg)
        }
     }
  }
  sortErrors(gosec.errors)
  return nil
}
This function calls the gosec.load(..) function to load the .go source code found inside the directory using the packages package (golang.org/x/tools/go/packages) from another Go module, golang.org/x/tools.
func (gosec *Analyzer) load(pkgPath string, conf *packages.Config) ([]*packages.Package, error) {
  abspath, err := GetPkgAbsPath(pkgPath)
  ...
  conf.BuildFlags = nil
  pkgs, err := packages.Load(conf, packageFiles...)
  if err != nil {
     return []*packages.Package{}, fmt.Errorf("loading files from package %q: %w", pkgPath, err)
  }
  return pkgs, nil
}
The last step, once the packages have been loaded, is to loop through the parsed files of each package and call ast.Walk.
func (gosec *Analyzer) Check(pkg *packages.Package) {
  ...
  for _, file := range pkg.Syntax {
     fp := pkg.Fset.File(file.Pos())
     ...
     checkedFile := fp.Name()
     ...
     gosec.context.PassedValues = make(map[string]interface{})
     ast.Walk(gosec, file)
     ...
  }
}

ast.Walk is called with two parameters: gosec and file. The gosec value acts as the visitor (it implements the ast.Visitor interface) whose Visit(..) method the go/ast package calls for each node, while the file parameter is the root of the AST to traverse.

The gosec receiver implements the Visit(..) function that is called by the go/ast package as each node is visited. The Visit(..) function of the tool can be seen here:
func (gosec *Analyzer) Visit(n ast.Node) ast.Visitor {
  ...
  for _, rule := range gosec.ruleset.RegisteredFor(n) {
     ...
     issue, err := rule.Match(n, gosec.context)
     if err != nil {
        ...
     }
     if issue != nil {
        ...
     }
  }
  return gosec
}

The Visit(..) function applies the rules that were loaded in step 2 by calling each rule's Match(..) function, passing in the ast.Node. Each rule checks whether the ast.Node fulfills the conditions for that particular rule.
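
The visitor and rule dispatch pattern can be sketched with the standard library alone. The code below is a stripped-down illustration, not gosec's code: the rule interface, the analyzer type, the noPanic rule, and the run helper are all hypothetical names, and rules are keyed by node type using reflect.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"reflect"
)

// rule is a hypothetical, minimal stand-in for a rule interface.
type rule interface {
	Match(n ast.Node) string // returns a finding, or "" when the node is fine
}

// analyzer implements ast.Visitor and dispatches nodes to registered rules.
type analyzer struct {
	fset     *token.FileSet
	rules    map[reflect.Type][]rule
	findings []string
}

// register associates r with the concrete types of the given nodes.
func (a *analyzer) register(r rule, nodes ...ast.Node) {
	for _, n := range nodes {
		t := reflect.TypeOf(n)
		a.rules[t] = append(a.rules[t], r)
	}
}

// Visit is called by ast.Walk for every node; nil marks the end of a subtree.
func (a *analyzer) Visit(n ast.Node) ast.Visitor {
	if n == nil {
		return a
	}
	for _, r := range a.rules[reflect.TypeOf(n)] {
		if msg := r.Match(n); msg != "" {
			line := a.fset.Position(n.Pos()).Line
			a.findings = append(a.findings, fmt.Sprintf("line %d: %s", line, msg))
		}
	}
	return a
}

// noPanic is a toy rule that flags direct calls to panic.
type noPanic struct{}

func (noPanic) Match(n ast.Node) string {
	call, ok := n.(*ast.CallExpr)
	if !ok {
		return ""
	}
	if id, ok := call.Fun.(*ast.Ident); ok && id.Name == "panic" {
		return "call to panic"
	}
	return ""
}

// run parses source, walks it with the analyzer, and returns the findings.
func run(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	a := &analyzer{fset: fset, rules: make(map[reflect.Type][]rule)}
	a.register(noPanic{}, (*ast.CallExpr)(nil))
	ast.Walk(a, f)
	return a.findings, nil
}

func main() {
	src := "package demo\n\nfunc must(err error) {\n\tif err != nil {\n\t\tpanic(err)\n\t}\n}\n"
	findings, err := run(src)
	fmt.Println(findings, err)
}
```

Registering rules by node type means a rule is only consulted for the nodes it cares about, which keeps the Visit method short.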

The last step, step 7, is to print the report of the issues found by the rules that were executed.

Rules

The tool defines rules as Go code that inspects an ast.Node to check whether it fulfills certain conditions. The function that generates the rules is shown here (inside rulelist.go):
func Generate(trackSuppressions bool, filters ...RuleFilter) RuleList {
  rules := []RuleDefinition{
     {"G101", "Look for hardcoded credentials", NewHardcodedCredentials},
     ...
     {"G601", "Implicit memory aliasing in RangeStmt", NewImplicitAliasing},
  }
  ...
  return RuleList{ruleMap, ruleSuppressedMap}
}
Each rule is defined by a specific code, a description, and a function name. Looking at G101, you can see that the function name is NewHardcodedCredentials, which is defined as follows:
package rules
import (
 ...
)
 ...
func (r *credentials) Match(n ast.Node, ctx *gosec.Context) (*gosec.Issue, error) {
  switch node := n.(type) {
  case *ast.AssignStmt:
     return r.matchAssign(node, ctx)
  ...
  }
  ...
}
func NewHardcodedCredentials(id string, conf gosec.Config) (gosec.Rule, []ast.Node) {
   ...
  return &credentials{
     pattern:          regexp.MustCompile(pattern),
     entropyThreshold: entropyThreshold,
     ...
     MetaData: gosec.MetaData{
        ID:         id,
        What:       "Potential hardcoded credentials",
        Confidence: gosec.Low,
        Severity:   gosec.High,
     },
  }, []ast.Node{(*ast.AssignStmt)(nil), (*ast.ValueSpec)(nil), (*ast.BinaryExpr)(nil)}
}

The NewHardcodedCredentials function initializes all the different parameters that it needs to process the node. The rule has a Match(..) function that is called by gosec when it processes the AST data structure for each file that it processes.
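
As a much simplified sketch of what such a rule does, the following code flags assignments of a non-empty string literal to a variable whose name looks like a credential. It is not the gosec rule: it only handles ast.AssignStmt, skips the entropy check, and uses an assumed name pattern. The namePattern variable, the hardcoded helper, and the sample source are all hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"regexp"
)

// namePattern is an assumed pattern for suspicious variable names.
var namePattern = regexp.MustCompile(`(?i)passw(or)?d|secret|token`)

const src = `package demo

func connect() {
	password := "hunter2"
	host := "localhost"
	_, _ = password, host
}
`

// hardcoded reports assignments where a suspiciously named variable
// receives a non-empty string literal.
func hardcoded(source string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "demo.go", source, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	ast.Inspect(f, func(n ast.Node) bool {
		assign, ok := n.(*ast.AssignStmt)
		if !ok {
			return true
		}
		for i, lhs := range assign.Lhs {
			id, ok := lhs.(*ast.Ident)
			if !ok || i >= len(assign.Rhs) || !namePattern.MatchString(id.Name) {
				continue
			}
			lit, ok := assign.Rhs[i].(*ast.BasicLit)
			// len > 2 skips the empty string literal, which is written as "".
			if ok && lit.Kind == token.STRING && len(lit.Value) > 2 {
				line := fset.Position(assign.Pos()).Line
				out = append(out, fmt.Sprintf("line %d: %s", line, id.Name))
			}
		}
		return true
	})
	return out, nil
}

func main() {
	findings, err := hardcoded(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range findings {
		fmt.Println(f)
	}
}
```

The host variable is not reported because its name does not match the pattern, which shows how much a rule of this kind depends on naming conventions.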

Summary

In this chapter, you looked at what an abstract syntax tree is and what it looks like. Go provides packages that make it easy for applications to work with the AST data structure. This opens up the possibility of writing tools such as static code analyzers, like the open source project gosec.

The sample code provided for this chapter shows how to use AST for simple things like calculating the number of global variables and printing out the package name from the import declaration. You also looked in depth at the gosec tool to understand how it uses AST to provide secure code analysis by going through the different parts of the source code.
