Visualization

Visualization is useful when we are dealing with image data. Earlier, we converted our image pixels from a byte to a float64 using pixelWeight. It is instructive to also have the reverse function:

func reversePixelWeight(px float64) byte {
  return byte(((px - 0.001) / 0.999) * pixelRange)
}
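
As a quick sanity check, the sketch below round-trips every byte value through both conversions. The body of pixelWeight here is an assumption, inferred as the inverse of reversePixelWeight (the earlier definition may differ, for example by clamping values that reach 1.0), and pixelRange is assumed to be 255. Because converting a float64 to a byte truncates, floating-point error can leave the recovered value one below the original; the hypothetical reversePixelWeightRounded variant rounds first to avoid that.

```go
package main

import (
	"fmt"
	"math"
)

// pixelRange is assumed to be 255, the maximum value of a byte.
const pixelRange = 255

// pixelWeight is an ASSUMED forward mapping from [0, 255] to roughly
// [0.001, 1.0], written as the inverse of reversePixelWeight.
func pixelWeight(px byte) float64 {
	return float64(px)/pixelRange*0.999 + 0.001
}

// reversePixelWeight is the function from the text; the conversion truncates.
func reversePixelWeight(px float64) byte {
	return byte(((px - 0.001) / 0.999) * pixelRange)
}

// reversePixelWeightRounded is a hypothetical variant that rounds to the
// nearest integer before converting.
func reversePixelWeightRounded(px float64) byte {
	return byte(math.Round(((px - 0.001) / 0.999) * pixelRange))
}

// countMismatches reports how many of the 256 byte values do not survive
// a round trip through pixelWeight and the given reverse function.
func countMismatches(reverse func(float64) byte) int {
	n := 0
	for v := 0; v <= 255; v++ {
		if reverse(pixelWeight(byte(v))) != byte(v) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println("truncating mismatches:", countMismatches(reversePixelWeight))
	fmt.Println("rounding mismatches:", countMismatches(reversePixelWeightRounded))
}
```

For visualization an off-by-one gray level is invisible, so the truncating version is adequate here; rounding only matters if you need the exact original bytes back.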

Here's how to visualize 100 of the images:

// visualize draws the first rows*cols images of a data tensor that is made up of float64s.
// The images are arranged into a grid of rows by cols cells, each cell holding one 28x28 image.
// The resulting canvas is written to filename as a PNG.
func visualize(data tensor.Tensor, rows, cols int, filename string) (err error) {
  N := rows * cols

  sliced := data
  if N > 1 {
    sliced, err = data.Slice(makeRS(0, N), nil) // data[0:N, :] in python
    if err != nil {
      return err
    }
  }

  if err = sliced.Reshape(rows, cols, 28, 28); err != nil {
    return err
  }

  imCols := 28 * cols
  imRows := 28 * rows
  rect := image.Rect(0, 0, imCols, imRows)
  canvas := image.NewGray(rect)

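  for i := 0; i < rows; i++ { // i indexes the grid row (first axis of sliced)
    for j := 0; j < cols; j++ { // j indexes the grid column (second axis of sliced)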
      var patch tensor.Tensor
      if patch, err = sliced.Slice(makeRS(i, i+1), makeRS(j, j+1)); err != nil {
        return err
      }

      patchData := patch.Data().([]float64)
      for k, px := range patchData {
        x := j*28 + k%28
        y := i*28 + k/28
        c := color.Gray{reversePixelWeight(px)}
        canvas.Set(x, y, c)
      }
    }
  }

  var f io.WriteCloser
  if f, err = os.Create(filename); err != nil {
    return err
  }

  if err = png.Encode(f, canvas); err != nil {
    f.Close()
    return err
  }

  if err = f.Close(); err != nil {
    return err
  }
  return nil
}

The dataset is a huge slice of images. We first need to work out how many we want; hence, N := rows * cols. With that number, we slice using data.Slice(makeRS(0, N), nil), which slices the tensor along the first axis. The sliced tensor is then reshaped into a four-dimensional array with sliced.Reshape(rows, cols, 28, 28). You can think of it as a grid of rows and columns in which each cell holds a 28x28 image.
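
To make the reshape concrete, here is a minimal sketch of the index arithmetic in plain Go, assuming the row-major layout that the reshape implies: image n of the flat slice lands in grid cell (n / cols, n % cols), and pixel k of a 28x28 image sits at column k % 28, row k / 28 within its patch. The helper names gridCell and canvasXY are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// side is the width and height of one image in pixels.
const side = 28

// gridCell returns the (row, col) grid position of the n-th image when the
// images are laid out in row-major order with the given number of columns.
func gridCell(n, cols int) (row, col int) {
	return n / cols, n % cols
}

// canvasXY returns the canvas coordinate of the k-th pixel of the image
// that sits at grid position (row, col).
func canvasXY(row, col, k int) (x, y int) {
	return col*side + k%side, row*side + k/side
}

func main() {
	// The image at index 23 in a 10-column grid.
	row, col := gridCell(23, 10)
	// Pixel 29 is at row 1, column 1 of its patch.
	x, y := canvasXY(row, col, 29)
	fmt.Println(row, col, x, y)
}
```

This is the same arithmetic that visualize uses for x and y: the grid column scales the horizontal offset and the grid row scales the vertical one.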

A primer on slicing

A *tensor.Dense acts very much like a standard Go slice; just as you can slice a[0:2], you can do the same with Gorgonia's tensors. The .Slice() method for all tensors accepts a tensor.Slice descriptor, defined as:

type Slice interface {
  Start() int
  End() int
  Step() int
}

As such, we would have to make our own data type that fulfills the Slice interface. It's defined in the utils.go file of this project. makeRS(0, N) simply reads as if we were doing data[0:N]. Details and reasoning for this API can be found on the Gorgonia tensor Godoc page.
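
A minimal sketch of such a type follows. It is an assumption about what the helper could look like, not a copy of the project's utils.go. The Slice interface is redeclared locally, mirroring the definition above, so that the example runs without importing the tensor package, and applyToInts is a hypothetical helper that shows the equivalent operation on a plain Go slice.

```go
package main

import "fmt"

// Slice mirrors the interface shown above (redeclared for a standalone example).
type Slice interface {
	Start() int
	End() int
	Step() int
}

// rs is a range slice: it describes the half-open range [start, end).
type rs struct {
	start, end, step int
}

func (s rs) Start() int { return s.start }
func (s rs) End() int   { return s.end }
func (s rs) Step() int  { return s.step }

// makeRS builds a range with a step of 1, so makeRS(0, n) reads like a[0:n].
func makeRS(start, end int) rs {
	return rs{start: start, end: end, step: 1}
}

// applyToInts (hypothetical) applies a Slice descriptor to an ordinary []int.
func applyToInts(xs []int, s Slice) []int {
	var out []int
	for i := s.Start(); i < s.End(); i += s.Step() {
		out = append(out, xs[i])
	}
	return out
}

func main() {
	xs := []int{10, 20, 30, 40, 50}
	// Selects the same elements as xs[1:3].
	fmt.Println(applyToInts(xs, makeRS(1, 3)))
}
```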

Then a grayscale image is created using the built-in image package: canvas := image.NewGray(rect). An image.Gray is essentially a slice of bytes in which each byte is a pixel. What we need to do next is fill in the pixels. We simply loop through the rows and columns of patches, and for each patch we fill in the canvas with the values extracted from the tensor. The reversePixelWeight function converts each float into a byte, which is then wrapped in a color.Gray. The pixel in the canvas is then set using canvas.Set(x, y, c).
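
The byte-per-pixel layout can be seen directly with the standard library alone. The sketch below uses a tiny canvas with arbitrary dimensions; the helper names grayBytes and setAndRead are hypothetical.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
)

// grayBytes returns the length of the backing byte slice of a w x h gray image.
func grayBytes(w, h int) int {
	return len(image.NewGray(image.Rect(0, 0, w, h)).Pix)
}

// setAndRead sets one pixel on a 4x2 canvas via Set and then reads the raw
// byte back out of the backing slice.
func setAndRead(x, y int, v byte) byte {
	canvas := image.NewGray(image.Rect(0, 0, 4, 2))
	canvas.Set(x, y, color.Gray{Y: v})
	return canvas.Pix[canvas.PixOffset(x, y)]
}

func main() {
	fmt.Println(grayBytes(4, 2))        // one byte per pixel
	fmt.Println(setAndRead(1, 1, 200)) // the stored byte is the gray level
}
```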

Following that, the canvas is encoded as a PNG. Et voilà, our visualization is done!
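
To check that the encoding step preserves what was drawn, a canvas can be encoded into memory and decoded again. This is a small sketch using only the standard library; it writes to a bytes.Buffer instead of a file, and the function name roundTripPNG and the 56x28 canvas size are chosen purely for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png"
)

// roundTripPNG encodes a small gray canvas to PNG in memory, decodes it
// again, and returns the decoded size and the gray value found at (x, y).
func roundTripPNG(x, y int, v byte) (w, h int, got byte, err error) {
	canvas := image.NewGray(image.Rect(0, 0, 56, 28))
	canvas.SetGray(x, y, color.Gray{Y: v})

	var buf bytes.Buffer
	if err = png.Encode(&buf, canvas); err != nil {
		return 0, 0, 0, err
	}

	var decoded image.Image
	if decoded, err = png.Decode(&buf); err != nil {
		return 0, 0, 0, err
	}

	b := decoded.Bounds()
	g := color.GrayModel.Convert(decoded.At(x, y)).(color.Gray)
	return b.Dx(), b.Dy(), g.Y, nil
}

func main() {
	w, h, got, err := roundTripPNG(30, 5, 123)
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(w, h, got)
}
```

PNG is lossless, so the gray level read back matches the one that was set.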

Now call visualize in the main function as follows, using a 10 by 10 grid to show 100 images:

func main() {
  imgs, err := readImageFile(os.Open("train-images-idx3-ubyte"))
  if err != nil {
    log.Fatal(err)
  }
  log.Printf("len imgs %d", len(imgs))

  data := prepareX(imgs)
  // 10 rows by 10 columns gives the 100 images we want.
  if err = visualize(data, 10, 10, "image.png"); err != nil {
    log.Fatal(err)
  }
}

This yields the following image:
