Creating a time series MultiPlot with Bokeh-Scala

In this second recipe on plotting using Bokeh, we'll see how to plot a time series graph with a dataset borrowed from https://archive.ics.uci.edu/ml/datasets/Dow+Jones+Index. We will also see how to plot multiple charts in a single document.

How to do it...

We'll be using only two fields from the dataset: the closing price of the stock at the end of the week, and the date of the last business day of the week. The dataset is comma-separated.

Preparing our data

In contrast to the previous recipe, where we used a Breeze matrix to construct the Bokeh ColumnDataSource, this time we'll construct the source from a Spark DataFrame. The getSource method accepts a ticker (for example, MSFT for Microsoft or CAT for Caterpillar) and a SQLContext. It runs a Spark SQL query against the stocks table, collects the date and close columns, and constructs a ColumnDataSource from them:

import io.continuum.bokeh.ColumnDataSource
import org.apache.spark.sql.SQLContext
import org.joda.time.format.DateTimeFormat

object StockSource {

  // Dates in the dataset are formatted as month/day/year
  val formatter = DateTimeFormat.forPattern("MM/dd/yyyy")

  def getSource(ticker: String, sqlContext: SQLContext) = {
    val stockDf = sqlContext.sql(s"select stock, date, close from stocks where stock = '$ticker'")
    stockDf.cache()

    // Convert each date string into epoch milliseconds, as a Double
    val dateData: Array[Double] = stockDf.select("date").collect.map(eachRow => formatter.parseDateTime(eachRow.getString(0)).getMillis().toDouble)
    // drop(1) discards the first character of the price string (presumably a currency symbol) before parsing
    val closeData: Array[Double] = stockDf.select("close").collect.map(eachRow => eachRow.getString(0).drop(1).toDouble)

    object source extends ColumnDataSource {
      val date = column(dateData)
      val close = column(closeData)
    }
    source
  }
}

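Before calling getSource, we construct the SQLContext and register the dataset as a table named stocks, like this:

import com.databricks.spark.csv._
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("csvDataFrame").setMaster("local[2]")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
// csvFile comes from the spark-csv implicits imported above
val stocks = sqlContext.csvFile(filePath = "dow_jones_index.data", useHeader = true, delimiter = ',')
stocks.registerTempTable("stocks")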

The only tricky part here is converting the date value into milliseconds since the epoch. We do this because each point on the plot must be a Double. We use the Joda-Time API for the conversion.
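To make the two conversions concrete, here is a minimal sketch that uses only the JDK's java.time API instead of Joda-Time, so it runs without Spark or Bokeh on the classpath. The object name FieldConversions, the fixed UTC zone, and the sample values in the usage note are assumptions made for illustration; the recipe itself parses with Joda-Time in the JVM's default time zone.

```scala
import java.time.{LocalDate, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper, not part of the recipe's code.
object FieldConversions {
  // "M/d/yyyy" accepts one- or two-digit months and days.
  private val dateFormat = DateTimeFormatter.ofPattern("M/d/yyyy")

  // Date string -> milliseconds since the epoch (midnight UTC), as a Double.
  def dateToMillis(raw: String): Double =
    LocalDate.parse(raw, dateFormat)
      .atStartOfDay(ZoneOffset.UTC)
      .toInstant
      .toEpochMilli
      .toDouble

  // Price string -> Double; stripPrefix only removes the "$" if it is present.
  def priceToDouble(raw: String): Double =
    raw.stripPrefix("$").toDouble
}
```

For a made-up date such as "1/7/2011", FieldConversions.dateToMillis returns 1294358400000.0, and FieldConversions.priceToDouble("$15.82") returns 15.82. Using stripPrefix rather than drop(1) is a design choice here: it leaves a value untouched if it has no leading symbol.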

Creating a plot

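Let's go ahead and create the Plot object. The xdr and ydr ranges that it references are the data ranges defined in the section on setting the x and y axes' data range, later in this recipe: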

//Create Plot
val plot = new Plot().title(ticker).x_range(xdr).y_range(ydr).width(800).height(400)

We use the ticker name of the stock as the plot's title. We then create a Document object so that we can save the final HTML as ClosingPrices.html. Here, microsoftPlot is the Plot built for the MSFT ticker:

val msDocument = new Document(microsoftPlot)
val msHtml = msDocument.save("ClosingPrices.html")

Creating a line that joins all the data points

As we saw earlier with the Diamond marker, we have to pass the x and y positions of the data points, which here are the date and close columns of the source. We also need to wrap the Line glyph in a renderer so that we can add it to the Plot:

val line = new Line().x(date).y(close).line_color(color).line_width(2)
val lineGlyph = new GlyphRenderer().data_source(source).glyph(line)

Setting the x and y axes' data range for the plot

Before rendering, the plot needs to know the range of the data on the x and y axes. Let's create two DataRange1d objects, one for each column, and set them on the plot:

val xdr = new DataRange1d().sources(List(date))
val ydr = new DataRange1d().sources(List(close))

plot.x_range(xdr).y_range(ydr)

Drawing the axes and the grids

Drawing the axes and the grids works the same way as before. We add labels to the axes, format the tick labels of the x axis as a month and year, and then add the axes to the Plot:

val xformatter = new DatetimeTickFormatter().formats(Map(DatetimeUnits.Months -> List("%b %Y")))
val xaxis = new DatetimeAxis().plot(plot).formatter(xformatter).axis_label("Month")
val yaxis = new LinearAxis().plot(plot).axis_label("Price")
plot.below <<= (xaxis :: _)
plot.left <<= (yaxis :: _)
val xgrid = new Grid().plot(plot).dimension(0).axis(xaxis)
val ygrid = new Grid().plot(plot).dimension(1).axis(yaxis)
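The "%b %Y" pattern passed to DatetimeTickFormatter is a strftime-style format: abbreviated month name followed by a four-digit year. The following sketch previews roughly what such a label looks like for a given millisecond value. It uses the JDK's java.time API with the equivalent "MMM yyyy" pattern; the object name TickLabelPreview, the English locale, and the UTC zone are assumptions for illustration, and the real labels are rendered by Bokeh in the browser.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter
import java.util.Locale

// Hypothetical helper, not part of the recipe's code.
object TickLabelPreview {
  // "MMM yyyy" is the java.time counterpart of the strftime-style "%b %Y".
  private val monthYear =
    DateTimeFormatter.ofPattern("MMM yyyy", Locale.ENGLISH).withZone(ZoneOffset.UTC)

  // Formats an epoch-millisecond value the way a month tick would be labelled.
  def label(millis: Double): String =
    monthYear.format(Instant.ofEpochMilli(millis.toLong))
}
```

For example, TickLabelPreview.label(1294358400000.0) gives "Jan 2011", since that value corresponds to January 7, 2011 at midnight UTC.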

Adding tools

As before, let's create some tools and attach them to the plot:

//Tools
val panTool = new PanTool().plot(plot)
val wheelZoomTool = new WheelZoomTool().plot(plot)
val previewSaveTool = new PreviewSaveTool().plot(plot)
val resetTool = new ResetTool().plot(plot)
val resizeTool = new ResizeTool().plot(plot)
val crosshairTool = new CrosshairTool().plot(plot)


plot.tools := List(panTool, wheelZoomTool, previewSaveTool, resetTool, resizeTool, crosshairTool)

Adding a legend to the plot

Since we already have the Glyph renderer for the line, all we need to do is add it to the legend. The properties of the line automatically propagate to the legend:

//Legend
val legends = List(ticker -> List(lineGlyph))
val legend = new Legend().plot(plot).legends(legends)

Next, let's add all the renderers that we created before to the plot:

plot.renderers <<= (xaxis :: yaxis :: xgrid :: ygrid :: lineGlyph :: legend :: _)

As the final step, let's try plotting multiple plots in the same document.

Multiple plots in the document

Creating multiple plots in the same document is child's play. All that we need to do is create all our plots and then arrange them in a grid, where each inner list is one row. Finally, instead of passing an individual plot object into the document, we pass in the GridPlot:

val children = List(List(microsoftPlot, bofaPlot), List(caterPillarPlot, mmmPlot))
val grid = new GridPlot().children(children)

val document = new Document(grid)
val html = document.save("DJClosingPrices.html")
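The children value is simply a list of rows, each row being a list of plots. With more than a handful of plots, writing the nested list by hand becomes tedious, so one option is to build it from a flat list. The sketch below shows the idea with plain Scala collections; the GridLayout object and toRows method are hypothetical names, and the usage note uses ticker strings as stand-ins for Plot objects so that the example runs without Bokeh.

```scala
// Hypothetical helper, not part of the recipe's code.
object GridLayout {
  // Splits a flat list into rows of `columns` items; the last row may be shorter.
  def toRows[A](items: List[A], columns: Int): List[List[A]] =
    items.grouped(columns).toList
}
```

Calling GridLayout.toRows(List("MSFT", "BAC", "CAT", "MMM"), 2) yields List(List("MSFT", "BAC"), List("CAT", "MMM")), which has the same two-by-two shape as the children list above.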

In this chapter, we explored two methods of visualization and built some basic graphs and charts using Scala. As I mentioned earlier, the visualization libraries in Scala are still under active development and do not yet match the advanced visualizations that can be generated using R or, for that matter, Tableau.
