Building a song analyzer

However, before diving into the recommender itself, the reader may have noticed an important property we were able to extract from the signal data. Since we generated audio signatures at regular time intervals, we can compare signatures and find potential duplicates. For example, given a random song, we should be able to guess its title based on previously indexed signatures. In fact, this is the exact approach taken by many companies providing music recognition services. To take it one step further, we could potentially provide insight into a band's musical influences, or perhaps even identify song plagiarism, settling once and for all the Stairway to Heaven dispute between Led Zeppelin and the American rock band Spirit (http://consequenceofsound.net/2014/05/did-led-zeppelin-steal-stairway-to-heaven-legendary-rock-band-facing-lawsuit-from-former-tourmates/).

With this in mind, we will take a detour from our recommendation use case and investigate song identification a little further. We will build an analyzer system capable of anonymously receiving a song, analyzing its stream, and returning the title of the song (in our case, the original filename).

Selling data science is all about selling cupcakes

Sadly, an all too often neglected aspect of the data science journey is data visualization; in other words, how to present your results back to end users. While many data scientists are content to present their findings in an Excel spreadsheet, today's end users are keen for richer, more immersive experiences; often they want to play around and interact with the data. Indeed, providing an end user with a full, end-to-end experience (even a simple one) can be a great way to spark interest in your science, turning a simple proof of concept into a prototype people can easily understand. Due to the prevalence of Web 2.0 technologies, user expectations are high, but thankfully there are a variety of free, open source products that can help; for example, Mike Bostock's D3.js is a popular framework providing a toolkit for creating just such user interfaces.

Selling data science without rich data visualization is like trying to sell a cake without icing: few people will trust the finished product. Therefore, we will build a user interface for our analyzer system. But first, let's get the audio data out of Spark (our hashes are currently stored in memory inside an RDD) and into a web-scale datastore.

Using Cassandra

We need a fast, efficient, and distributed key-value store to keep all our hash values. Although many databases are fit for this purpose, we'll choose Cassandra in order to demonstrate its integration with Spark. First, add the Cassandra input and output formats to the project with the following Maven dependency:

<dependency>
  <groupId>com.datastax.spark</groupId>
  <artifactId>spark-cassandra-connector_2.11</artifactId>            
  <version>2.0.0</version>
</dependency> 

As you would expect, persisting (and retrieving) RDDs from Spark to Cassandra is relatively trivial:

import com.datastax.spark.connector._

val keyspace = "gzet"
val table = "hashes"

// Persist RDD
hashRDD.saveAsCassandraTable(keyspace, table)

// Retrieve RDD
val retrievedRDD = sc.cassandraTable[HashSongsPair](
  keyspace,
  table
)

This will create a new table hashes in keyspace gzet, inferring the schema from the HashSongsPair case class. The following is the equivalent CQL statement that gets executed (provided here for information only):

CREATE TABLE gzet.hashes (
  id text PRIMARY KEY,
  songs list<bigint>
)
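
Since the connector infers the schema from the case class's field names and types, HashSongsPair must line up with the table above. The class itself is not shown here; the following is a minimal sketch consistent with the inferred schema (the actual class in the repository may carry additional fields):

```scala
// Hypothetical sketch of the case class backing the hashes table:
// `id` maps to the text PRIMARY KEY, `songs` to the list<bigint> column.
case class HashSongsPair(
  id: String,        // the audio hash for one time chunk
  songs: List[Long]  // IDs of all songs that produced this hash
)
```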

Using the Play framework

As our Web UI will front the complex processing required to transform a song into frequency hashes, we want it to be an interactive web application rather than a simple set of static HTML pages. Furthermore, the hashing must be done in exactly the same way, using the same functions, as in our Spark job (that is, the same song must generate the same hashes). The Play framework (https://www.playframework.com/) will allow us to do this, and Twitter's Bootstrap (http://getbootstrap.com/) will be used to put the icing on the cake, for a more professional look and feel.

Although this book is not about building user interfaces, we will introduce some concepts related to the Play framework, as, used well, it can be a source of great value for data scientists. As always, the full code is available in our GitHub repository.

First, we create a data access layer, responsible for handling connections and queries to Cassandra. For any given hash, we return the list of matching song IDs. Similarly, for any given ID, we return the song name:

import scala.collection.JavaConverters._

val cluster = Cluster
  .builder()
  .addContactPoint(cassandraHost)
  .withPort(cassandraPort)
  .build()

val session = cluster.connect()

def findSongsByHash(hash: String): List[Long] = {
  val stmt = s"SELECT songs FROM hashes WHERE id = '$hash';"
  val results = session.execute(stmt)
  // Convert the Java ResultSet and its list<bigint> columns to Scala types
  results.all().asScala.toList.flatMap { row =>
    row.getList("songs", classOf[java.lang.Long]).asScala.map(_.toLong)
  }
}

Next, we create a simple view made of three objects: a text field, a file upload, and a submit button. These few lines are enough to provide our user interface:

<div>
   <input type="text" class="form-control">
   <span class="input-group-btn">
     <button class="btn-primary">Upload</button>
     <button class="btn-success">Analyze</button>
   </span>
</div>

Then we create a controller that will handle both GET and POST HTTP requests through the index and submit methods, respectively. The latter will process the uploaded file by converting a FileInputStream into an Audio case class, splitting it into 20 millisecond chunks, extracting the FFT signatures (hashes), and querying Cassandra for matching IDs:

def index = Action { implicit request =>
  Ok(views.html.analyze("Select a wav file to analyze"))
}

def submit = Action(parse.multipartFormData) { request =>
  request.body.file("song").map { upload =>
    val file = new File(s"/tmp/${UUID.randomUUID()}")
    upload.ref.moveTo(file)
    val song = process(file)
    if (song.isEmpty) {
      Redirect(routes.Analyze.index())
        .flashing("warning" -> "No match")
    } else {
      Redirect(routes.Analyze.index())
        .flashing("success" -> song.get)
    }
  }.getOrElse {
    Redirect(routes.Analyze.index())
      .flashing("error" -> "Missing file")
  }
}

def process(file: File): Option[String] = {
  val is = new FileInputStream(file)
  val audio = Audio.processSong(is)
  val potentialMatches = audio.sampleByTime().map { a =>
    queryCassandra(a.hash)
  }
  bestMatch(potentialMatches)
}
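
The bestMatch function is not shown above; a plausible implementation is a simple majority vote: the song ID matched by the largest number of chunks wins, provided at least one chunk matched at all. The following is a hedged sketch of that idea (the actual code in the repository returns the song name rather than its ID, so a lookup step from ID to name is elided here):

```scala
// Majority vote over per-chunk candidate lists: flatten all the matching
// song IDs, count occurrences, and keep the most frequent one (if any).
def bestMatch(potentialMatches: Seq[List[Long]]): Option[Long] = {
  val hits = potentialMatches.flatten
  if (hits.isEmpty) None
  else Some(hits.groupBy(identity).maxBy(_._2.size)._1)
}
```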

Finally, we return the matching result (if any) through a flashing message, and we chain the view and controller together by defining new routes for our Analyze service:

GET      /analyze      controllers.Analyze.index
POST     /analyze      controllers.Analyze.submit
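
For the flashing messages set by the controller to actually appear, the view has to render them. Assuming our analyze.scala.html template declares an implicit Flash parameter (standard Play practice), a minimal fragment using Bootstrap alert classes might look like this:

```html
@flash.get("success").map { message =>
  <div class="alert alert-success">@message</div>
}
@flash.get("warning").map { message =>
  <div class="alert alert-warning">@message</div>
}
@flash.get("error").map { message =>
  <div class="alert alert-danger">@message</div>
}
```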

The resulting UI is shown in Figure 5, and works perfectly with our own music library:

Figure 5: Sound analyser UI

The following Figure 6 shows the end-to-end process:

Figure 6: Sound analyser process

As mentioned, the Play framework shares some code with our offline Spark job. This is made possible because we are programming in a functional style and have applied a good separation of concerns. While the Play framework does not work natively with Spark (in terms of RDDs and SparkContext objects), because our functions are not dependent on Spark, we can reuse any of the functions we created earlier (such as the ones in the Audio class). This is one of the many advantages of functional programming: functions, by definition, are stateless and represent a key component in the adoption of a hexagonal architecture (http://wiki.c2.com/?HexagonalArchitecture). Isolated functions can always be called by different actors, whether inside of an RDD or within a Play controller.
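
To make this concrete, here is a small hypothetical illustration (the function name is ours, not from the repository): a stateless function defined once behaves identically whether applied to a local collection from a Play controller or mapped over a Spark RDD.

```scala
// A stateless, side-effect-free function: its output depends only on its
// input, so it can be invoked safely from any execution context.
def toHash(frequencies: Seq[Int]): String = frequencies.mkString("-")

// Called on a plain collection, as a Play controller would:
val local = Seq(Seq(21, 48), Seq(76, 11)).map(toHash)

// The very same call works inside a Spark transformation:
// sc.parallelize(chunks).map(toHash)
```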
