Chapter 1. Working with Matrices

In this chapter, we will explore an elementary yet elegant mathematical data structure: the matrix. Most computer science and mathematics graduates will already be familiar with matrices and their applications. In the context of machine learning, matrices are used to implement several kinds of techniques, such as linear regression and classification. We will study these techniques in later chapters.

Although this chapter may seem mostly theoretical at first, we will soon see that matrices are a very useful abstraction for quickly organizing and indexing data with multiple dimensions. The data used by machine-learning techniques contains a large number of sample values in several dimensions. Thus, matrices can be used to store and manipulate this sample data.
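To make the idea of organizing and indexing sample data concrete, here is a minimal sketch that uses only nested Clojure vectors. The sample values and the helper names (row-count, column-count, element, and column) are assumptions made for illustration and do not come from any library:

```clojure
;; Hypothetical sample data: each row is one sample and
;; each column is one dimension (feature) of that sample.
(def samples
  [[5.1 3.5 1.4]
   [4.9 3.0 1.4]
   [6.2 3.4 5.4]])

(defn row-count
  "Number of samples (rows) in the nested vector m."
  [m]
  (count m))

(defn column-count
  "Number of dimensions (columns), taken from the first row of m."
  [m]
  (count (first m)))

(defn element
  "Value at zero-based row i and column j of m."
  [m i j]
  (get-in m [i j]))

(defn column
  "All values of the zero-based column j, one per sample."
  [m j]
  (mapv #(nth % j) m))
```

For instance, (element samples 0 2) looks up the third dimension of the first sample, and (column samples 1) collects the second dimension across all the samples. A dedicated matrix library offers the same kind of indexing along with many more operations.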

An interesting application of matrices is Google Search, which is built on the PageRank algorithm. Although a detailed explanation of this algorithm is beyond the scope of this book, it's worth knowing that Google Search essentially finds the principal eigenvector of an extremely large matrix built from the links between web pages (for more information, refer to The Anatomy of a Large-Scale Hypertextual Web Search Engine). We do not discuss this eigenvector computation any further, but we will encounter a variety of other matrix operations while implementing machine-learning algorithms. In this chapter, we will describe the most useful of these operations.

Introducing Leiningen

Over the course of this book, we will use Leiningen (http://leiningen.org/) to manage third-party libraries and dependencies. Leiningen, or lein, is the standard Clojure package management and automation tool, and has several powerful features used to manage Clojure projects.

For installation instructions, visit the project site at http://leiningen.org/. The first run of the lein program could take a while, as it downloads and installs the Leiningen binaries. We can create a new Leiningen project using the new subcommand of lein, as follows:

$ lein new default my-project

The preceding command creates a new directory, my-project, which will contain all source and configuration files for a Clojure project. This folder contains the source files in the src subdirectory and a single project.clj file. In this command, default is the type of project template to be used for the new project. All the examples in this book use the preceding default project template.
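As a rough guide to what the src subdirectory holds, the default template typically generates a single source file, src/my_project/core.clj, along the following lines. The exact contents depend on the Leiningen version, so treat this as an assumption rather than a guaranteed listing:

```clojure
;; src/my_project/core.clj, roughly as generated by the default template.
;; The hyphen in the namespace name maps to an underscore in the file path.
(ns my-project.core)

(defn foo
  "I don't do a whole lot."
  [x]
  ;; Prints the argument followed by a greeting.
  (println x "Hello, World!"))
```

The namespace name is derived from the project name, so a project created under a different name will have a correspondingly different namespace and file path.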

The project.clj file contains all the configuration associated with the project and will have the following structure:

(defproject my-project "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.5.1"]])

Third-party Clojure libraries can be included in a project by adding the declarations to the vector with the :dependencies key. For example, the core.matrix Clojure library package on Clojars (https://clojars.org/net.mikera/core.matrix) gives us the package declaration [net.mikera/core.matrix "0.20.0"]. We simply paste this declaration into the :dependencies vector to add the core.matrix library package as a dependency for our Clojure project, as shown in the following code:

  :dependencies [[org.clojure/clojure "1.5.1"]
                 [net.mikera/core.matrix "0.20.0"]])

To download all the dependencies declared in the project.clj file, simply run the following deps subcommand:

$ lein deps

Leiningen also provides a REPL (read-eval-print loop), which is an interactive interpreter whose classpath includes all the dependencies declared in the project.clj file. This REPL can also refer to all the Clojure namespaces that we have defined in our project. We can start a new REPL session using the repl subcommand of lein, as follows:

$ lein repl
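Once the REPL has started, a quick way to confirm that the declared dependencies are available is to require one of them and evaluate a few expressions. The following is a minimal sketch that assumes the core.matrix dependency shown earlier has been added to project.clj; the matrix a and its values are made up for illustration:

```clojure
;; Typed at the REPL prompt; assumes [net.mikera/core.matrix "0.20.0"]
;; is listed under :dependencies in project.clj.
(require '[clojure.core.matrix :as m])

;; a is a hypothetical 2x3 matrix built from nested vectors.
(def a (m/matrix [[1 2 3]
                  [4 5 6]]))

(m/shape a)                ; dimensions of a, rows first and then columns
(m/mget a 1 2)             ; element at zero-based row 1, column 2
(m/shape (m/transpose a))  ; transposing swaps the two dimensions
```

If the require form fails because the namespace cannot be found, the dependency is most likely missing from the :dependencies vector or has not been downloaded yet.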