Chapter 2. The Clojure Environment

"Hello World" in Clojure

To start programming in Clojure immediately, simply open a Clojure REPL, which stands for Read Evaluate Print Loop. The REPL is a simple yet powerful way to create programs interactively as well as interact with already running programs.

The simplest way to start the REPL is to start it directly from the system command line[3]. To do this, navigate to the system directory where you have installed Clojure, the one that contains the "clojure-1.0.0.jar" file. Then type the following to start Clojure:

java -jar clojure-1.0.0.jar

This starts up the Java virtual machine and loads the Clojure environment. As soon as the REPL comes up, you should see the following prompt:

user=>

This indicates that the REPL is ready to accept input. To write your first program, just type the following at the prompt:

user=> (println "Hello World")

Press the enter key, and the REPL should display the following:

Hello World
nil
user=>

What exactly is happening here? The acronym REPL itself gives a clue.

  • Read: Clojure reads what you typed, (println "Hello World"), and parses it as a Clojure form, making sure it is valid Clojure syntax.

  • Evaluate: Clojure compiles the provided form and evaluates it. In this case, it is a call to a built-in function, println, with one literal argument, "Hello World". Clojure executes the function, which prints "Hello World" to the standard system output.

  • Print: Clojure prints the value returned from the println function. In this case, it is nil (the same as Java's null, meaning the absence of any value, or "nothing"), because println is called for its side effect and has no meaningful value to return.

  • Loop: Clojure returns to the input prompt, ready for you to type in another form.
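The Evaluate and Print steps can be observed separately with a small sketch. It nests one form inside another, which is explained later in this chapter, and uses the built-in nil? function to test the value that println hands back:

```clojure
;; The inner form prints "Hello World" as a side effect and returns nil.
;; The outer form then checks that returned value.
(println (nil? (println "Hello World")))
;; prints: Hello World
;; prints: true
```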

This is different from how most other programming languages work. In most languages, writing, compiling, and running programs are very distinct steps. Clojure does allow you to separate these steps, should you want to, but most Clojure programmers prefer to use the REPL to do integrated development, writing and running their code at the same time. This can greatly speed development. It allows developers to see what their code does instantly, in the context of an already-running program, without the overhead of stopping the program, editing the code, recompiling, and starting it up again. This organic, bottom-up style of coding soon starts to feel extremely natural, and returning to a static development environment feels slow and cumbersome.

Compared to other "scripting" languages which also provide real-time evaluation, however, Clojure's on-the-fly capabilities are much more robust. When you evaluate a form in the REPL, it is not merely interpreted: it is compiled and added to the state of the running program on an equal footing with the pre-existing code. Nor is the REPL only a special debug feature: dynamic code is inherent to the language. It is entirely possible, and not uncommon, to connect to a remote production instance of Clojure, open a REPL, inspect the application state, diagnose a problem, and tweak the code to fix a bug while the program is running, for a zero-downtime fix.

In theory, it is possible to open a REPL from scratch, and write an entire, sophisticated program from the ground up as it runs without ever stopping or restarting it.

Clojure Forms

The fundamental unit of a Clojure program is not the line, the keyword, or the class, but the form. In Clojure, a form is any unit of code that can be evaluated to return a value. When you type something in the REPL, it must be a valid form, and Clojure source files contain a series of forms. There are four basic varieties of forms.

Literals

Literals are forms which resolve to themselves. Examples of literals are strings, numbers, and characters that you enter directly into the code. You can verify that literals resolve to themselves by trying it out in the REPL:

user=> "I'm a string!"
"I'm a string!"

When you type a simple, double quoted string to evaluate it, the value returned is simply the string itself. The same thing is true for numbers.

user=> 3
3
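As a further sketch, the literals below can each be typed at the prompt and will come back unchanged. The checks at the end use the built-in string? and number? functions and Java's Character class to confirm what kind of value each literal is:

```clojure
;; Each of these literals evaluates to itself.
"I'm a string!"   ; a string
3                 ; an integer
3.14              ; a floating-point number
\a                ; a character

;; Confirm the kind of value each literal produces.
(println (string? "I'm a string!"))   ; prints: true
(println (number? 3.14))              ; prints: true
(println (instance? Character \a))    ; prints: true
```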

Symbols

Symbols are forms which resolve to a value. They may be thought of as roughly similar to variables, although this is not technically accurate since they are not actually variable in the same way variables in most languages are. In Clojure, symbols are used to identify function arguments, and globally or locally defined values. Symbols and their resolution are discussed in more detail in the following sections.

Composite Forms

Composite forms use matched pairs of parentheses, brackets, or braces to make groups of other forms. When evaluated, their value depends on what type of form they are: brackets evaluate to a vector and braces to a map. Chapter 4 discusses these types in detail.

Of special interest here, however, are composite forms which use parentheses. These indicate a list, and lists in Clojure have a special meaning. Clojure is, after all, a dialect of Lisp, which derives its name from "LISt Processing."
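A minimal sketch of the bracket and brace forms follows. The names v and m are made up for illustration, and the nested (+ ...) forms are function calls of the kind described in the paragraphs after this one:

```clojure
;; Brackets build a vector. Inner forms are evaluated first.
(def v [1 2 (+ 1 2)])
(println v)               ; prints: [1 2 3]

;; Braces build a map of key/value pairs.
(def m {"one" 1, "two" (+ 1 1)})
(println (get m "two"))   ; prints: 2
```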

In Clojure (and all Lisps), lists are evaluated as function calls. When a list is evaluated, it is the same as calling a function, and the evaluated value of the form is the return value from that function. The first item in the list is the function to call, and the rest of the items are arguments to pass to the function. For example, the Clojure form (A B C), when evaluated, means "call A, with B and C as its arguments." In other programming languages, this might be written A(B, C).
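The "first item is the function, the rest are arguments" rule can be tried out with a few built-in functions. This is only a sketch; + and * are arithmetic functions and str concatenates strings:

```clojure
;; (A B C) means "call A with B and C as its arguments".
(println (+ 1 2))                    ; prints: 3
(println (str "Hello, " "World!"))   ; prints: Hello, World!

;; Arguments are themselves forms, so lists can nest.
;; The inner (+ 3 4) is evaluated first, then multiplied by 2.
(println (* 2 (+ 3 4)))              ; prints: 14
```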

This may seem very foreign to programmers without a Lisp background. However, within the context of Clojure's capabilities, the benefits are considerable. Entire programs are just lists, and lists of lists, and so on. Code is data, and data can be code. In Chapter 12, you will see how this can be leveraged to very easily create code that writes code.
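A small taste of "code is data", as a sketch only: the quote character (') and the eval function are not covered in this section, and the name form is made up for illustration. Quoting stops a list from being evaluated, so it can be inspected like any other data before being run:

```clojure
;; The quote prevents evaluation: this is a list of three items, not a call.
(def form '(+ 1 2))

(println (first form))   ; prints: +
(println (count form))   ; prints: 3

;; eval treats the same list as code and runs it.
(println (eval form))    ; prints: 3
```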

Special Forms

Special forms are a particular type of composite form. For most purposes, they are used very similarly to a function call. The difference is that the first form of a special form is not a function defined somewhere, but a special system form that's built into Clojure.
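One way to see the difference from an ordinary function call is the if special form, shown here as a sketch. A function receives all of its arguments already evaluated, whereas if evaluates only the branch it selects:

```clojure
;; if is a special form: (if test then else)
(println (if true "yes" "no"))   ; prints: yes

;; Only the selected branch is evaluated, so the second println never runs.
(if (< 1 2)
  (println "1 is less than 2")
  (println "this branch is never evaluated"))
;; prints: 1 is less than 2
```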

Special forms are the most basic building blocks of a Clojure program, and are used to control program flow, bind vars, and define functions, among other things. The important thing to remember is that, like function calls, the first form in the list identifies the special form being used and the other forms in the list are like arguments to the special form. In order to see examples of each of these types of forms, let's make the Hello World program a bit more complicated; you'll use two forms, instead of just one. At the REPL, type the following, and press enter:

user=> (def message "Hello, World!")

At the next prompt, type the following:

user=> (println message)

You should see essentially the same output as the first Hello World program:

Hello, World!
nil

This simple program, only two forms, contains each type of the forms previously discussed.

Analyzing the first form, (def message "Hello, World!"), you see first that it is enclosed in parentheses. Therefore, it is a list, and will be evaluated as a function application or a special form. There are three items in the list: def, message, and "Hello, World!". The first item in the list, def, is the function or special form that is called. In this case, it's a special form. But like a function, it takes two parameters: the name of the var to define, and the value to which to bind it. Evaluating this form creates a var that binds the symbol message to the value "Hello, World!".

The second form (println message) is also a list and this time it's a normal function application. It has two component forms—each of them is a symbol. The symbol println resolves to the println function, and the symbol message resolves to the string "Hello, World!", because of the var binding established in the previous form.

The net result, then, is the same as in the first Hello World program: the println function is called with an argument of "Hello, World!"

Writing and Running Source Files

As handy as the REPL is, in order to do any real development there is also the need to save source code and be able to run it multiple times without retyping it. Clojure, of course, provides this facility.

By convention, Clojure source code files have the extension *.clj. In a normal Clojure program, there is no need to explicitly compile your source files: they are automatically compiled as they are loaded, just like individual forms entered into the REPL. If you need to pre-compile your Clojure to standard Java *.class files (for example, to run on a nonstandard Java environment like a mobile phone), it is entirely possible, and is handled by Clojure's AOT (Ahead Of Time) compilation features. These are discussed in Chapter 10.

To run the example Hello World program from a *.clj file, create a new file called "hello-world.clj" in any plain-text editor, containing the code in Example 2-1.

Example 2-1. hello-world.clj

(def message1 "Hello, World!")
(def message2 "I'm running Clojure code from a file.")
(println message1)
(println message2)

There are two ways to run this file. The simplest, most often used for development, is to open up a REPL and type the following (substituting the actual path of your *.clj file, and using forward slashes in accordance with the Java convention):

user=> (load-file "c:/hello-world.clj")

You should see the following output:

Hello, World!
I'm running Clojure code from a file.
nil

The load-file function takes a single parameter: a string representation of a file-system path. It then loads the file found at the path, executes each form in the file sequentially, just as if you had typed it into the REPL, and returns the return value of the last form in the file. You can see nil, the return value of println, as the last line of the output. All the symbols defined in the file are still available. Try typing a symbol defined in the file at the REPL and it will resolve to the value which was bound to it:

user=> message1
"Hello, World!"

Another way to execute a Clojure file is directly from the system command line. This approach spawns a new Clojure runtime in a new instance of the Java virtual machine and then immediately loads the selected file. It is the normal method of running a Clojure program outside of development (unless you've packaged the Clojure into *.class files or a Jar package). To run a Clojure file this way, just enter the following at the command line:

java -jar c:/clojure-1.0.0.jar c:/hello-world.clj

Those familiar with Java will recognize this as a standard Java invocation. The -jar c:/clojure-1.0.0.jar parameter ensures that the Clojure runtime library is in the current classpath. Modify the path to reflect the actual location of the Clojure jar file that came with your Clojure installation. The last parameter is the path to the script you want to run.

This command starts the Clojure runtime, loads the hello-world.clj file, and sequentially evaluates each of its forms. In this case, the only results you see in the system console are those printed to the standard system output:

Hello, World!
I'm running Clojure code from a file.
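The load-file behavior described earlier in this section (forms evaluated in order, last value returned, definitions left available) can be checked with a self-contained sketch. It assumes Clojure 1.2 or later, where spit is part of clojure.core, and it writes to a temporary file rather than a real project file. The names sketch-file, loaded-message, and last-value are made up for illustration:

```clojure
;; Assumption: Clojure 1.2+ (spit is in clojure.core).
;; Create a throwaway source file in the system temp directory.
(def sketch-file (java.io.File/createTempFile "load-file-sketch" ".clj"))

;; Write two forms into it: a def, then an arithmetic expression.
(spit sketch-file "(def loaded-message \"Defined in a file\")\n(+ 1 2)\n")

;; load-file evaluates each form in order and returns the last value.
(def last-value (load-file (.getPath sketch-file)))

(println last-value)       ; prints: 3
(println loaded-message)   ; prints: Defined in a file
```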

Vars, Namespaces, and the Environment

As alluded to in the first chapter, a Clojure program is a living, organic entity that can evolve without needing to be shut down and rerun. This is due primarily to the existence of the REPL, and the capability it provides to evaluate forms in the context of an existing program. But how exactly does this work?

When you start a Clojure program, either by opening up a new REPL or running a source file directly, you are creating a new global environment. This environment lasts until the program is terminated, and contains all the information the program needs to run, including global Vars (names bound to values). See Figure 2-1 for a diagram of what the environment looks like. Whenever you use def to define a Var, or define a function (covered in Chapter 3), it is added to (or interned in) the global environment. After it is interned, it is available for reference from anywhere within the same environment. You can see this at work in the Hello World example, where you created a var binding the symbol message to a string value, and used it in a subsequent form.

Vars can be defined and bound to symbols using the def special form. It has the following syntax:

(def var-name var-value)

var-name is the name of the var to create, and var-value is its value. var-value can be any Clojure form, which will be evaluated and the resulting value bound to the var. Then, whenever the var-name symbol is used within the global Clojure environment, it will resolve to the var value.
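As a brief sketch of this syntax (the names seconds-per-day and greeting are made up for illustration), the value form is evaluated first and its result is what gets bound:

```clojure
;; var-value can be any form; it is evaluated before the var is bound.
(def seconds-per-day (* 24 60 60))
(println seconds-per-day)   ; prints: 86400

;; A later def with the same name rebinds the existing var.
(def greeting "Hello")
(def greeting "Hello again")
(println greeting)          ; prints: Hello again
```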

Warning

Be sure to define your dependencies in the proper order. Because of the way Clojure references Vars, a var must be defined before a symbol referring to it can be evaluated. Normally this isn't an issue, but it can result in some "gotchas" if you do a lot of work in the REPL, because you will often define things in the REPL in a different order from how you order them in a source file, and because once they are entered in the REPL they remain available for the life of the program. As you work, you may not notice until you stop and rerun the program that you've defined a dependency out of order. It's an easy problem to fix, and easy to avoid once you're aware of it, but it does give most beginning Clojure programmers several moments of confusion as they get errors trying to run a program that previously seemed to run just fine.
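The ordering problem can be reproduced in a few lines. This is a sketch only: it relies on quote, eval, and try/catch, none of which are covered in this chapter, and the names base-price and first-attempt are made up for illustration. The exact exception type and message vary between Clojure versions, so the sketch catches a general Exception and records a plain string:

```clojure
;; base-price has not been defined yet, so this form cannot be evaluated.
(def first-attempt
  (try
    (eval '(* 2 base-price))
    (catch Exception e "unresolved")))
(println first-attempt)              ; prints: unresolved

;; Once the var exists, the very same form evaluates normally.
(def base-price 50)
(println (eval '(* 2 base-price)))   ; prints: 100
```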


Figure 2-1. The Clojure environment

Symbols and Symbol Resolution

Symbols are ubiquitous in Clojure, and it is worth taking some time to understand what they really are and how they work. Broadly stated, a symbol is an identifier that resolves to a value. Symbols can be defined either on the local level (for example, function arguments or local bindings), or globally (using Vars). Just about anything you see in Clojure code that is not either a literal or a basic syntactic character (quotes, parentheses, braces, brackets, etc.) is probably a symbol. This covers what are often thought of as variables in other languages, but also a good deal more:

  • All function names in Clojure are symbols. When a function is called as part of a composite form, it first resolves the symbol to get the function and then applies it.

  • Most operators (comparison, mathematical, etc.) are symbols, which resolve to a special, built-in, optimized function. They are resolved and applied in the same way as functions, with additional performance optimizations.

  • Macro names are symbols. Without going into detail at this time, macros are like functions, only applied at compile-time rather than run-time. See Chapter 12 for an in-depth discussion of macros.

Symbol Names

Symbol names are case sensitive, and user-defined symbols have the following restrictions:

  • May contain any alphanumeric character, and the characters *, +, !, -, _, and ?.

  • May not start with a number.

  • May contain the colon character :, but not at the beginning or end of the symbol name, and may not repeat.

According to these rules, examples of legal symbol names include symbol-name, symbol_name, symbol123, *symbol*, symbol!, symbol?, and name+symbol. Examples of illegal symbol names would be 123symbol, :symbol:, symbol//name, etc.

By convention, symbol names in Clojure are usually lower-case, with words separated by the dash character (-). If a symbol is a constant or global program setting, it often begins and ends with the star character (*). For example, a program might define (def *pi* 3.14159).
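The naming rules can be tried out by defining a few vars. This is a sketch; every name below is made up for illustration and chosen so that it does not clash with a built-in function:

```clojure
;; All of these are legal symbol names.
(def max-retries 1)
(def max_retries 2)
(def retry3 3)
(def reset-all! 4)
(def ready? 5)
(def a+b 6)
(println (+ max-retries max_retries retry3 reset-all! ready? a+b))   ; prints: 21

;; Symbol names are case sensitive: these are two different vars.
(def total 10)
(def Total 20)
(println total Total)   ; prints: 10 20
```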

Symbol Resolution and Scope

When you use a symbol name as a form in your code, Clojure evaluates the symbol and returns the value bound to it. How this resolution happens depends on the scope of a symbol, and whether it is user-defined or refers to a special or built-in form.

Clojure uses the following steps in resolving symbols:

  1. Clojure determines if the symbol refers to a special form. If so, it uses it accordingly.

  2. Next, Clojure checks if the symbol is locally bound. Typically, local binding means it is a function argument or defined with let (discussed in Chapter 3). If it finds a local value, it uses it. Note that this implies that if there is a locally defined symbol and a var with the same name, evaluating the symbol name will return the value of the local symbol. Local symbols override Vars of the same name.

  3. Clojure searches the global environment for a var with the name of the symbol, and returns the value of the var if it finds one.

  4. If no value for the symbol name was found in the previous steps, Clojure throws an error: java.lang.Exception: Unable to resolve symbol: <symbol> in this context (NO_SOURCE_FILE:0). The NO_SOURCE_FILE part will be replaced with an actual file name, unless you are running from the REPL.
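Steps 2 and 3 can be seen side by side in a short sketch. It uses let, which is covered in Chapter 3, to create a local binding with the same name as a var; the name x is made up for illustration:

```clojure
;; A var named x in the global environment.
(def x 10)
(println x)                ; prints: 10  (resolved through the var)

;; Inside the let, the local binding is found first and wins.
(println (let [x 20] x))   ; prints: 20

;; Outside the let, the var is unchanged.
(println x)                ; prints: 10
```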

Namespaces

When you define a var using def, you are establishing a global binding for that symbol name to that value. However, truly global variables and symbols have long been known to be a bad idea. In a large program, it is far too easy for definitions in one part of a program to inadvertently collide with those in another, leading to difficult, extremely hard-to-find bugs.

For this reason, Vars in Clojure are all scoped by namespace. Every Var has a namespace as a (sometimes implicit) part of its name. When using a symbol to refer to a var, you can prefix the symbol name with the namespace name and a forward slash to specify the namespace.

To see this, look closely at a symbol definition in the REPL.

user=> (def first-name "Luke")
#'user/first-name
user=> user/first-name
"Luke"

Notice the prompt itself: user=>. The string user in the prompt actually refers to the current namespace. If you were working in a different namespace, it would say something different. There's nothing special about the user namespace—it's just the default. You haven't actually just defined first-name, you've defined user/first-name which you can then use to evaluate the symbol. Since you're already in the user namespace, using just first-name will also work.

Declaring Namespaces

To declare a namespace, use the ns form. ns takes a number of parameters, some of them quite advanced. In its simplest form, you can pass it one parameter, a namespace name. If the namespace doesn't already exist, ns will create it and set it as the current namespace. If there is already a namespace of that name, it will just switch to it as the current namespace.

user=> (ns new-namespace)
nil
new-namespace=>

Now, when you define a Var, it will be put into the new-namespace namespace, instead of user.

Referencing Namespaces

To reference a var in a different namespace, simply use its fully-qualified name. Observe the following REPL session:

user=> (def my-number 5)
#'user/my-number
user=> (ns other-namespace)
nil
other-namespace=> my-number
java.lang.Exception: Unable to resolve symbol: my-number in this context...
other-namespace=> user/my-number
5

Here you first define a var in the default user namespace. Then, you create a new namespace and switch to it. When you try to evaluate my-number, it causes an error, because Clojure can't find it in the current namespace. When you use the fully qualified name, however, it resolves the var and returns the value you originally bound to it. Fully qualified names can only be used to refer to Vars that already exist, though. To define a new var within a namespace, you have to be in the namespace you want to create it in.
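The same mechanism is what keeps identically named vars from colliding. In the following sketch, the namespace names sketch.billing and sketch.shipping and the var name rate are made up for illustration. Note that running it at the REPL leaves you in the sketch.shipping namespace:

```clojure
;; Two namespaces each define their own var called rate.
(ns sketch.billing)
(def rate 20)

(ns sketch.shipping)
(def rate 5)

;; The unqualified name resolves in the current namespace.
(println rate)                  ; prints: 5

;; The fully qualified name reaches the other namespace's var.
(println sketch.billing/rate)   ; prints: 20
```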

Sometimes, if you're depending heavily on another namespace, it's too much trouble to fully qualify every reference you need to make to a var in that namespace. For this scenario, Clojure provides the capability to make a namespace "include" another, using the :use parameter of ns. For example, to declare a namespace that imports all the symbols in Clojure's built-in XML library, you could do this:

user=> (ns my-namespace
            (:use clojure.xml))
my-namespace=>

Now, all the XML-related symbols are available in my-namespace. The (:use clojure.xml) form specifies that the clojure.xml namespace is to be loaded, and the symbols defined in it also imported into my-namespace. This is also very useful for dependency management: rather than requiring that you manually load clojure.xml before using it, you can use :use to specify it as a dependency on a namespace you declare. Clojure then loads it as part of the namespace declaration, if it wasn't already loaded, ensuring it is always available within your new namespace.
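The same pattern can be sketched with Clojure's built-in set library, which is easier to exercise in a few lines than the XML library. The namespace name use-sketch is made up for illustration, and the #{...} syntax is a set literal, a collection type covered later in the book:

```clojure
;; :use loads clojure.set and imports its symbols into this namespace.
(ns use-sketch
  (:use clojure.set))

;; union comes from clojure.set but can be called without a prefix.
(println (= #{1 2 3} (union #{1 2} #{2 3})))   ; prints: true
```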

In addition to :use, Clojure provides another keyword you can use in ns, :require. The usage is identical to :use; the difference is that it only ensures the required namespace is loaded and available. It doesn't import the symbols, so you still refer to them by their fully qualified names. You can also use :require to specify a list of namespaces to include. Here you include both Clojure's XML library and its set library at once:

user=> (ns my-namespace
            (:require clojure.xml
                       clojure.set))
my-namespace=>

Additionally, you can enclose the namespace in square brackets and use the :as keyword to specify a shorter alias for the namespace:

user=> (ns my-namespace
            (:require [clojure.xml :as xml]))
my-namespace=> xml/parse
#<xml$parse_7630 clojure.xml$parse_7630@1484105>

Don't worry about the messy value; it's Clojure's string representation of a function, and indicates that Clojure was able to resolve the xml/parse symbol.
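An alias can be exercised the same way with the set library. This is a sketch; the namespace name require-sketch and the alias set are choices made for illustration:

```clojure
;; :require loads clojure.set and gives it the short alias "set".
(ns require-sketch
  (:require [clojure.set :as set]))

;; Symbols are reached through the alias rather than imported.
(println (= #{2} (set/intersection #{1 2} #{2 3})))   ; prints: true

;; The full namespace name still works as well.
(println (= #{1} (clojure.set/difference #{1 2} #{2 3})))   ; prints: true
```

Because :require does not import symbols, an unqualified call to intersection would fail to resolve in this namespace.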

Structuring Source Files

How can you use namespaces to structure your source code and keep it organized? It is not difficult. By convention, each Clojure source file has its own namespace, and an ns declaration ought to be the first form within any Clojure file. This makes it easy to manage namespaces and files. It is also similar to the Java convention of one class per file. In fact, it may be helpful for Java programmers to think of namespaces as classes. They certainly do provide the ability to group relevant code together the same way classes do.

To help Clojure find namespaces when they are referenced with :use or :require, there is a particular naming convention to follow. The namespace declared in a file must match the name and location of a file within the class path. So, for example, if you have a Clojure source file at "x/y/z.clj", it ought to contain the declaration for the namespace x.y.z. When you reference x.y.z, it will know in which path and file to search for that namespace. Again, this is very similar to the Java package scheme.
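As a sketch of this convention, the following could be the contents of the file x/y/z.clj from the example, assuming the directory containing x is on the classpath. The var greeting and the namespace my-app mentioned in the comments are made up for illustration:

```clojure
;; Contents of x/y/z.clj (the directory containing "x" is assumed
;; to be on the classpath).
(ns x.y.z)

(def greeting "Hello from x.y.z")

;; Another file could then declare
;;   (ns my-app (:require x.y.z))
;; and refer to the var above as x.y.z/greeting.
```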

Summary

This is all the knowledge that is really needed to run Clojure programs. Of course, you will want to learn some tools to help make it easier to manage and run source files. In particular, classpaths can be painful to manage, and tools like Eclipse or NetBeans ease this burden. Another useful feature provided by most Clojure environments is the ability to open up a file and selectively evaluate individual forms, rather than always loading the entire file. This is remarkably valuable for rapid development, testing, and debugging of existing applications.

The important fact to remember, no matter which tool you use, is that Clojure programs consist entirely of a set of forms, which are themselves either literals, special forms, symbols, or composites of other forms. Keeping this in mind is a big step towards understanding Clojure program structure.

Also, it is important to understand symbols. Symbols are the means by which identifiers in source code are linked to actual values, and it is helpful to have a clear grasp of how they are assigned and are resolved.

Vars are frequently used in conjunction with Symbols. Vars represent a binding of a name to a value in the Clojure environment, and are scoped by namespace.

Finally, on a high level, when a program gets too big for one source file, break it into multiple files and give each one a separate namespace. You can then use the namespace dependency features to ensure that symbols are always defined where they are needed.



[3] This is the simplest way to use Clojure, but it is by no means the best. As your programs grow in size and complexity, you will almost certainly need to move to a more complete Clojure development environment that will provide help with file and classpath management, syntax highlighting, debugging, and other essential features. Plugins exist for Emacs, vi, NetBeans, Eclipse, IntelliJ IDEA, and other editors, which provide these and a variety of other capabilities.
