Using low-level communications

Julia's native parallel computing model is based on two primitives: remote calls and remote references. At this level, we can give a certain worker a function with arguments to execute with remotecall, and get the result back with fetch. As a trivial example in the following code, we call upon worker 2 to execute a square function on the number 1000:

r1 = remotecall(x -> x^2, 2, 1000) 

This returns Future(2, 1, 15, nothing).

The arguments are: the function, the worker ID, and the function's arguments. Such a remote call returns immediately, thus not blocking the main worker (the REPL in this case). The main process continues executing while the remote worker does the assigned job. The remotecall function returns a variable, r1, of the Future type, which is a reference to the computed result, which we can get using fetch:

fetch(r1)

This returns 1000000.

The call to fetch will block the main process until worker 2 has finished the calculation. The main process can also run wait(r1), which also blocks until the result of the remote call becomes available. If you need the remote result immediately in the local operation, use the following command:
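On current Julia versions (0.7 and later) these primitives live in the Distributed standard library, so scripts must load it explicitly. A minimal self-contained sketch (the number of workers added here is an arbitrary choice for illustration):

```julia
using Distributed   # remotecall, fetch, wait, and Future live here since Julia 0.7
addprocs(2)         # start two local worker processes

r = remotecall(x -> x^2, workers()[1], 1000)  # returns a Future immediately
wait(r)             # block until the remote computation has finished
println(isready(r)) # true: the result can now be fetched without blocking
println(fetch(r))   # 1000000
```

After wait returns, isready(r) is true and the subsequent fetch completes without any further blocking.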

remotecall_fetch(sin, 5, 2pi) 

Which returns -2.4492935982947064e-16.

This is more efficient than fetch(remotecall(...)), because the call and the fetch are combined into a single round trip.
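The difference can be sketched as follows; both forms give the same value, but remotecall_fetch needs only one network round trip (the choice of worker here is an assumption for illustration):

```julia
using Distributed
addprocs(1)
w = first(workers())

a = remotecall_fetch(sin, w, 2pi)   # call and fetch in one round trip
b = fetch(remotecall(sin, w, 2pi))  # two steps: obtain a Future, then fetch it
println(a == b)                     # true
```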

You can also use the @spawnat macro, which evaluates the expression in the second argument on the worker specified by the first argument:

r2 = @spawnat 4 sqrt(2) # lets worker 4 calculate sqrt(2)
fetch(r2) # returns 1.4142135623730951

This is made even easier with @spawn, which only needs an expression to evaluate, because it decides for itself where it will be executed: r3 = @spawn sqrt(5) returns a Future, and fetch(r3) returns 2.23606797749979.

To execute a certain function on all the workers, we can use a comprehension:

r = [@spawnat w sqrt(5) for w in workers()]
fetch(r[3]) # returns 2.23606797749979
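Rather than fetching a single element, you can fetch every Future in the array at once by broadcasting fetch over it; a short sketch (assuming a few local workers have been added):

```julia
using Distributed
addprocs(3)

r = [@spawnat w sqrt(5) for w in workers()]
results = fetch.(r)   # fetch each Future; one result per worker
println(results)      # every entry is 2.23606797749979
```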

To execute the same statement on all the workers, we can also use the @everywhere macro:

@everywhere println(myid())
1
From worker 2: 2
From worker 3: 3
From worker 4: 4
From worker 7: 7
From worker 5: 5
From worker 6: 6
From worker 8: 8
From worker 9: 9

All the workers correspond to different processes; they therefore do not share variables, for example:

x = 5                  #> 5
@everywhere println(x) #> 5 (on the master)
# exception on worker 2: ERROR: UndefVarError: x not defined ...
# ...and 11 more exception(s)

The x variable is only known in the main process; all the other workers report an ERROR: UndefVarError: x not defined message.

@everywhere can also be used to make data available on all processes; for example, @everywhere w = 8 defines the w variable on every worker.

The following example makes a defs.jl source file available to all the workers:

@everywhere include("defs.jl") 

Or, more explicitly, you can define a function, such as fib(n), on all the workers at once, as follows:

@everywhere function fib(n)
    if n < 2
        return n
    else
        return fib(n - 1) + fib(n - 2)
    end
end
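After the @everywhere definition, fib is known on every process and can be invoked remotely; a small self-contained sketch (the worker count and the argument 10 are arbitrary choices):

```julia
using Distributed
addprocs(2)

@everywhere function fib(n)
    n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# fib is now defined on the master and on every worker:
println(remotecall_fetch(fib, first(workers()), 10))  # 55
```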

In order to be able to perform its task, a remote worker needs access to the function it executes. You can make sure that all workers know about the functions they need by loading the functions.jl source code with include, making it available to all workers:

include("functions.jl")

In a cluster, the contents of this file (and any files loaded recursively) will be sent over the network.

A best practice is to separate your code into two files: one file (functions.jl) that contains the functions and parameters that need to be run in parallel, and the other file (driver.jl) that manages the processing and collects the results. Use the include("functions.jl") command in driver.jl to import the functions and parameters to all processors.

An alternative is to specify that the files load on the command line. If you need the file1.jl and file2.jl source files on all the n processors at startup time, use the julia -p n -L file1.jl -L file2.jl driver.jl syntax, where driver.jl is the script that organizes the computations.

Data-movement between workers (such as when calling fetch) needs to be reduced as much as possible in order to get performance and scalability.

If every worker needs to know the d variable, this can be broadcast to all processes with the following code:

for pid in workers()
    remotecall(x -> (global d; d = x; nothing), pid, d)
end

Each worker then has its local copy of data. Scheduling the workers is done with tasks (refer to the Tasks section of Chapter 4, Control Flow), so that no locking is required; for example, when a communication operation such as fetch or wait is executed, the current task is suspended, and the scheduler picks another task to run. When the wait event completes (for example, the data shows up), the current task is restarted.
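This task-based scheduling means several remote calls can be kept in flight at once: wrapping each blocking remotecall_fetch in @async lets the scheduler switch between the suspended tasks, and @sync waits until all of them have completed. A sketch under the assumption of two local workers:

```julia
using Distributed
addprocs(2)

results = Dict{Int,Float64}()
@sync for w in workers()
    # each @async task suspends on remotecall_fetch without blocking the others
    @async results[w] = remotecall_fetch(sqrt, w, 2.0)
end
println(results)   # one sqrt(2) entry per worker
```

No locking is needed around the Dict because the tasks are cooperatively scheduled on a single thread.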

In many cases, however, you do not have to specify or create processes to do parallel programming in Julia, as we will see in the next section.
