One of the questions I encounter the most when speaking at conferences is, “What is a use case for an Actor-based application?” That depends on what you’re trying to accomplish, but if you want to build an application that can manage concurrency, scale outward across nodes, and be fault tolerant, actors are a good fit for the role.
In a domain-driven actor application, actors live and die to represent the state of the world in a live cache, where the mere existence of these actors and the state they encapsulate represent the data for your application. They are frequently used in systems where information is provisioned to multiple other servers in an eventually consistent fashion. This implies that an actor attempting to supply another server may not be able to do so at a given point, and therefore must retry until it succeeds.
For example, imagine a large financial institution trying to keep a real-time view of all of its customers, with all of their accounts and all of the investments that customer owns via each account at a given time. This information can be created and maintained live through actor-supervisor hierarchies.
This kind of real-time domain modeling, where you are in essence creating a cache that also contains behavior, is enabled by the lightweight nature of Akka actors. Because Akka actors share resources (such as threads), each instance only takes about 400 bytes of heap space before you begin adding state for your domain. It is plausible that one server could contain the entire business domain for a large corporation represented in Akka actors.
The added benefit of using actors for this kind of domain modeling is that they also introduce fault tolerance: you have the ability to use Akka’s supervision strategies to ensure high uptime for your system, as opposed to simple caches of domain objects where exceptions have to be handled at the service layer. An example can be found in Figure 1-1.
And this truly can fit into Eric Evans’ “Domain-Driven Design” paradigm. Actors can represent concepts described in the domain-driven approach, such as entities, aggregates, and aggregate roots. You can design entire bounded contexts with actors. When we get to the use cases that illustrate the patterns, I’ll show you how.
When you build a hierarchy of domain objects represented as actors, they need to be notified about what is happening in the world around them. This is typically represented as messages passed as “facts” about an event that has occurred. While this is not a rule per se, it is a best practice to keep in mind. The domain should be responding to external events that change the world that it is modeling, and it should morph itself to meet those changes as they occur. And if something happens that prevents the domain actors from representing those changes, they should be written to eventually find consistency with them:
// An example of a fact message
case class AccountAddressUpdated(accountId: Long, address: AccountAddress)
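To make that concrete, here is a minimal sketch of a domain actor that encapsulates an account’s address and morphs its state when it receives that fact. The AccountActor name and the fields of AccountAddress are assumptions for illustration, not types from any particular codebase:
import akka.actor.Actor

// A placeholder address type, assumed for this example
case class AccountAddress(street: String, city: String, postalCode: String)

// A domain actor whose encapsulated state is part of the live cache
class AccountActor(accountId: Long) extends Actor {
  // The address this actor currently believes the account has
  private var currentAddress: Option[AccountAddress] = None

  def receive = {
    // Respond to the fact by morphing the encapsulated state
    case AccountAddressUpdated(id, address) if id == accountId =>
      currentAddress = Some(address)
  }
}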
In a worker-actor scenario, actors are stateless and receive messages that contain state, upon which they will perform some predefined action and return a new representation of some state. That is the most important differentiation between worker actors and domain actors: worker actors are meant for parallelization or for separating dangerous tasks into actors built specifically for that purpose, and the data upon which they act is always provided to them. Domain actors, introduced in the previous section, represent a live cache where the existence of the actors and the state they encapsulate are a view of the current state of the application. There are varying strategies for how this can be implemented, each with its own benefits and use cases.
In Akka, routers are used to spawn multiple instances of one actor type so that work can be distributed among them. Each instance of the actor has its own mailbox, and therefore this cannot be considered a “work-stealing” implementation. There are several strategies that can be used for this task, described in the following sections.
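As a point of reference, here is a minimal sketch of how a router is created with the Akka 2.1-era API, where each strategy described below has a corresponding router class in akka.routing (RandomRouter, RoundRobinRouter, SmallestMailboxRouter, BroadcastRouter, ScatterGatherFirstCompletedRouter, ConsistentHashingRouter). The Worker actor and the “workers” name are assumptions for the example:
import akka.actor.{ Actor, ActorSystem, Props }
import akka.routing.RoundRobinRouter

// A routee type; what it does with each message is irrelevant to the router
class Worker extends Actor {
  def receive = {
    case work => // perform the task described by the message
  }
}

object RouterExample extends App {
  val system = ActorSystem("routing-example")
  // Five routees behind one router; swap in another router class to change strategy
  val workers = system.actorOf(
    Props[Worker].withRouter(RoundRobinRouter(nrOfInstances = 5)), "workers")
  workers ! "a task message"
}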
Random is a strategy where messages are distributed to the actors in a random fashion, which isn’t one I favor. There was a recent discussion about a startup using Heroku Dynos (virtual server instances) where requests were distributed to each dyno randomly, which meant that even if users scaled up the number of dynos to handle more requests, they had no guarantee that the new dynos would receive any requests or that the load would be evenly distributed. That said, random routing is the only strategy that does not incur a routing bottleneck, as nothing must be checked before the message is forwarded. And if you have a large number of messages flowing through your router, that can be a useful tradeoff.
Look at Figure 1-2. If I have five routees and use a random strategy, one routee may have no items in its mailbox (like #3), while another might have a bunch (#2). And the next message could be routed to routee #2 as well.
Round robin is a strategy where messages are distributed to each actor instance in sequence as though they were in a ring, which is good for even distribution. It spreads work sequentially amongst the routees and can be an excellent strategy when the tasks to be performed by all routees are always the same and CPU-bound. This assumes that all considerations between the routees and the boxes on which they run are equal: thread pools have threads to use for scheduling the tasks, and the machines have cores available to execute the work.
In Figure 1-3, the work has been distributed evenly, and the next message will go to routee #3.
Smallest mailbox is a strategy that distributes a message to the actor instance with the smallest mailbox. This may sound like a panacea, but it isn’t. The actor with the smallest mailbox may have the fewest messages enqueued because the tasks it is being asked to perform take longer than those of the other actors. And by placing the message into its mailbox, it may actually take longer to be processed than if that work had been distributed to an actor with more messages already enqueued. Like the round-robin router, this strategy is useful for routees that always handle the exact same work, but where the work is blocking in nature: for example, IO-bound operations where there can be varying latencies.
The smallest mailbox strategy does not work for remote actors. The router does not know the size of the mailbox with remote routees.
In Figure 1-4, the work will be distributed to routee #4, the actor with the fewest messages in its mailbox. This happens regardless of whether it will be received and handled faster than if it were sent to #1, which has more items enqueued but work that could take less time.
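Because this strategy suits blocking, IO-bound routees, it may be worth pairing it with a dedicated dispatcher so that work stays off the default thread pool. A minimal sketch, reusing the Worker and system from the earlier router example; “blocking-io-dispatcher” is a hypothetical dispatcher you would define in your configuration:
import akka.routing.SmallestMailboxRouter

// Routees performing blocking work, isolated on their own (assumed) dispatcher
val ioWorkers = system.actorOf(
  Props[Worker]
    .withDispatcher("blocking-io-dispatcher")
    .withRouter(SmallestMailboxRouter(nrOfInstances = 5)),
  "ioWorkers")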
Broadcast is a strategy where messages are sent to all instances of the actor the router controls. It’s good for distributing work to multiple nodes that may each have different tasks to perform with it, or for fault tolerance: handing the same task to several nodes that will all perform the same work, in case any failures occur.
Since all routees under the router will receive the message, their mailboxes should theoretically be equally full/empty. The reality is that how you apply the dispatcher for fairness in message handling (by tuning the “throughput” configuration value) will determine this. Try not to think of routers where the work is distributed evenly as bringing determinism to your system: it just means that work is evenly spread but could still occur in each routee at varying times. See Figure 1-5 for an example.
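As a sketch, again assuming the Worker and system from the earlier example, a broadcast router delivers a copy of each message to every routee:
import akka.routing.BroadcastRouter

val replicas = system.actorOf(
  Props[Worker].withRouter(BroadcastRouter(nrOfInstances = 5)), "replicas")
// Every one of the five routees receives its own copy of this message
replicas ! "message for all routees"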
Scatter-gather first completed is a strategy where messages are sent to all instances of the actor the router controls, but only the first response from any of them is handled. This is good for situations where you need a response quickly and want to ask multiple handlers to try to produce it for you. In this way, you don’t have to worry about which routee has the least amount of work to do, or whether the routee with the fewest tasks queued will actually finish yours sooner than one that already has more messages to handle: you simply take whichever response arrives first.
This is particularly useful if the routees are spread among multiple JVMs or physical boxes. Each of those boxes might be utilized at varying rates, and you want the work to be performed as quickly as possible without trying to manually figure out which box is currently doing the least work. Worse, even if you did check to see if a box was the least busy, by the time you figured out which box it was and sent the work, it could be loaded down chewing through other work.
In Figure 1-6, I’m sending the work across five routees. I only care about whichever of the five completes the work first and responds. This trades some potential network latency (if the boxes are more than one network hop away) and extra CPU utilization (as each of the routees has to do the work) for getting the response as quickly as possible.
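A minimal sketch of that pattern with the Akka 2.1-era API, assuming the routees reply to the sender and reusing the Worker and system from the earlier example. The ask returns a Future completed by whichever routee answers first, and the within parameter bounds how long the router waits:
import akka.pattern.ask
import akka.routing.ScatterGatherFirstCompletedRouter
import akka.util.Timeout
import scala.concurrent.duration._

implicit val timeout = Timeout(5.seconds)

val scatterGather = system.actorOf(
  Props[Worker].withRouter(
    ScatterGatherFirstCompletedRouter(nrOfInstances = 5, within = 5.seconds)),
  "scatterGather")
// Completed by the first routee to respond; the other replies are discarded
val firstAnswer = scatterGather ? "who can answer fastest?"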
Consistent hashing is a new routing strategy, added in Akka 2.1. In some cases, you want to be certain that you understand which routee will handle specific kinds of work, possibly because you have a well-defined Akka application on several remote nodes and you want to be sure that work is sent to the closest server to avoid latency. It will also be relevant to cluster-aware routing in a similar way. This is powerful because you know that, by hash, work will most likely be routed to the same routee that handled earlier versions of the same work. Consistent hashing, by definition, does not guarantee even distribution of work.
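One way to express the hash with this router, in the Akka 2.1 API, is to have the message itself extend ConsistentHashable. A minimal sketch, again reusing the Worker and system from earlier; the AccountBalanceUpdated message is a hypothetical example, and messages with the same accountId should keep landing on the same routee:
import akka.routing.ConsistentHashingRouter
import akka.routing.ConsistentHashingRouter.ConsistentHashable

// The message supplies its own hash key
case class AccountBalanceUpdated(accountId: Long, balance: BigDecimal)
  extends ConsistentHashable {
  override def consistentHashKey = accountId
}

val hashedWorkers = system.actorOf(
  Props[Worker].withRouter(ConsistentHashingRouter(nrOfInstances = 5)),
  "hashedWorkers")
hashedWorkers ! AccountBalanceUpdated(42L, BigDecimal("100.00"))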
I mentioned earlier that the actors in a router do not share mailboxes, and therefore work stealing is not possible even with the varying strategies that are available. Akka used to solve this problem with the BalancingDispatcher, where all actors created with that dispatcher share one mailbox and, in doing so, can grab the next message when they have finished their current work. Work stealing is an extremely powerful concept, and because the implementation required using a specific dispatcher, it also isolated the workers on their own thread pool, which is extremely important for avoiding actor starvation.
However, the BalancingDispatcher has been found to be quirky and is not recommended for general usage, given its exceptional and somewhat unexpected behavior. It is going to be deprecated shortly in favor of a new router type, in an upcoming version of Akka, to handle work-stealing semantics, but that is as yet undefined. The Akka team does not recommend using BalancingDispatcher, so stay away from it.
When you are distributing work among actors, you typically will send commands to which the actors can respond and thus complete the task. The message includes the data required for the actor to perform the work, and you should refrain from putting state required to complete the computation into the actor. The task should be idempotent: any of the many routee instances could handle the message, and you should always get the same response given the same input, without side effects:
// An example of a command message
case class CalculateSumOfBalances(balances: List[BigDecimal])
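A minimal sketch of a routee that handles this command; the BalanceSummingActor name is an assumption, and the point is that everything needed to compute the result arrives in the message itself:
import akka.actor.Actor

class BalanceSummingActor extends Actor {
  def receive = {
    // Idempotent: the same list of balances always yields the same sum,
    // and no state is kept in the actor between messages
    case CalculateSumOfBalances(balances) =>
      sender ! balances.sum
  }
}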