Welcome to Effective Akka. In this book, I will provide you with comprehensive information about what I’ve learned using the Akka toolkit to solve problems for clients in multiple industries and use cases. This is a chronicle of patterns I’ve encountered, as well as best practices for developing applications with the Akka toolkit.
This book is for developers who have progressed beyond the introductory stage of writing Akka applications and are looking to understand best practices for development that will help them avoid common missteps. Many of the tips are relevant outside of Akka as well, whether you are using another actor library, Erlang, or just plain asynchronous development. This book is not for developers who are new to Akka and are looking for introductory information.
The first question that has to be addressed is, what problems is Akka trying to solve for application developers? Primarily, Akka provides a programming model for building distributed, asynchronous, high-performance software. Let’s investigate each of these individually.
Building applications that can scale outward, and by that I mean across multiple JVMs and physical machines, is very difficult. The most critical aspects a developer must keep in mind are resilience and replication: create multiple instances of similar classes for handling failure, but in a way that also performs within the boundaries of your application’s nonfunctional requirements. Note that while these aspects are important in enabling developers to deal with failures in distributed systems, there are other important aspects, such as partitioning functionality, that are not specific to failure. There is a latency overhead associated with applications that are distributed across machines and/or JVMs due to network traffic as communication takes place between systems. This is particularly true if they are stateful and require synchronization across nodes, as every message must be serialized/marshalled, sent, received, and deserialized/unmarshalled.
In building our distributed systems, we want to have multiple servers capable of handling requests from clients in case any one of them is unavailable for any reason. But we also do not want to have to write code throughout our application focused only on the details of sending and receiving remote messages. We want our code to be declarative—not full of details about how an operation is to be done, but explaining what is to be done. Akka gives us that ability by making the location of actors transparent across nodes.
Asynchrony can have benefits both within a single machine and across a distributed architecture. In a single node, it is entirely possible to have tremendous throughput by organizing logic to be synchronous and pipelined. The Disruptor Pattern by LMAX is an excellent example of an architecture that can handle a great many events in a single-threaded model. That said, it meets a very specific use case profile: high volume, low latency, and the ability to structure consumption of a queue. If data is not coming into the producer, the disruptor must find ways to keep the thread of execution busy so as not to lose the warmed caches that make it so efficient. It also uses pre-allocated, mutable state to avoid garbage collection: very efficient, but dangerous if developers don’t know what they’re doing.
With asynchronous programming, we are attempting to avoid pinning threads of execution to a particular core, instead allowing all threads access in a varying model of fairness. We want to provide a way for the hardware to be able to utilize cores to the fullest by staging work for execution. This can lead to a lot of context switches as different threads are scheduled to do their work on cores. Context switches aren’t friendly to performance, since data must be loaded into the on-core caches of the CPU when a thread uses it. So you also need to be able to provide ways to batch asynchronous execution. This makes the implementation less fair but allows the developer to tune threads to be more cache-friendly.
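To make the batching trade-off concrete, here is a minimal sketch in plain Java. It is not Akka code: the class `BatchingMailboxSketch` and its methods are hypothetical names invented for illustration. The idea is that a worker handles up to a fixed number of queued messages per scheduling turn before yielding, so a thread keeps working on related data while it is still in cache, at the cost of making other work wait longer. Akka dispatchers expose a comparable knob through their `throughput` configuration setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Consumer;

// Illustrative sketch only: class and method names are hypothetical, not Akka APIs.
final class BatchingMailboxSketch<M> {
    private final Queue<M> mailbox = new ConcurrentLinkedQueue<>();
    private final Consumer<M> handler;
    private final int throughput; // maximum messages handled per scheduling turn

    BatchingMailboxSketch(int throughput, Consumer<M> handler) {
        if (throughput < 1) {
            throw new IllegalArgumentException("throughput must be >= 1");
        }
        this.throughput = throughput;
        this.handler = handler;
    }

    void enqueue(M message) {
        mailbox.add(message);
    }

    // One scheduling turn: handle at most `throughput` messages, then yield.
    // Returns the number of messages actually handled in this turn.
    int runTurn() {
        int processed = 0;
        M next;
        while (processed < throughput && (next = mailbox.poll()) != null) {
            handler.accept(next);
            processed++;
        }
        return processed;
    }

    public static void main(String[] args) {
        List<String> handled = new ArrayList<>();
        BatchingMailboxSketch<String> mailbox =
            new BatchingMailboxSketch<>(2, handled::add);
        for (String message : List.of("a", "b", "c", "d", "e")) {
            mailbox.enqueue(message);
        }
        int turns = 0;
        while (mailbox.runTurn() > 0) {
            turns++;
        }
        System.out.println(handled + " handled in " + turns + " turns");
    }
}
```

With five messages and a throughput of 2, the loop needs three turns. A larger throughput value means fewer turns and better cache locality for this mailbox, but longer waits for anything else sharing the thread.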
High performance is one of those loose terms that, without context, might not mean much. For the sake of this book, I want to define high performance as the ability to handle tremendous loads very fast while at the same time being fault tolerant. Building a distributed system that is extremely fast but incapable of managing failure is virtually useless: failures happen, particularly in a distributed context (network partitions, node failures, etc.), and resilient systems are able to deal with them. But no one wants to create a resilient system without being able to support reasonably fast execution.
You may have heard discussion, particularly around Typesafe, of creating reactive applications. My initial response to this word was to be cynical, having heard plenty of “marketecture” terms (words with no real architectural meaning for application development but used by marketing groups). However, the concepts espoused in the Reactive Manifesto make a strong case for what features comprise a reactive application and what needs to be done to meet this model. Reactive applications are characteristically interactive, fault tolerant, scalable, and event driven. If any one of these four elements is removed, it’s easy to see the impact on the other three.
Akka is one of the toolkits through which you can build reactive applications. Actors are event driven by nature, as communication can only take place through messages. Akka also provides a mechanism for fault tolerance through actor supervision, and is scalable by leveraging not only all of the cores of the machine on which it’s deployed, but also by allowing applications to scale outward by using clustering and remoting to deploy the application across multiple machines or VMs.
In this book, we will use an example of a large financial institution that has decided that its existing caching strategies no longer meet the real-time needs of its business. We will model the data as customers of the bank, each of whom can have multiple accounts. These accounts need to be organized by type, such as checking, savings, brokerage, etc., and a customer can have multiple accounts of each type.
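As a rough picture of that data, the following sketch models a customer with accounts grouped by type. It is an assumption-laden illustration in plain Java (16 or later, for records), not the model used in the book's chapters: the names `BankDomainSketch`, `Customer`, `Account`, `AccountType`, and `accountsByType`, as well as the fields chosen, are all hypothetical.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical names throughout; the book's own domain model may differ.
final class BankDomainSketch {

    // Account types named in the text; the "etc." there implies more exist.
    enum AccountType { CHECKING, SAVINGS, BROKERAGE }

    // Immutable value, so it is safe to share between threads.
    record Account(String id, AccountType type, long balanceInCents) {}

    record Customer(String id, List<Account> accounts) {
        Customer {
            accounts = List.copyOf(accounts); // defensive, unmodifiable copy
        }

        // Groups this customer's accounts by type.
        // Types for which the customer has no accounts are absent from the map.
        Map<AccountType, List<Account>> accountsByType() {
            return accounts.stream().collect(Collectors.groupingBy(
                Account::type,
                () -> new EnumMap<>(AccountType.class),
                Collectors.toList()));
        }
    }

    public static void main(String[] args) {
        Customer customer = new Customer("c-1", List.of(
            new Account("a-1", AccountType.CHECKING, 150_000L),
            new Account("a-2", AccountType.SAVINGS, 900_000L),
            new Account("a-3", AccountType.SAVINGS, 25_000L)));
        System.out.println(customer.accountsByType());
    }
}
```

Keeping these values immutable is a deliberate choice in the sketch: data that cannot change after construction can be passed between threads, or sent as messages, without locking.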
The following typographical conventions are used in this book:
Constant width
Constant width bold
Constant width italic
This icon signifies a tip, suggestion, or general note.
This icon indicates a warning or caution.
Supplemental material (code examples, exercises, etc.) is available for download at http://examples.oreilly.com/9781449360078-files/.
This book is here to help you get your job done. In general, if this book includes code examples, you may use the code in this book in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.
We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Effective Akka by Jamie Allen (O’Reilly). Copyright 2013 Jamie Allen, 978-1-449-36007-8.”
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at [email protected].
Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.
Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.
Safari Books Online offers a range of product mixes and pricing programs for organizations, government agencies, and individuals. Subscribers have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and dozens more. For more information about Safari Books Online, please visit us online.
Please address comments and questions concerning this book to the publisher:
O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-998-9938 (in the United States or Canada)
707-829-0515 (international or local)
707-829-0104 (fax)
We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://oreil.ly/effective-akka.
To comment or ask technical questions about this book, send email to [email protected].
For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.
Find us on Facebook: http://facebook.com/oreilly
Follow us on Twitter: http://twitter.com/oreillymedia
Watch us on YouTube: http://www.youtube.com/oreillymedia
Thanks to my wife, Yeon, and children Sophie, Layla, and James—I couldn’t have done this without their love, help, and support. And to my parents, Jim and Toni Allen, who displayed tremendous patience with me while I figured out what I was going to do with my life. Finally, thanks to Jonas Bonér, Viktor Klang, Roland Kuhn, Dragos Manolescu, and Thomas Lockney for their help and guidance.