Chapter 7. Recap and Where to Go from Here

Fast data is the natural evolution of big data to a stream-oriented workflow that allows for more rapid information extraction and exploitation, while still enabling classic batch-mode analytics, data warehousing, and interactive queries.

Long-running streaming jobs raise the bar for a fast data architecture: it must stay resilient, scale up and down on demand, remain responsive, and adapt as tools and techniques evolve.

There are many tools with various levels of support for sophisticated stream processing semantics, other features, and deployment scenarios. I didn’t discuss all the possible engines. I omitted those that appear to be declining in popularity, such as Storm and Samza, as well as newer but still obscure options. There are also many commercial tools that are worth considering. However, I chose to focus on the current open source choices that seem most important, along with their strengths and weaknesses.

I encourage you to explore the links to additional information throughout this report and in the next section. Form your own opinions and let me know what you discover and the choices you make. You can reach me through email, [email protected], and on Twitter, @deanwampler.

At Lightbend, we’ve been working hard to build tools, techniques, and expertise to help our customers succeed with fast data. Please visit us at lightbend.com/fast-data-platform for more information.

Additional References

The following references, some of which were mentioned already in the report, are very good for further exploration:

  • Tyler Akidau, “The World Beyond Batch: Streaming 101,” August 5, 2015, O’Reilly.

  • Tyler Akidau, “The World Beyond Batch: Streaming 102,” January 20, 2016, O’Reilly.

  • Tyler Akidau, Slava Chernyak, and Reuven Lax, Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing (Sebastopol, CA: O’Reilly, 2018).

  • Martin Kleppmann, Making Sense of Stream Processing (Sebastopol, CA: O’Reilly, 2016).

  • Martin Kleppmann, Designing Data-Intensive Applications (Sebastopol, CA: O’Reilly, 2017).

  • Gwen Shapira, Neha Narkhede, and Todd Palino, Kafka: The Definitive Guide (Sebastopol, CA: O’Reilly, 2017).

  • Michael Nash and Wade Waldron, Applied Akka Patterns: A Hands-on Guide to Designing Distributed Applications (Sebastopol, CA: O’Reilly, 2016).

  • Jay Kreps, I Heart Logs (Sebastopol, CA: O’Reilly, 2014).

  • Justin Sheehy, “There Is No Now,” ACM Queue 13, no. 3 (2015), https://queue.acm.org/detail.cfm?id=2745385.

Other O’Reilly-published reports authored by Lightbend engineers and available for free at lightbend.com/ebooks:

  • Gerard Maas, Stavros Kontopoulos, and Sean Glover, Designing Fast Data Application Architectures (Sebastopol, CA: O’Reilly, 2018).

  • Boris Lublinsky, Serving Machine Learning Models: A Guide to Architecture, Stream Processing Engines, and Frameworks (Sebastopol, CA: O’Reilly, 2017).

  • Jonas Bonér, Reactive Microsystems: The Evolution of Microservices at Scale (Sebastopol, CA: O’Reilly, 2017).

  • Jonas Bonér, Reactive Microservices Architecture: Design Principles for Distributed Systems (Sebastopol, CA: O’Reilly, 2016).

  • Hugh McKee, Designing Reactive Systems: The Role of Actors in Distributed Architecture (Sebastopol, CA: O’Reilly, 2016).
