Preface

As Hadoop has grown into mainstream popularity, so has its vibrant ecosystem, including widely used tools such as Hive, Spark, Impala, and HBase. This book focuses on one of those tools: Apache HBase, a scalable, fault-tolerant, low-latency data store that runs on top of the Hadoop Distributed Filesystem (HDFS). HBase merges Hadoop scalability with real-time data serving. At scale, HBase allows for millions of read and write operations per second from a single cluster, while still maintaining all of Hadoop’s availability guarantees. HBase quickly grew in popularity and now powers some of the largest Hadoop deployments on the planet—it is used by companies such as Apple, Salesforce.com, and Facebook.

However, getting started with HBase can be a daunting task. While there are numerous resources that can help get a developer started (including mailing lists, an online book, and Javadocs), information about architecting, designing, and deploying real-world applications using Apache HBase is quite limited. That’s where this book comes in.

The goal of the book is to bring real-world HBase deployments to life. Each use case discussed in this book has been deployed and put into production. This doesn’t mean there isn’t room for improvement, or that you won’t need to modify a design for your particular task, but it does show how things have actually been done.

The book also includes robust coverage of troubleshooting (Part III). Our goal is to help you avoid common deployment mistakes. Part III also offers insight into often-overlooked tuning areas such as garbage collection and region allocation.

Who Should Read This Book?

Architecting HBase Applications is designed for architects, developers, and those looking to get a better idea of big data application deployment in general. You should have basic knowledge of Hadoop, including the components needed for setting up and installing a successful Hadoop cluster. We will not spend time on Hadoop configurations or NodeManager actions. Architects reading this book are not required to have a full working knowledge of Java, but that knowledge will be necessary to fully grasp the deployment chapters. The book covers multiple vertical use cases and is intended to assist enterprises and startups alike.

Architects will appreciate the detail-oriented use case chapters, which outline the individual components being deployed and how they are all tied together. The development chapters offer developers a quick look at detailed code examples to speed up production deployments. The deployment chapters offer insight into the specific APIs being used, along with performance enhancement tips that can save hours of troubleshooting. Those curious about big data will find both the architecture and deployment chapters useful, and will also gain insight into the HBase ecosystem and what it takes to deploy HBase.

How This Book Is Organized

Architecting HBase Applications is organized into three parts. Part I, the introduction to HBase, covers topics such as what HBase is, what its ecosystem looks like, and how to deploy it. Part II, which covers the use cases, is the heart of the book. We hope this will be the part you refer back to most frequently, as it contains tips and tricks that will prove useful to you. Finally, Part III discusses troubleshooting. We hope this will be the second-most referenced part, and that you will consult it proactively rather than reactively. It offers insights about controlling your region count, properly tuning garbage collection, and avoiding hotspots.

Additional Resources

This book is not designed to cover the internals of HBase. Our good friend Lars George has taken HBase internals to a whole new level with HBase: The Definitive Guide (O’Reilly, 2011). We recommend reading his book as a precursor to ours. It will help you better understand some terminology that we only gloss over with a paragraph or two.

While we spend a fair bit of time discussing the details of deploying HBase, our book will not cover much of the theory behind it. Nick Dimiduk and Amandeep Khurana’s HBase in Action (Manning, 2013) covers the practicalities of deploying HBase. Their book is less focused on end-to-end application development and spends more time on production best practices.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic

Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width

Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold

Shows commands or other text that should be typed literally by the user.

Constant width italic

Shows text that should be replaced with user-supplied values or by values determined by context.

Tip

This element signifies a tip or suggestion.

Note

This element signifies a general note.

Warning

This element indicates a warning or caution.

Using Code Examples

Supplemental material (code examples, exercises, etc.) is available for download at https://github.com/ArchitectingHBase/examples.

This book is here to help you get your job done. In general, if example code is offered with this book, you may use it in your programs and documentation. You do not need to contact us for permission unless you’re reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. Selling or distributing a CD-ROM of examples from O’Reilly books does require permission. Answering a question by citing this book and quoting example code does not require permission. Incorporating a significant amount of example code from this book into your product’s documentation does require permission.

We appreciate, but do not require, attribution. An attribution usually includes the title, author, publisher, and ISBN. For example: “Architecting HBase Applications by Jean-Marc Spaggiari and Kevin O’Dell (O’Reilly). Copyright 2016 Jean-Marc Spaggiari and Kevin O’Dell, 978-1-491-91581-3.”

If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at .

Safari® Books Online

Note

Safari Books Online is an on-demand digital library that delivers expert content in both book and video form from the world’s leading authors in technology and business.

Technology professionals, software developers, web designers, and business and creative professionals use Safari Books Online as their primary resource for research, problem solving, learning, and certification training.

Safari Books Online offers a range of plans and pricing for enterprise, government, education, and individuals.

Members have access to thousands of books, training videos, and prepublication manuscripts in one fully searchable database from publishers like O’Reilly Media, Prentice Hall Professional, Addison-Wesley Professional, Microsoft Press, Sams, Que, Peachpit Press, Focal Press, Cisco Press, John Wiley & Sons, Syngress, Morgan Kaufmann, IBM Redbooks, Packt, Adobe Press, FT Press, Apress, Manning, New Riders, McGraw-Hill, Jones & Bartlett, Course Technology, and hundreds more. For more information about Safari Books Online, please visit us online.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

  • O’Reilly Media, Inc.
  • 1005 Gravenstein Highway North
  • Sebastopol, CA 95472
  • 800-998-9938 (in the United States or Canada)
  • 707-829-0515 (international or local)
  • 707-829-0104 (fax)

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at http://bit.ly/architecting-hbase-applications.

To comment or ask technical questions about this book, send email to .

For more information about our books, courses, conferences, and news, see our website at http://www.oreilly.com.

Find us on Facebook: http://facebook.com/oreilly

Follow us on Twitter: http://twitter.com/oreillymedia

Watch us on YouTube: http://www.youtube.com/oreillymedia

Acknowledgments

Kevin and Jean-Marc would like to thank everyone who made this book a reality for all of their hard work: our amazing editor Marie Beaugureau; the exceptional staff at O’Reilly Media; Lars Hofhansl for composing the foreword; our primary reviewers Nate Neff, Suzanne McIntosh, Jeff “Jeffrey” Holoman, Prateek Rungta, Jon Hsieh, Sean Busbey, and Nicolae Popa; and our fellow authors Ben Spivey, Joey Echeverria, Ted Malaska, Gwen Shapira, Lars George, Eric Sammer, Amandeep Khurana, and Tom White for their unending support and guidance. We would also like to thank Linden Hillenbrand, Eric Driscoll, Ron Beck, Paul Beduhn, Matt Jackson, Ryan Blue, Aaron “ATM” Meyers, Dave Shuman, Ryan Bosshart (thank you for all the hoops you jumped through), Jean-Daniel “JD” Cryans, St. Ack, Elliot Clark, Harsh J Chouraria, Amy O’Connor, Patrick Angeles, and Alex Moundalexis. Finally, we’d like to thank everyone at Cloudera and Rocana for their support, advice, and encouragement along the way.

From Kevin

I would like to thank my best friends and brothers for the lifetime of support and encouragement: Matthew “Kabuki” Langrehr, Scott Hopkins, Paul Bernier, Zack Myers, Matthew Ring, Brian Clay, Chris Holt, Cole Sillivant, Viktor “Shrek” Skowronek, Kyle Prawdzik, and Master Captain Matt Jones. I would also like to thank my friends, coworkers, partners, and customers who helped along the way: Ron Kent, John “Over the Top” Lynch, Brian Burton, Mark Schnegelberger, David Hackett, Sekou McKissick, Scott Burkey, David Rahm, Steve Williams, Nick Preztak, Steve “Totty” Totman, Brock Noland, Josh “Nooga” Patterson, Shawn Dolley, Stephen Fritz, Richard Saltzer, Ryan P, and Sam Heywood. A special thanks to everyone who helped by publishing their use case or consulting on content: Kathleen DeValk, Kevin Farmer, Raheem Daya, Tomas Mazukna, Chris Ingrassia, Kevin Sommer, and Jeremy Ulstad.

A special thanks goes to Mike Olson and Angus Klein for taking a chance and hiring me at Cloudera, and to Eric Sammer, Omer Trajman, and Marc “Boat Ready” Degenkolb for bringing me onto Rocana. A begrudging thanks to Don “I am a millennial-hating baby” Brown. One final thank you to Jean-Marc: I can’t thank you enough for asking me to be your coauthor.

From Jean-Marc

I would like to thank everyone who supported me over the journey.
