Flume architecture principles

For any technology to succeed, it needs clearly defined architecture principles that guide its initial design and its later evolution. Flume is one such piece of software. The following are some of the architecture principles on which Flume was designed (some of them were introduced as part of Flume NG):

  • Reliability: The ability to keep accepting stream data and events without losing any data across a variety of failure scenarios (mostly partial failures). Fault tolerance is a core architecture principle that Flume takes very seriously: if some components fail or misbehave, hardware issues arise, or the network or bandwidth degrades, Flume in most cases treats these as facts of life and carries on with its main job rather than shutting down completely. Flume guarantees that data reaching the Flume Agent will eventually be handed over to the next component, as long as the agent is kept running. Settings are available to control the reliability level. Keep the trade-off in mind: the higher the reliability level, the lower the scalability.
  • Scalability: Flume can handle more stream data with changes to the hardware topology alone. It scales horizontally: additional machines can be added to handle increased message throughput. The architecture section covers the components that need to change when scaling horizontally. Scalability does, however, depend on the destination system's ability to keep absorbing the data coming out of the pipeline, and that can at times be the limiting factor in how far a Flume deployment can scale.
  • Manageability: The ability to centrally manage all components of the solution is key to the successful installation of any architecture in production. Apache Flume, through the Flume master component (explained in detail in the next section), allows all components to be managed centrally using defined settings controlled through a web interface or the Flume command-line interface.
  • Extensibility: A very important principle, which allows the technology to integrate with a variety of source and destination systems. This is a mandatory requirement and one of the core principles behind how Flume was designed and architected. It is achieved mainly by using built-in connectors, or writing new ones, to connect to Flume on both the input and the output side.
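
As a rough illustration of the reliability settings mentioned under the first principle, the following is a minimal sketch of an agent configuration in properties format. It assumes a Flume NG agent; the agent and component names (agent1, src1, ch1, sink1), the port, and the directory paths are hypothetical and chosen only for this example. Picking the durable file channel over the in-memory channel is one way such a setting can favor reliability at the cost of throughput.

```properties
# Hypothetical agent "agent1" with one source, one channel, and one sink.
agent1.sources = src1
agent1.channels = ch1
agent1.sinks = sink1

# Source: accepts newline-terminated text events on a local TCP port.
agent1.sources.src1.type = netcat
agent1.sources.src1.bind = localhost
agent1.sources.src1.port = 44444
agent1.sources.src1.channels = ch1

# Channel: the file channel persists events to disk, so events buffered
# in the channel survive an agent restart. The paths are assumptions.
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /var/flume/checkpoint
agent1.channels.ch1.dataDirs = /var/flume/data

# Faster but less reliable alternative: keep events in memory only.
# agent1.channels.ch1.type = memory
# agent1.channels.ch1.capacity = 10000
# agent1.channels.ch1.transactionCapacity = 100

# Sink: writes events to the agent log, useful for trying things out.
agent1.sinks.sink1.type = logger
agent1.sinks.sink1.channel = ch1
```

With the memory channel, events still buffered in the channel are lost if the agent process stops, which is the kind of trade-off the reliability principle describes.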

These are some of the core architecture principles on which Flume was built; the following sections clarify many of these aspects further.
