LinkedIn created Azkaban as a workflow manager, mainly to solve the problem of time-based, dependency-aware scheduling of Hadoop batch jobs. It differs from Luigi in at least two respects:
- It's written in Java
- Users can create schedules through a web-browser GUI
Azkaban consists of three main components:
- A relational database: A MySQL database that stores the state of workflows
- AzkabanWebServer: Handles project management, scheduling and monitoring of job executions, and security (authentication), and provides the UI through which users schedule and watch job executions
- AzkabanExecutorServer: Component responsible for handling job executions
Features of Azkaban:
- Compatible with any version of Hadoop
- Supports simple workflow uploads through the web UI or over HTTP
- Modular, with plugins for the components of the Hadoop ecosystem
- Tracks user actions, authentication, and authorization
- Provides a separate workspace for each new project
- Provides email alerts on SLAs, failures, and successes
- Allows users to retry failed jobs
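To make the dependency and upload features more concrete, here is a minimal Python sketch that packages a two-job flow in Azkaban's classic job-file format: one `.job` properties file per job, with a `dependencies` key naming the jobs that must finish first. The job names, the commands, and the output file name `demo_flow.zip` are hypothetical. The `retries` and `retry.backoff` keys are given as they are commonly documented for Azkaban and should be checked against the version in use.

```python
import io
import zipfile

# Hypothetical two-job flow: "load" runs only after "extract" succeeds.
# Keys follow Azkaban's classic .job properties format (an assumption to
# verify against the deployed Azkaban version).
JOBS = {
    "extract": {
        "type": "command",
        "command": "echo extracting",
    },
    "load": {
        "type": "command",
        "command": "echo loading",
        "dependencies": "extract",   # comma-separated upstream job names
        "retries": "2",              # retry a failed job up to twice
        "retry.backoff": "30000",    # wait 30 s (in ms) between attempts
    },
}


def render_job(props):
    """Render one job definition as key=value lines."""
    return "".join(f"{key}={value}\n" for key, value in props.items())


def build_project_zip(jobs):
    """Return the bytes of a zip archive holding one <name>.job file per job."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, props in jobs.items():
            zf.writestr(f"{name}.job", render_job(props))
    return buf.getvalue()


if __name__ == "__main__":
    # Write the archive to disk; it can then be uploaded to a project
    # through the Azkaban web UI.
    with open("demo_flow.zip", "wb") as fh:
        fh.write(build_project_zip(JOBS))
```

Uploading the resulting archive to a project lets Azkaban derive the flow graph from the `dependencies` entries, so the execution order does not need to be stated anywhere else.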