Making Gantt charts

One widely used visualization of time-based data is the Gantt chart. Named after the mechanical engineer Henry Gantt, who invented it in the 1910s, it is almost exclusively used to visualize work breakdown structures in project management. This chart is loved by managers for its descriptive value and not so loved by employees, especially when the project deadline is near.

This kind of chart is very straightforward; almost everyone can read and understand it, even if it is overloaded with additional (related and unrelated) information.

A basic Gantt chart has a time series on the X axis and a set of labels that represent tasks or subtasks on the Y axis. Task duration is usually visualized either as a line or as a bar chart, extending from the start to end time of a given task.

If subtasks are present, one or more subtasks belong to a parent task, in which case the total time of the parent task is aggregated from its subtasks in such a way that overlapping and gap time are accounted for.
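
The aggregation described above can be sketched as an interval merge. This is not part of the recipe's Gantt class (which does not support nested tasks); the helper names parse and summarize_parent are hypothetical, and the sketch assumes a non-empty list of subtasks that use the same timestamp format as the recipe's test data.

```python
from datetime import datetime, timedelta


def parse(ts):
    """Parse a timestamp in the 'YYYY-MM-DD HH:MM:SS' format."""
    return datetime.strptime(ts, '%Y-%m-%d %H:%M:%S')


def summarize_parent(subtasks):
    """Return (start, end, busy) for a parent task (hypothetical helper).

    start and end span all subtasks; busy is the time covered by at
    least one subtask, so overlaps count once and gaps are excluded.
    """
    intervals = sorted((parse(t['start']), parse(t['end'])) for t in subtasks)
    merged = [list(intervals[0])]
    for start, end in intervals[1:]:
        if start <= merged[-1][1]:
            # Overlaps (or touches) the previous block: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # A gap: start a new block.
            merged.append([start, end])
    busy = sum((e - s for s, e in merged), timedelta())
    return merged[0][0], merged[-1][1], busy


subtasks = [
    {'label': 'Research',    'start': '2013-10-01 12:00:00', 'end': '2013-10-02 18:00:00'},
    {'label': 'Compilation', 'start': '2013-10-02 09:00:00', 'end': '2013-10-02 12:00:00'},
    {'label': 'Meeting #1',  'start': '2013-10-03 12:00:00', 'end': '2013-10-03 18:00:00'},
]
start, end, busy = summarize_parent(subtasks)
print(start, end, busy)
```

Here the parent task spans 54 hours from first start to last end, but only 36 of them are busy: the Compilation subtask lies entirely inside Research, and the gap before the meeting is not counted.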

So, in this recipe, we will cover creating a Gantt chart using Python.

Getting ready

There are many full-fledged software applications and services that allow you to make very flexible and complicated Gantt charts. We will try to demonstrate how you can do it in Python itself, without relying on external applications, and still produce neat-looking, informative Gantt charts.

The Gantt chart shown in the example does not support nested tasks, but it is sufficient for simple work breakdown structures.

How to do it...

The following code example will allow us to demonstrate how Python can be used together with matplotlib to render the Gantt chart. We will perform the following steps:

  1. Load TEST_DATA, a set of tasks in which each task contains a label and the start and end time, and instantiate the Gantt class with it.
  2. Process all tasks by plotting horizontal bars on the axes.
  3. Format the x and y axes for the data we are rendering.
  4. Tighten the layout.
  5. Show the Gantt chart.

The following is a sample code:

from datetime import datetime
import sys

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.font_manager as font_manager
import matplotlib.dates as mdates

import logging


class Gantt(object):
    '''
    Simple Gantt renderer.
    Uses *matplotlib* rendering capabilities.
    '''

    # Red Yellow Green diverging colormap
    # from http://colorbrewer2.org/
    RdYlGr = ['#d73027', '#f46d43', '#fdae61',
              '#fee08b', '#ffffbf', '#d9ef8b',
              '#a6d96a', '#66bd63', '#1a9850']

    POS_START = 1.0
    POS_STEP = 0.5

    def __init__(self, tasks):
        self._fig = plt.figure()
        self._ax = self._fig.add_axes([0.1, 0.1, .75, .5])

        # reverse the tasks so that the first one ends up in the top row
        self.tasks = tasks[::-1]

    def _format_date(self, date_string):
        '''
        Formats string representation of *date_string* into *matplotlib.dates*
        instance.
        '''
        try:
            date = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')
        except ValueError as err:
            logging.error("String '{0}' can not be converted to datetime object: {1}"
                          .format(date_string, err))
            sys.exit(-1)
        mpl_date = mdates.date2num(date)
        return mpl_date

    def _plot_bars(self):
        '''
        Processes each task and adds *barh* to the current *self._ax* (*axes*).
        '''
        i = 0
        for task in self.tasks:
            start = self._format_date(task['start'])
            end = self._format_date(task['end'])
            bottom = (i * Gantt.POS_STEP) + Gantt.POS_START
            width = end - start
            # wrap around the palette if there are more tasks than colors
            self._ax.barh(bottom, width, left=start, height=0.3,
                          align='center', label=task['label'],
                          color=Gantt.RdYlGr[i % len(Gantt.RdYlGr)])
            i += 1

    def _configure_yaxis(self):
        '''y axis'''
        task_labels = [t['label'] for t in self.tasks]
        pos = self._positions(len(task_labels))
        ylocs = self._ax.set_yticks(pos)
        ylabels = self._ax.set_yticklabels(task_labels)
        plt.setp(ylabels, size='medium')

    def _configure_xaxis(self):
        '''x axis'''
        # make x axis date axis
        self._ax.xaxis_date()

        # format date to ticks on every 7 days
        rule = mdates.rrulewrapper(mdates.DAILY, interval=7)
        loc = mdates.RRuleLocator(rule)
        formatter = mdates.DateFormatter("%d %b")

        self._ax.xaxis.set_major_locator(loc)
        self._ax.xaxis.set_major_formatter(formatter)
        xlabels = self._ax.get_xticklabels()
        plt.setp(xlabels, rotation=30, fontsize=9)

    def _configure_figure(self):
        self._configure_xaxis()
        self._configure_yaxis()

        self._ax.grid(True, color='gray')
        self._set_legend()
        self._fig.autofmt_xdate()

    def _set_legend(self):
        '''
        Tweak font to be small and place *legend*
        in the upper right corner of the figure
        '''
        font = font_manager.FontProperties(size='small')
        self._ax.legend(loc='upper right', prop=font)

    def _positions(self, count):
        '''
        For given *count* number of positions, get array for the positions.
        '''
        end = count * Gantt.POS_STEP + Gantt.POS_START
        pos = np.arange(Gantt.POS_START, end, Gantt.POS_STEP)
        return pos

The show() method that drives the Gantt chart generation, together with the main block that calls it, is defined in the following code. In the main block, we load the data into an instance; show() then plots the bars accordingly, sets up the date formatter for the time axis (x axis), and sets values for the y axis (the project's tasks).

    def show(self):
        self._plot_bars()
        self._configure_figure()
        plt.show()


if __name__ == '__main__':
    TEST_DATA = (
        {'label': 'Research',       'start': '2013-10-01 12:00:00', 'end': '2013-10-02 18:00:00'},  # @IgnorePep8
        {'label': 'Compilation',    'start': '2013-10-02 09:00:00', 'end': '2013-10-02 12:00:00'},  # @IgnorePep8
        {'label': 'Meeting #1',     'start': '2013-10-03 12:00:00', 'end': '2013-10-03 18:00:00'},  # @IgnorePep8
        {'label': 'Design',         'start': '2013-10-04 09:00:00', 'end': '2013-10-10 13:00:00'},  # @IgnorePep8
        {'label': 'Meeting #2',     'start': '2013-10-11 09:00:00', 'end': '2013-10-11 13:00:00'},  # @IgnorePep8
        {'label': 'Implementation', 'start': '2013-10-12 09:00:00', 'end': '2013-10-22 13:00:00'},  # @IgnorePep8
        {'label': 'Demo',           'start': '2013-10-23 09:00:00', 'end': '2013-10-23 13:00:00'},  # @IgnorePep8
    )

    gantt = Gantt(TEST_DATA)
    gantt.show()

This code will render a simple, neat-looking Gantt chart like the following one:

[Figure: the rendered Gantt chart]

How it works...

We can start reading the preceding code from the bottom after the condition that checks if we are in "__main__".

After we instantiate the Gantt class, giving it TEST_DATA, we set up the necessary fields of our instance. We save the tasks from TEST_DATA, in reverse order, in the self.tasks field, and we create the figure and axes that will hold the chart.

Then, we call show() on the instance that walks us through the steps required to render the Gantt chart:

    def show(self):
        self._plot_bars()
        self._configure_figure()
        plt.show()

Plotting the bars requires an iteration in which we pass the name, start time, and duration of each task to the barh method of the axes stored in self._ax. We place each task in a separate row by giving it a different (incremental) bottom value.
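
Matplotlib date numbers are floats measured in days, so the width that the recipe computes as end - start is the task duration in days. The following minimal sketch reproduces that arithmetic with the standard library only; bar_width_days is a hypothetical helper for illustration, not something the recipe defines.

```python
from datetime import datetime

FMT = '%Y-%m-%d %H:%M:%S'   # same format string the recipe parses


def bar_width_days(task):
    """Duration of a task in days (hypothetical helper, not in the recipe)."""
    start = datetime.strptime(task['start'], FMT)
    end = datetime.strptime(task['end'], FMT)
    return (end - start).total_seconds() / 86400.0


tasks = [
    {'label': 'Research',    'start': '2013-10-01 12:00:00', 'end': '2013-10-02 18:00:00'},
    {'label': 'Compilation', 'start': '2013-10-02 09:00:00', 'end': '2013-10-02 12:00:00'},
]
widths = {t['label']: bar_width_days(t) for t in tasks}
print(widths)  # {'Research': 1.25, 'Compilation': 0.125}
```

A 30-hour task therefore becomes a bar 1.25 units wide on the date axis, starting at the date number of its start time.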

Also, to make it easy to map tasks to their names, we cycle over a diverging color map that we took from the colorbrewer2.org tool.
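
One way to cycle over a fixed palette is to wrap the task index with the modulo operator, so that a chart with more tasks than colors starts again from the first color. This is a sketch under that assumption; color_for is a hypothetical helper name, and the palette is the nine-color list from the listing above.

```python
# The nine-color Red-Yellow-Green palette used in the recipe.
RdYlGr = ['#d73027', '#f46d43', '#fdae61',
          '#fee08b', '#ffffbf', '#d9ef8b',
          '#a6d96a', '#66bd63', '#1a9850']


def color_for(index, palette=RdYlGr):
    """Pick a color for the task at *index*, wrapping around the palette."""
    return palette[index % len(palette)]


# Eleven tasks, nine colors: tasks 9 and 10 reuse the first two colors.
colors = [color_for(i) for i in range(11)]
print(colors[9], colors[10])  # #d73027 #f46d43
```

Reused colors make the legend ambiguous, so for long task lists a larger palette may be a better choice than wrapping.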

The next step is to configure the figure, which means that we set up the date format on the x axis and the tick positions and labels on the y axis to match the bars plotted by barh.
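
The y ticks line up with the bars because both are derived from the same POS_START and POS_STEP constants. The sketch below checks this with plain Python lists; tick_positions is a stand-in written for illustration that mimics the np.arange call in _positions, and both function names are hypothetical.

```python
POS_START, POS_STEP = 1.0, 0.5   # same values as in the Gantt class


def bar_positions(count):
    """y center of each bar, as computed inside the plotting loop."""
    return [i * POS_STEP + POS_START for i in range(count)]


def tick_positions(count):
    """Pure-Python stand-in for np.arange(POS_START, end, POS_STEP)."""
    end = count * POS_STEP + POS_START
    pos, value = [], POS_START
    while value < end:
        pos.append(value)
        value += POS_STEP
    return pos


# The constructor stores the tasks reversed, so the first task gets the
# largest y value and ends up in the top row of the chart.
labels = ('Research', 'Design', 'Demo')[::-1]
ticks = tick_positions(len(labels))
print(list(zip(ticks, labels)))
# [(1.0, 'Demo'), (1.5, 'Design'), (2.0, 'Research')]
```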

Finally, we add a grid and a legend.

At the end, we call plt.show() to show the figure.
