Chapter 10. Migrating Data into JIRA

Overview

This chapter describes how data is migrated into JIRA from other systems, along with suggestions for how to estimate the effort involved in a migration. The short answer is that a migration takes more effort than is usually expected.

The source system is the system where the data currently is (e.g., Bugzilla, Rally, or even another instance of JIRA). The words migration and import tend to be used interchangeably. If there is a difference, JIRA has import tools, but the whole process is called a migration. A merge is just a migration that leaves existing data unchanged.

Tip

New JIRA administrators sometimes ask, “Can I synchronize the source system and JIRA, so I don’t have to do a single large export and import?” Synchronization is a lot more work since it has to handle configurations in both systems changing over time. Instead, I recommend migrating the data once and then making the source system available in a read-only mode for a period after the migration to make testing easier. It’s also an incentive to move everyone onto a single system together.

Migrating Data from JIRA to JIRA

JIRA comes with the ability to import a JIRA project from a complete XML backup of a different JIRA instance. This feature can be found at Administration → System → Project import. However, as the feature’s documentation at https://confluence.atlassian.com/display/JIRA/Restoring+a+Project+from+Backup says, restoring a project from a backup is not a trivial task.

The problem is that the target JIRA has to be configured almost identically to the source JIRA for the project import to work. This includes custom field names and types, schemes, workflows, and users. The versions of JIRA and most of the add-ons must be identical, too. Getting the configuration right can take multiple attempts for each project import.

However, the biggest problem is that project imports do not include any data from the Active Objects tables in the JIRA database. These tables are where complex add-ons such as JIRA Agile, JIRA Service Desk, and others store their data. The end result is that agile boards, sprints, rank, service desks, and test cases and results are all not imported when doing a project import. Project imports also don’t import Remote Links or Web Links.

If you are just trying to move some issues from a development JIRA to a staging JIRA, then this approach is worth considering. In general, though, it doesn’t do what is needed for most migrations.

Migration Steps

Any migration can be broken down into three main steps: extract, modify, and import.

Extract the Data

Some way to access the data in the source system is needed. This might be a database connection (a local database is fastest), or perhaps a REST or SOAP (Webservices) API for remote systems. The data could even be extracted as a CSV or XML file. Different source systems make it easier or harder to extract the data, and this step can take a long time (often hours).

A common assumption is that it takes one second per issue just to extract the data using REST. Because of how long this takes—and especially if the network access to the source system is intermittent—it’s a good idea to save the extracted data locally while developing your migration tools.
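As a minimal sketch of saving extracted data locally, the following Python function fetches each issue only once and keeps a JSON copy on disk. The function name extract_with_cache, the fetch_issue callable, and the one-file-per-issue layout are all assumptions made for illustration; a real fetch_issue would wrap whatever REST, SOAP, or database access the source system offers, and the issue IDs are assumed to be safe to use as filenames.

```python
import json
from pathlib import Path
from typing import Callable, Iterable


def extract_with_cache(
    issue_ids: Iterable[str],
    fetch_issue: Callable[[str], dict],
    cache_dir: Path,
) -> list:
    """Return the data for each issue ID, fetching only what is not cached yet."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    issues = []
    for issue_id in issue_ids:
        cache_file = cache_dir / f"{issue_id}.json"
        if cache_file.exists():
            # Extracted on an earlier run: no remote call is needed.
            issue = json.loads(cache_file.read_text(encoding="utf-8"))
        else:
            issue = fetch_issue(issue_id)  # the slow remote call
            # Write to a temporary file first so that an interrupted run
            # never leaves a half-written cache entry behind.
            tmp_file = cache_file.with_suffix(".tmp")
            tmp_file.write_text(json.dumps(issue), encoding="utf-8")
            tmp_file.replace(cache_file)
        issues.append(issue)
    return issues
```

Because completed issues are skipped on later runs, the same call can simply be repeated after a network failure, and the modify step can be developed against the cached files without touching the source system.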

Some types of common data in different systems are:

Issue data

The source issue’s summary, description, dates, labels (tags), etc. This is the core information.

Comments

This is also core information but may be stored separately from the issue data.

Attachments

These are often large enough to take a long time to migrate. JIRA has a 10MB default maximum size per attachment.

Issue History

This is the record of what changes happened when, but is often not preserved because there isn’t an easy way to import it into JIRA with the CSV Importer. However, the JSON importer can import this information.

Links

This data is about the relationships between source issues. Again, the CSV Importer does not support importing this information but the JSON importer does.

Modify the Data

Once as much data as possible has been extracted from the source system (and stored locally), the next step is to modify the data. Almost every field is modified in some way, which is a surprise to many people. This is one of the reasons why CSV exports from the source system often need further changes to work with larger imports.

The kinds of modifications needed include the following:

User ID mapping

For example, a user john.smith in the source system needs to be referred to as jsmith in JIRA. This can occur when a company has been acquired and its old data is being moved to the new parent company’s JIRA. Changing multiple user IDs in JIRA after the import is a lot of work, so such changes are generally done at this stage instead.

Date offsets

The dates in the source system may not be from the same time zone needed for the JIRA instance. They almost certainly won’t be in the correct format for importing into JIRA.

Unicode

Some data (such as the summary, description, comments, usernames, and attachment filenames) will likely contain characters that are not UTF-8 encoded, particularly if the text has been cut and pasted from Microsoft applications such as Word or Excel. This data will need to be converted to UTF-8 before it is ready for import into JIRA.

Cleaning up dirty data

Over time, most systems accumulate data that doesn’t conform to what is currently considered valid data. Some work will likely be needed to handle this data, if only to ignore it.

Merging fields

Some data will not be wanted for JQL queries in JIRA, but still needs to be preserved. This is often done by converting it into a comment. Other fields may be concatenated for use in JIRA.

Splitting a field

Sometimes data in a single field in the source system is split up into multiple fields in JIRA.

Complex mappings

Sometimes the value of a field in JIRA depends on multiple fields in the source system, in a complex way that is better performed by code before the import, rather than manually afterwards.
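Several of these modifications can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the source field names (title, reporter, created, build, customer), the placeholder account migration-user, the fixed UTC offsets, the Windows-1252 fallback encoding, and the output date format, which would have to match whatever date format is configured in the importer.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from source user IDs to JIRA user IDs.
USER_MAP = {"john.smith": "jsmith"}

# Assumed fixed UTC offsets. Data that spans daylight saving changes
# needs real time zone rules instead.
SOURCE_TZ = timezone(timedelta(hours=-8))
TARGET_TZ = timezone(timedelta(hours=1))


def decode_text(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to Windows-1252 for pasted Office text."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("cp1252", errors="replace")


def map_user(source_user: str, default: str = "migration-user") -> str:
    """Translate a source user ID, using a placeholder account for unknown users."""
    return USER_MAP.get(source_user, default)


def convert_date(source_value: str) -> str:
    """Shift a source timestamp to the target offset and reformat it."""
    naive = datetime.strptime(source_value, "%Y-%m-%d %H:%M:%S")
    shifted = naive.replace(tzinfo=SOURCE_TZ).astimezone(TARGET_TZ)
    return shifted.strftime("%Y-%m-%d %H:%M")


def modify_issue(source: dict) -> dict:
    """Build the record to import from one extracted source record."""
    # Fields not wanted for JQL queries are preserved in a comment.
    extras = [
        f"{name}: {source[name]}"
        for name in ("build", "customer")
        if source.get(name)
    ]
    return {
        "Summary": decode_text(source["title"]).strip(),
        "Reporter": map_user(source["reporter"]),
        "Created": convert_date(source["created"]),
        "Comment": ("Migrated fields: " + "; ".join(extras)) if extras else "",
    }
```

Keeping each kind of change in its own small function makes it easier to rerun the modify step after every test import, when the requested changes usually affect only one or two of the mappings.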

Import the Data

The most common way to import data into JIRA is to use one of the standard JIRA Importers. There are importers for Bugzilla, Mantis, FogBugz, Pivotal Tracker, Trac, Asana and Redmine, and also for GitHub and Bitbucket. If you can use one of these standard importers then that will always be the fastest and cheapest approach. However, they don’t allow for many of the data modifications described in “Modify the Data”.

There are two standard JIRA Importers that are more general—CSV and JSON. Both of these importers take a file created with the first two steps (extract, modify) and use the data in that file to create issues in JIRA. This approach allows you to import data from many different source systems while still letting you modify the data for customized imports.
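For the CSV route, producing the file from the modified data might look like the following sketch, which uses Python’s standard csv module. The column names and the function name write_issues_csv are assumptions; the columns only need to be names that can be mapped to JIRA fields during the import.

```python
import csv

# Hypothetical column names, to be mapped to JIRA fields during the import.
FIELDNAMES = ["Summary", "Reporter", "Created", "Description"]


def write_issues_csv(issues, out_file):
    """Write one CSV row per modified issue to an already open text file."""
    # extrasaction="ignore" drops keys that are not in FIELDNAMES;
    # keys missing from an issue are written as empty cells.
    writer = csv.DictWriter(out_file, fieldnames=FIELDNAMES, extrasaction="ignore")
    writer.writeheader()
    for issue in issues:
        writer.writerow(issue)
```

A typical call would open the file as UTF-8 with newline="" so that the csv module controls the line endings, for example: with open("issues.csv", "w", newline="", encoding="utf-8") as f: write_issues_csv(issues, f). The csv module takes care of quoting commas and double quotes inside the summary and description.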

I have also used two other approaches in the past. The first was a direct database import using SQL insert commands, which is fast but requires a deeper understanding of the database schemas of JIRA and its add-ons. The second was a custom add-on in JIRA that extracted data from the source system and created the issues using JIRA’s internal API. This last approach was flexible, but required a local copy of the database to work well for large source systems.

It is also possible to create your own JIRA importer using the tutorial at https://developer.atlassian.com/display/JIRADEV/Tutorial+-+Writing+custom+importer+using+JIRA+importers+plugin. I’d recommend this approach over modifying the standard JIRA importers, since you can extend and modify them with your own importer.

The JIRA CSV Importer

The CSV import is located at Administration → System → External System Import. Once you have done your first test import, you can save a configuration file to avoid having to re-enter the field mapping.

During the import, the CSV data file is uploaded from your local machine to JIRA via a web page. This can take a long time on a slow connection and could fail due to the default 10-minute timeout. You may also have to increase the maximum attachment size to be able to upload the CSV file.

You can assume a rate of 2 or 3 issues/second for importing the data, depending on how many large attachments are present.

Attachments are referred to in the CSV file using either an http:// or a file:// URL. If an http URL is used, the attachment must be visible in a browser without using any authentication. If a file URL is used, then each attachment is retrieved from the local JIRA jira_home/import/attachments directory. The second approach is faster, but either way, the URL has to be encoded as a URL with no invalid characters.
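Encoding the filename part of an attachment URL can be sketched with Python’s urllib.parse.quote. The helper name attachment_url and the example base URL are assumptions; the base is expected to already end with the right separator and is used as given.

```python
from urllib.parse import quote


def attachment_url(base_url: str, filename: str) -> str:
    """Append a percent-encoded filename to a base URL that ends with a separator."""
    # safe="" also encodes "/" so a filename can never add a path segment.
    return base_url + quote(filename, safe="")


print(attachment_url("http://files.example.com/export/", "design doc (v2).png"))
# prints http://files.example.com/export/design%20doc%20%28v2%29.png
```

Spaces, parentheses, and non-ASCII characters in filenames are the usual causes of attachments that silently fail to import, so it is worth encoding every filename rather than only the ones that look unusual.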

Cloud Differences

Importing data into a JIRA Cloud instance imposes other restrictions on what can be done. If given the choice, I generally prefer JIRA Server for doing migrations rather than JIRA Cloud. The details of the restrictions are described in the following sections.

No Staging Instance

Importing into a Cloud instance is like importing into production. If an import creates unwanted users, they will need to be removed either manually or with an extra script. To delete a test project with more than 5,000 issues, you may need to delete the issues first, typically with a script. One way to work around this is to have another temporary Cloud instance for staging. In this case the user license has to be the same size as or larger than that of the production Cloud instance.

Changing CSV Importer Version

The version of the CSV Importer add-on will be updated whenever a new one is available and deployed by Atlassian. This is one more variable to deal with when you’re working towards achieving a stable import.

No Attachment Imports with a File URL

You have to use an unauthenticated http URL in the CSV file to retrieve attachments. This may not always work as noted at https://confluence.atlassian.com/display/CLOUDKB/Importing+Attachments+Into+JIRA+Cloud+Using+CSV. If contacting Atlassian Support doesn’t help, another workaround is to upload the attachments with a script after the import, though this doesn’t preserve the attachment author or date.

No Custom Add-ons

You can’t create a custom importer add-on and use it with Cloud as you can with a JIRA Server instance.

Estimating a Migration

The three main steps of a migration (extract, modify, import) are repeated many times during development before the final migration, not just once. With large amounts of data, the complete extract and import steps can take hours to run, so it’s important to get as much work done with a smaller set of data as possible.

You can assume a rate of one issue/second for extracting the data and a rate of one issue/second for importing the data. Therefore, 36,000 issues will take at least 20 hours to migrate before the users can access the data in JIRA.
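The arithmetic behind this estimate is simple enough to put in a small Python helper, which makes it easy to try other rates. The function name and its parameters are illustrative only, and the model assumes that extraction and import run one after the other.

```python
def migration_hours(issue_count: int,
                    extract_rate: float = 1.0,
                    import_rate: float = 1.0) -> float:
    """Estimate the run time in hours, with rates given in issues per second."""
    seconds = issue_count / extract_rate + issue_count / import_rate
    return seconds / 3600


print(migration_hours(36_000))                 # prints 20.0
print(migration_hours(36_000, import_rate=2))  # prints 15.0
```

This covers only the time the tools spend running for one complete pass. It does not include the development effort, or the repeated test runs before the final migration.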

Some common tasks in a non-trivial migration and their estimated average duration are shown in Table 10-1. The estimates are averages, and each one can vary substantially, which makes the total effort for a migration quite hard to estimate accurately. The second migration is generally faster than the first one you do.

Table 10-1. Estimated work for a migration
Estimate (Days)   Activity
0.25              Gather information about the source system (type, version, amount of customization done, number of issues)
0.25              Gather information about the target JIRA instance (Server or Cloud, existing custom fields)
0.5               Confirm access to source system as the user sees it, i.e., browser or client application
1                 Confirm access to source system at the system level, e.g., database or API
1                 Extract the core data for a single issue from the source system
1                 Store the extracted data locally
1                 Allow restarting of data extraction after network failures, if applicable
2                 Extract all required data for a single issue from the source system
2                 Define mapping from source system to JIRA, particularly for user IDs
1                 Import a single issue’s core data into JIRA
1                 Import all of a single issue’s data into JIRA, including attachments
4                 Import all the source system data into JIRA, allowing for multiple iterations
2                 Review the imported data
17                Total number of days

Most migrations need multiple complete test imports into JIRA, each one with the requested changes at the modification stage or to retrieve updated data from the source system. This makes the whole project take longer than most people expect—that is, weeks, not days.

Further Reading

The main starting documentation page for importing data into JIRA is at https://confluence.atlassian.com/display/JIRA/Migrating+from+Other+Issue+Trackers.

A number of Atlassian Experts offer migration tools for JIRA at http://bit.ly/114xrmP, but these are better described as migration services that modify existing migration scripts for each customer. This is because every major migration has its own special needs.
