Chapter 24

Migrating to Git

Successfully migrating from another version control system to Git takes more than transferring software versions into a Git repository. This workflow shows how to organize the migration of a project and what you should bear in mind:

  • Knowledge structure and know-how transfer
  • Strategic decisions you should take
  • Transfer of content to a Git repository
  • Sequence of the actual migration
  • Carrying over changes that have arisen in the old version management since the Git repository was created

Overview

The migration process is divided into several phases. If several projects need to be migrated in succession, some of these phases can be skipped.

  1. Learn Git, gain experience
  2. Make decisions
  3. Find branches
  4. Prepare the repository
  5. Fetch branches
  6. Put the repository into operation
  7. Clean up

Requirements

The project is taken from another version control system. We have also made the following assumptions:

Permissions: You should have unrestricted write access to all the files and directories in the workspace. In particular, the files must not be set to “Read only.” If necessary, the configuration of the old version management has to be adapted.

Ignoring directories: The Git repository is created in the workspace of the other version management. It must be possible to have the old version management ignore the .git directory, so that the directory is protected from accidental deletion.

Attention! Unlike most of the previous workflows, this workflow is highly dependent on external factors: What version control is used? How are the projects organized? How are branches used? You will probably not implement this workflow exactly as it is described here. So plan a little time to adjust the workflow to your requirements.

Workflow “Migrating to Git”

A project from another version control system is migrated to Git. All software versions that are to be developed further are adopted in the Git repository. Afterward, you can continue working with the new repository. If required, changes that still “trickle in” from the old version management are carried over afterward.

Figure 24.1: Workflow overview

Process and Implementation

Learn Git, Gain Experience

Git is not difficult to learn, because the basic concepts are logical and well thought out (of which we hope this book has been able to convince you). For developers who have worked a lot with centralized version control systems, some aspects of Git take getting used to, including working with remote repositories and dealing with branches. Therefore, we recommend that one or two developers prepare to support the team as coaches during the migration.

Step 1: Test Git

Start with a small, unimportant sample project. It is best to choose something that is not yet in the old version control. Perhaps you want to develop a new utility class, try out a new Java library, or write some shell scripts for server management.

Start with the Git command line, even if you later use a Git front-end or a plug-in for your development environment. In other words, first learn Git “unfiltered.” Although a front-end is often easier to use, it tends to hide what really happens in Git. If problems arise during the migration, you should at least know what is going on behind the scenes. Try basic commands such as add, commit, push, pull and log.

After that, you should look at branches, because that is the area where most difficulties initially arise. Try the workflows “Developing on the Same Branch” and “Developing with Feature Branches.” Create a few conflicts on purpose to practice conflict resolution.
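A conflict for practice purposes can be produced in a few lines. The following is a minimal sketch, not part of the workflow itself: it builds a throwaway repository in a temporary directory, changes the same line on two branches and merges them. The file name greeting.txt, the branch name experiment and the identity settings are assumptions made for illustration.

```shell
#!/bin/sh
# Throwaway playground for practicing conflict resolution.
set -e
playground=$(mktemp -d)
cd "$playground"
git init -q .
# Local identity, so commits work on a machine without global Git config.
git config user.name "Git Trainee"
git config user.email "trainee@example.com"

echo "Hello" > greeting.txt
git add greeting.txt
git commit -q -m "initial greeting"
base=$(git symbolic-ref --short HEAD)   # master or main, depending on the Git setup

# Change the line on a second branch ...
git checkout -q -b experiment
echo "Hello from the experiment branch" > greeting.txt
git commit -q -am "change greeting on experiment"

# ... and change the same line differently on the original branch.
git checkout -q "$base"
echo "Hello from the base branch" > greeting.txt
git commit -q -am "change greeting on base"

# The merge stops with a conflict; "|| true" keeps the script running.
git merge experiment || true
git status --short    # greeting.txt is reported as unmerged
```

Afterward, edit greeting.txt, remove the conflict markers, and finish with add and commit. When you are done, simply delete the temporary directory.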

Then you can try a front-end. There is now a good selection. Try the variants that match your development environment and target platform. Pay special attention to how well merge conflict resolution is supported. In this respect, the front-ends differ significantly.

As a coach, you should at least master the following items:

  • You have gained enough experience to teach your colleagues basic Git commands such as add, commit, status, diff, log, push and pull.
  • You should look at the following commands closely because they operate differently from their counterparts in traditional version control systems: branch, checkout, reset, merge and rebase.
  • You have worked with at least one of the two branching strategies: “Developing on the Same Branch” and “Developing with Feature Branches.”
  • You can set up Git and possibly a front-end on a developer machine.
  • You can resolve merge conflicts (possibly also on the front-end).

Step 2 (Optional): Work in parallel in the other version control

You can also learn Git by using it in parallel with the old version control for a while: you develop locally with Git, while the project remains in the central version control. See the workflow “Using Other Version Control Systems in Parallel.”

Make Decisions

A few important decisions should be made early.

Step 1: Migrate all projects at once?

This workflow describes how to migrate a project. If you have multiple projects, you can perform this workflow several times in succession. However, you can also carry out migration for all the projects in parallel. This has both advantages and disadvantages compared to a single project migration.

Advantages

  • You can use Git extensively earlier.
  • You benefit sooner from the advantages of Git.
  • After the migration you only need to support one version control system.

Disadvantages

  • The know-how transfer becomes more difficult. In the first two to three weeks after the migration, many questions will be asked. If there are many developers, and only one coach who is familiar with Git, this can jeopardize the success of the migration.
  • Problems with the migration can easily cause major issues when many projects are affected at the same time.
  • If in the initial phase you make a decision that later turns out to be unfavorable, this must be corrected in many projects.

Our recommendation: If you are not sure, start with a single project to find out how best to proceed.

Step 2: Which projects should be migrated?

We cannot help you with this decision.

Step 3: Should the existing structure be adopted?

How are your projects currently organized?

  • Are all projects in the same repository? Or are they in different repositories?
  • Do the projects have a common release cycle? Are they even part of the same product?
  • Are there common cross-project changes?
  • Are the projects closely or loosely coupled?

Unfortunately, we cannot say what the ideal Git repository structure for your projects should look like. We can only give you some rough rules of thumb:

  • If you currently store all your projects in one repository, we tend to suggest you do the same in Git.
  • If the projects have a common release cycle, then we would suggest a common repository in Git because then the workflow “Performing A Release” can be applied across the projects.
  • Frequent cross-project changes are also an indicator for a shared repository.
  • If the projects are loosely coupled, they might be candidates for separate repositories.
  • Git is not well suited to binary files or resources that are large or change often. You can use Git to manage such files, but they are better stored in a separate repository, so that the performance of the normal (source code) projects does not suffer.

Our recommendation: If in doubt, just start with a structure similar to the structure in your previous version management. Later you can create a better structure using the “Merging Small Projects” and “Splitting A Large Project” workflows.

Step 4: Can you afford development interruption during the migration?

If yes, this simplifies the migration process a little. Make the new repository ready and retire the old repository. The developers should then switch over to Git and do a release from Git.

However, if you operate a 24/7 system, you might want to keep the ability to deliver hotfixes from the old version management for as long as possible during the migration, until the developers are able to provide release versions from the Git repository. In this case, you have to consider how to carry over changes from the old repository.

Step 5: Which branching strategy do you want to use?

“Developing on the Same Branch” or “Developing with Feature Branches”? Decide on this early enough that you can teach the developers the new workflow before the migration, so that they can continue working productively afterward.

Our recommendation: If you are not sure, start with “Developing on the Same Branch” because it resembles the way classic version management systems work.

Step 6: Which front-end do you want to use?

Finally, you should supply the developers with the right software before the migration.

Find Branches

Next, you need to find out which software versions from the old repository are to be developed further as Git branches:

  • The main line of development, known in other version control systems as trunk or main line, should certainly be carried over.
  • Each version for which bug fixes or extensions must be supplied should be carried over as a branch in Git. If you develop a product that is installed at customer sites, there could be many such versions. If you are developing a web application, however, you may need only two branches: one for the production version, on which hotfixes are built, and one for the feature development for the next release.
  • If you work with feature branches in your previous version control, you should either carry them over to Git as well or finish and close them just before the migration. The latter makes the migration easier, of course.

Many version control systems have so-called floating tags, which are tags like RELEASE3 that are moved along when hotfixes are made. Such tags are often candidates for release branches in Git.

The fewer branches you have to carry over, the less work the migration involves. So choose carefully what you really need, and think about when to perform the migration. Maybe there is a point in time at which only a few branches need to be taken from the old version control.

Then draw a diagram showing how the branches you have found relate to each other. In the simplest case, this is a sequence with the oldest version at the bottom and the latest at the top.

Prepare the Repository

Next, create the Git repository. You should allow a few days (or weeks) before starting the actual migration, so that you can set up the repository thoroughly and test it. In the meantime, however, new changes will be made in the old repository. It will therefore be necessary to carry over these changes later.

To achieve this, you can work on the Git side with two branches: one represents the development in Git and the other the development in the old repository. We call the latter the legacy branch. We do not develop on the legacy branch; it only receives software versions from the old repository. These changes are later transferred to the development branch with a merge.

If the concept of legacy branches somehow reminds you of remote-tracking branches, then you have just recognized a pattern. Both mirror the state of another repository in the local repository.

In this example we use the following naming convention: For branches or tags from the old version management, we use capital letters, e.g. RELEASE3. In Git we use lowercase letters and call the development branch release3 and the legacy branch legacy-release3.
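If you script parts of the migration, the naming convention can be captured in two small shell functions. This is only a sketch; the function names dev_branch and legacy_branch are hypothetical helpers, not Git commands.

```shell
# Derive the Git branch names from a name used in the old version
# management, following the convention described above.
dev_branch() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

legacy_branch() {
    printf 'legacy-%s\n' "$(dev_branch "$1")"
}

dev_branch RELEASE3      # prints release3
legacy_branch RELEASE3   # prints legacy-release3
```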

Step 1: Get the project from the old version management

A workspace that includes project files is obtained from the old version control. In other version control systems it is often called a checkout.

Step 2: Create a Git repository

In this workspace from the old version control, a Git repository is now created. The result is a workspace that is associated with both version control systems. We call this a dual-use workspace.

> cd old-vcs-workspace 

> git init 

Step 3: Create a local backup

When you are working with two different version control systems at the same time, it can easily happen that you use a force or clean option in the wrong place. Therefore, a backup is a good idea:

> git clone --no-hardlinks --bare . /backups/myproject.git 

> git remote add backup /backups/myproject.git 

Later, you should update the backup occasionally.

> git push --all backup 

To restore, clone the backup repository into a temporary directory, switch to the desired Git branch there, and then move the .git directory into the workspace of the old version control.
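The restore procedure could be scripted roughly as follows. This is a sketch under assumptions: the function name restore_git_dir is hypothetical, and the function refuses to run as long as a .git directory is still present in the workspace, so that nothing is overwritten by accident.

```shell
# restore_git_dir BACKUP_REPO BRANCH WORKSPACE
# Restores the Git side of a dual-use workspace from the bare backup.
restore_git_dir() {
    backup=$1
    branch=$2
    workspace=$3

    if [ -e "$workspace/.git" ]; then
        echo "$workspace/.git still exists; move it away first" >&2
        return 1
    fi

    tmp=$(mktemp -d) || return 1
    # Clone the backup and check out the desired branch right away.
    git clone -q --branch "$branch" "$backup" "$tmp/restore" || return 1
    # Only the repository data is needed, not the files of the clone.
    mv "$tmp/restore/.git" "$workspace/.git" || return 1
    rm -rf "$tmp"
}
```

A call might look like restore_git_dir /backups/myproject.git legacy-release3 . when run in the workspace. Note that the restored repository knows the backup under the remote name origin, because it was created by a clone. Afterward, git status shows how the files in the workspace differ from the restored branch.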

Step 4: Ignore metafiles

First, ensure that the two version control systems do not interfere with each other in the shared workspace.

The metafiles of the old version control should not end up in the Git repository. To prevent this, create a .gitignore file and enter the paths or file patterns to be ignored. Afterward, the status command should no longer show the metafiles of the old version management. Because the new .gitignore file is still untracked, add it before committing.

> git add .gitignore

> git commit -m "ignore legacy metafiles"

Conversely, the old version management must be configured so that it ignores the .git directory and the .gitignore file and leaves them untouched. For instance, in CVS you can do this by creating a .cvsignore file in the user directory.
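What the .gitignore file contains depends on the old version control. As an illustration, the following sketch simulates a tiny CVS checkout in a scratch directory and ignores the CVS/ directories that CVS keeps in every folder, as well as the .svn/ directories of Subversion. The directory layout and file names are assumptions made for the example.

```shell
# Simulate a small CVS checkout in a scratch directory (hypothetical layout).
ws=$(mktemp -d)
cd "$ws"
mkdir -p src/CVS
touch src/CVS/Entries src/Main.java
git init -q .

# Metadata directories of the old version control, matched at any depth.
cat > .gitignore <<'EOF'
CVS/
.svn/
EOF

git check-ignore -v src/CVS/Entries   # names the .gitignore rule that matches
git status --short                    # lists .gitignore and src/, but no metafiles
```

The check-ignore command is a quick way to find out why a particular file is or is not ignored.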

Fetch the Branches

To fetch the tags and branches from the old version control, perform the following steps for each of them. Start with the oldest branch or tag that is to be fetched.

Step 1: If necessary, switch to the predecessor branch

With the first branch, you can skip this step because there is no predecessor branch.

In Figure 24.1, you can see which branch is the predecessor of another. If you want to migrate RELEASE3, switch now to its predecessor, i.e. to the legacy branch for RELEASE2.

> git checkout legacy-release2

Step 2: Create a legacy branch

Create a legacy branch for the version and switch to it, so that the next commit ends up on this branch. For example, for RELEASE3 you would use

> git checkout -b legacy-release3

Step 3: Take the version from the old version management

Now use the old version management to switch the workspace to the software version you want to take, e.g. RELEASE3. Then look at the result in Git.

> git status

The status command shows which changes RELEASE3 contains compared with RELEASE2. Briefly check whether they look plausible. If so, you can commit the changes to the new legacy branch.

> git add --all 
> git commit -m "RELEASE3 retrieved from legacy-cvs"

Step 4: Leave generated files unversioned

The current software version is now built and tested. This probably creates new files that should not be included in the repository. Add matching patterns for them to the .gitignore file.

> git commit .gitignore -m "ignore build artifacts" 

Step 5: Create a Git branch

Now create a Git branch on which development will later take place.

> git branch release3
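If many versions have to be fetched, Steps 1 to 5 can be wrapped in a function. The following is a sketch under assumptions, not a finished tool. The function names import_legacy_version and legacy_switch are hypothetical. legacy_switch stands for whatever command makes your old version control update the workspace to a given branch or tag and must be replaced by you; for CVS this might be something like cvs update -d -P -r TAG. Step 4 (ignoring build artifacts) is left out because it requires a build and a manual look at the result.

```shell
# Placeholder: replace the body with the command of your old version control.
legacy_switch() {
    echo "legacy_switch: not implemented for your version control yet" >&2
    return 1
}

# import_legacy_version LEGACY_NAME [PREDECESSOR_LEGACY_BRANCH]
import_legacy_version() {
    legacy_name=$1
    predecessor=$2
    dev=$(printf '%s' "$legacy_name" | tr '[:upper:]' '[:lower:]')

    # Step 1: start from the predecessor, if there is one.
    if [ -n "$predecessor" ]; then
        git checkout -q "$predecessor" || return 1
    fi

    # Step 2: create the legacy branch and switch to it.
    git checkout -q -b "legacy-$dev" || return 1

    # Step 3: let the old version control update the workspace, then commit.
    legacy_switch "$legacy_name" || return 1
    git status --short          # shown for a quick plausibility check
    git add --all
    if ! git diff --cached --quiet; then
        git commit -q -m "$legacy_name retrieved from old version control" || return 1
    fi

    # Step 5: create the branch for further development in Git.
    git branch "$dev"
}
```

With a working legacy_switch, the calls follow the relationship diagram from oldest to newest, for example import_legacy_version RELEASE2 followed by import_legacy_version RELEASE3 legacy-release2.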

Step 6: Check the result

You should check the result again. So that the metadata of the version control systems does not interfere, perform the comparison in temporary directories outside the workspace. The archive command can help. This command exports the file tree of a commit to an archive (tar or zip). Here, the current version of the release3 branch in Git is unpacked into a temporary directory named git-vcs, which must exist beforehand.

> mkdir -p /tmp/git-vcs
> git archive release3 | tar -x -f - -C /tmp/git-vcs

Next, export the version from the old version management, for example to /tmp/legacy-vcs. Now you can make a comparison, for example, with kdiff3. Except for the .gitignore file, there should be no differences.

> kdiff3 /tmp/git-vcs/ /tmp/legacy-vcs
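If you want to automate the check instead of inspecting it in a graphical tool, a recursive diff can do the same comparison. The following is a sketch; the function name compare_exports is hypothetical, and it assumes a diff that supports the -x option for excluding files, as GNU and BSD diff do.

```shell
# compare_exports GIT_BRANCH LEGACY_EXPORT_DIR
# Returns 0 if the branch and the legacy export contain the same files.
compare_exports() {
    branch=$1
    legacy_export=$2

    git_export=$(mktemp -d) || return 1
    git archive "$branch" | tar -x -f - -C "$git_export" || return 1

    # -r compares recursively, -q only names the files that differ,
    # -x leaves out .gitignore, which exists only on the Git side.
    diff -r -q -x .gitignore "$git_export" "$legacy_export"
    result=$?

    rm -rf "$git_export"
    return "$result"
}
```

For example, compare_exports release3 /tmp/legacy-vcs prints nothing and returns 0 when both sides match.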

Put the Repository into Operation

Our goal is to provide a transition that is as friction-free as possible.

Step 1: Announcement

Announce the migration in good time. The notice should contain the following information:

Introduction: Invite everybody to a meeting in which the normal work with Git is shown.

Installing the development environment: Briefly describe how to set up the development environment (Git and IDE plug-ins to install and configure), and how you can clone a project.

Freeze time: Ask everybody to commit all local changes to the old version control by a predetermined date and not to commit any new changes there from then on.

Resume date: When can people resume work with the new Git repository?

Emergency plan: Hotfix releases can still be carried out from the old version management during the transition phase. The changes must then be carried over to Git later. Make it clear that this step is mandatory. Otherwise, a bug that has already been fixed could be delivered again with a release from Git.

Step 2: Introduction

Now show them how to work with Git. For day to day operations, you only need a few commands. You can, for example, introduce the workflow “Developing on the Same Branch”. For demonstration you can just use a clone of the new repository. You can experiment to your heart’s content and then just throw away the clone.

Step 3: Get the recent changes

After the freeze point, you have to carry over all the changes that were made in the old repository since the Git repository was created. This is done in the dual-use workspace for each legacy branch. First switch to the legacy branch in Git.

> git checkout legacy-release3 

Then you switch to the old version management on the appropriate branch or tag, e.g. RELEASE3, and check if there have been any changes.

> git status 

If there have been changes, commit them to the legacy branch.

> git add --all

> git commit -m "updating legacy-release3 from old vcs" 

Thereafter, the changes can be merged into the branch used for further development in Git, here release3.

> git checkout release3

> git merge legacy-release3 

If no development has taken place in Git, there will be no merge conflict here.

In this way, you can also carry over changes if development has already begun in Git, if a developer has missed the freeze date, or if a hotfix had to be made in the old version management. In such cases, however, there may be merge conflicts that you must resolve manually.
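For several legacy branches, the catch-up can be sketched as a function. As before, this is an illustration under assumptions: catch_up and legacy_switch are hypothetical names, and legacy_switch has to be replaced with the command that makes your old version control update the workspace to the given branch or tag. The --no-edit option merely keeps the merge from opening an editor for the commit message.

```shell
# Placeholder: replace the body with the command of your old version control.
legacy_switch() {
    echo "legacy_switch: not implemented for your version control yet" >&2
    return 1
}

# catch_up LEGACY_NAME
catch_up() {
    legacy_name=$1
    dev=$(printf '%s' "$legacy_name" | tr '[:upper:]' '[:lower:]')

    # Switch to the legacy branch and update the files from the old system.
    git checkout -q "legacy-$dev" || return 1
    legacy_switch "$legacy_name" || return 1

    # Commit only if the old version control actually brought changes.
    if [ -n "$(git status --porcelain)" ]; then
        git add --all
        git commit -q -m "updating legacy-$dev from old vcs" || return 1
    fi

    # Merge into the development branch; stops here if there are conflicts.
    git checkout -q "$dev" || return 1
    git merge --no-edit "legacy-$dev"
}
```

If the merge reports conflicts, the function returns a non-zero status and leaves the workspace in the conflicted state, so that you can resolve the conflicts manually.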

Step 4: Publish the new repository

After the changes on all branches have been carried over, you can put the repository on the server. Then announce the URL and ask the developers to clone the repository and resume development (resume date).

Step 5: Build the product or perform a release

Now comes the time to build the current version of your product to ensure that you can now do a release without the old version management.

Step 6: Make the old repository read-only

As soon as you are able to perform a new release (or build a product) from the new repository, you should make the old repository read-only. It is then only used as an archive for the history of the project.

Step 7: Support the developers

Do not forget to set aside time to support the developers during the first few weeks. In particular, you should be prepared to help with resolving merge conflicts and with correcting local mistakes, for example by using the reset command or interactive rebasing.

Clean up

After the old repository has been retired, you can delete the legacy branches. This is best done in a freshly cloned workspace and not in the dual-use workspace, because no origin is linked in the dual-use workspace.

> git branch -d legacy-release3 

> git push origin :legacy-release3 
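If there are many legacy branches, the deletion on the server can be done in a loop. The following sketch assumes a fresh clone whose remote is called origin and legacy branches that follow the legacy- naming convention; the function name delete_legacy_branches is hypothetical. It uses push with --delete, which does the same as the colon notation shown above.

```shell
# Delete every legacy-* branch on the server. Run this in a fresh clone.
delete_legacy_branches() {
    # In a fresh clone the legacy branches exist as remote-tracking
    # branches, e.g. origin/legacy-release3.
    git for-each-ref --format='%(refname:short)' 'refs/remotes/origin/legacy-*' |
    while read -r ref; do
        # Strip the leading "origin/" to get the branch name on the server.
        git push origin --delete "${ref#origin/}"
    done
}
```

Before running it, you can list the affected branches with git branch -r and make sure that nothing else matches the pattern.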

Why Not the Alternatives?

Why Not Take over the Whole History?

In this workflow, only individual software versions that are to be developed further are taken. This has the disadvantage that you cannot see the old history in the new Git repository. The history remains in the old repository.

There are various tools (from Git as well as from independent projects) for importing a history. For example, the git cvsimport command can transfer the contents of a CVS repository to a Git repository. However, since the structure of a CVS repository is very different from that of a Git repository, the translation is not trivial, and the quality of the result may vary depending on how CVS was used previously. In any case, you should look at the import result very carefully before you continue working with it. You may have to rework it to make it fit.

This effort is the first reason why we did not go this route. Secondly, we wanted to show a migration path that is feasible regardless of which version control system you are migrating from.

Can We Get Rid of the Legacy Branches?

In the workflow, legacy branches are initially created that reflect the current states of branches and tags in the old version management. At the end of the workflow, these legacy branches are deleted. They serve only one purpose: fetching subsequent changes from the old repository. If you can interrupt development for a few days (for example, while the team is attending a training course or can work on something else), then you can certainly do without legacy branches and simplify the workflow a bit.

Can We Do without the Dual-Use Workspace?

In the dual-use workspace you can work with Git and the old version management at the same time. This facilitates the exchange of software versions: After getting the desired version, you can perform a commit in Git.

However, a dual-use workspace is not possible with every version control system. Where it is not possible, you could operate with two separate workspaces. Then you would have to copy the files back and forth, possibly by using a shell script or rsync. So, eliminating the dual-use workspace is possible, but it would make the migration more complex.
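With two separate workspaces, the copy step from the old workspace into the Git workspace might look like the following sketch. It assumes that rsync is installed and that the old version control is CVS, whose metadata lives in CVS/ directories; the function name sync_workspace is hypothetical. Adjust the exclude patterns to your system.

```shell
# sync_workspace LEGACY_WORKSPACE GIT_WORKSPACE
# Makes the Git workspace contain the same files as the legacy workspace.
sync_workspace() {
    legacy_ws=$1
    git_ws=$2

    # -a        copies recursively and preserves file attributes
    # --delete  removes files that no longer exist in the legacy workspace
    # --exclude keeps the listed names out of the transfer and protects
    #           them from deletion on the Git side
    rsync -a --delete \
        --exclude=.git --exclude=.gitignore --exclude=CVS/ \
        "$legacy_ws"/ "$git_ws"/
}
```

After the sync, git status in the Git workspace shows the changes, which can then be committed to the legacy branch as described in the workflow.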
