Chapter 8

A Cleaner History with Rebasing

A commit history with many branches is hard to follow. Git makes it possible to straighten out the history. The most important tool for this is the rebase command, which can move the effect of a commit to another place in the commit graph. You can use it:

  • if you accidentally committed to the wrong branch. It could be a bug fix that you committed to the main line of development (master) by mistake.
  • when multiple developers work on the same software and integrate their changes frequently. Without rebasing they would create a history of many small branches and merges (called a diamond necklace). With the rebase command you can create a smooth linear history instead.

The Principle: Copying of Commits

The rebasing principle is simple: Git takes the sequence of commits that you want to move and replays them on the target branch in exactly the same order. For each of the original commits this produces a copy with the same changeset, author, date, and commit message.

Note: At first glance it looks as if Git moves commits when rebasing. In fact, the “shifted” commits are always new commits with different commit hashes. This is important if other branches have already been created from the original commits.

Note: Because the new commits are recorded at a different point in the commit graph, the changes may not fit there and conflicts can occur. Such conflicts must then be resolved manually, just like merge conflicts.
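The following sketch makes the copying visible. It builds a throwaway repository, rebases a branch with one commit, and compares the commit before and after. The file names, the identity Demo, and the temporary directory are assumptions made only for this illustration.

```shell
set -eu
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"

echo base > base.txt;       git add base.txt;    git commit -q -m "A"
git checkout -q -b feature-a
echo feature > feature.txt; git add feature.txt; git commit -q -m "C"
git checkout -q master
echo more > master.txt;     git add master.txt;  git commit -q -m "B"

git checkout -q feature-a
old=$(git rev-parse HEAD)                  # hash of the original commit C
old_patch=$(git diff "$old^" "$old")       # its changeset
git rebase -q master
new=$(git rev-parse HEAD)                  # hash of the copy C'
new_patch=$(git diff "$new^" "$new")       # the copied changeset

echo "original: $old"
echo "copy:     $new"
# Author, author date, and message are carried over to the copy
git log -1 --format='%an, %ad, %s' "$old"
git log -1 --format='%an, %ad, %s' "$new"
```

The two hashes differ, while the changeset and the author information printed by the last two commands are the same.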

Avoiding the “Diamond Chain”

When several developers working on the same software incorporate changes frequently, a commit history that looks like a diamond chain will be created. With rebasing, you can create a linear history with equivalent content.

Figure 8.1: Diamond necklace

The example in Figure 8.2 shows how rebasing works. A branch called feature-a was created from the master branch and has two commits, C and D. In the meantime, master has been developed further and has one more commit, B.

Figure 8.2: Simple rebasing

You could now bring in the changes with git merge master, but that would add another junction to the history. To keep the history linear, use the rebase command instead. It takes one parameter: the branch whose latest changes will be brought into the active branch.

> # Branch "feature-a" is active 
> git rebase master 

Upon receiving this command, Git will rebase the active branch (feature-a) onto master by doing the following.

  • Which commits? Git determines which commits in the active branch feature-a are not yet in the target branch (master). In our example, C and D.
  • Where? Git determines the target commit, that is, the commit in master onto which feature-a will be rebased. In the example, B.
  • Copying the commits: Starting from the target commit, Git replays the changes of each commit to be rebased, creating the commits C' and D'.
  • Resetting the active branch to the copy: The active branch is moved to the last of the copied commits. In the example, D'.
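The four steps above can be followed with ordinary Git commands. This is only a sketch in a throwaway repository: the helper function commit and the file names are hypothetical, and the commands show what Git works out, not how it does so internally.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"
# Hypothetical helper: one commit that adds one file named after the commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b feature-a
commit C
commit D
git checkout -q master
commit B
git checkout -q feature-a

# Which commits? Everything in feature-a that is not yet in master
to_copy=$(git log --reverse --format=%s master..feature-a | paste -sd, -)
# Where? The current tip of master
target=$(git rev-parse master)

git rebase -q master

# The copies now sit directly on top of the target commit
base=$(git merge-base master feature-a)
copied=$(git log --reverse --format=%s master..feature-a | paste -sd, -)
echo "commits to copy: $to_copy"
echo "commits on top of master after the rebase: $copied"
```

After the rebase, the merge base of master and feature-a is the tip of master itself, which is another way of saying that the history has become linear.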

In many cases, you do not need to call the rebase command directly. Instead, you can use the --rebase option of the pull command, which fetches the changes from the remote repository and rebases your local commits onto them.
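A minimal sketch of pull with --rebase follows. It simulates two developers with a local bare repository standing in for the remote; the directory names origin.git, alice, and bob are assumptions chosen for the example.

```shell
set -eu
# Identity for all repositories in this sketch
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d); cd "$work"

# A bare repository plays the role of the remote
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master

# First developer creates the initial commit and publishes it
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
echo a > a.txt; git add a.txt; git commit -q -m "A"
git remote add origin "$work/origin.git"
git push -q origin master
cd ..

# Second developer clones, commits, and pushes
git clone -q origin.git bob
cd bob
echo b > b.txt; git add b.txt; git commit -q -m "B"
git push -q origin master
cd ../alice

# First developer has a local commit and pulls with --rebase
echo c > c.txt; git add c.txt; git commit -q -m "C"
git pull -q --rebase origin master

git log --oneline --graph
merges=$(git rev-list --merges --count HEAD)
subjects=$(git log --reverse --format=%s | paste -sd, -)
```

The local commit C is replayed on top of the fetched commit B, so the log shows a straight line without a merge commit.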

Note: The old commits C and D are still in the repository, even though they are no longer directly visible because the branch feature-a now points to D'. They can still be accessed using their hashes. Only after a garbage collection with the gc command will they disappear from the repository.
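The sketch below shows one way to reach the old commits after a rebase. Besides the hash, it uses the branch's reflog (feature-a@{1}), which the text above does not mention; the branch name old-feature-a and the helper commit are hypothetical.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"
# Hypothetical helper: one commit that adds one file named after the commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b feature-a
commit C
commit D
git checkout -q master
commit B
git checkout -q feature-a

old=$(git rev-parse feature-a)             # remember the hash of D
git rebase -q master                       # feature-a now points to D'

# The original commit is still in the object database
git cat-file -t "$old"
git log -1 --format='%h %s' "$old"

# The reflog of the branch remembers the position before the rebase
previous=$(git rev-parse 'feature-a@{1}')

# A new branch makes the old commits visible again
git branch old-feature-a "$old"
```

By default Git keeps reflog entries for some time, so the originals normally survive a gc that runs right after the rebase.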

And When It Comes to Conflicts?

Just like the merge command, the rebase command may stop with conflicts when changes do not fit together. There is one important difference, however: When merging, the result is a single commit that combines the changes from both branches. When rebasing, several commits are re-created one after another. If everything goes smoothly, the content of the last commit will be the same as the result of the merge command, because Git uses the same algorithms to combine the changes in both commands. But if the rebase command encounters a conflict, the process is interrupted and the affected files contain conflict markers. You can clean up the files manually or with a merge tool and then add them to the staging area. After that, you can call the rebase command with the --continue option to proceed.

> git add foo.txt
> git add bar.txt
> git rebase --continue

You can also cancel the rebase command using the --abort option, which restores the branch to its state before the rebase. Alternatively, you can skip the conflicted commit using the --skip option. The commit will then simply be omitted, i.e. its changes will not appear on the new branch.

Warning! Unlike in a merge, in an interrupted rebase some of the copied commits may already have been applied.
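The following sketch provokes a conflict on purpose and then resolves it as described above. The file contents are invented for the example, and GIT_EDITOR=true is set only so that the script does not wait for an editor when the rebase continues.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"

echo "original line" > foo.txt; git add foo.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo "feature version" > foo.txt; git commit -q -am "C"
git checkout -q master
echo "master version" > foo.txt;  git commit -q -am "B"
git checkout -q feature-a

# Both branches changed the same line, so the rebase stops with a conflict
if ! git rebase master >/dev/null 2>&1; then
  git status --short                       # foo.txt is reported as unmerged
  echo "resolved version" > foo.txt        # clean up the file by hand
  git add foo.txt                          # mark the conflict as resolved
  GIT_EDITOR=true git rebase --continue >/dev/null 2>&1
fi

git log --oneline --graph
```

Calling git rebase --abort inside the if block instead of resolving the file would return feature-a to the original commit C.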

Transplanting A Branch

Sometimes you have already created a branch and made the first commits on it, and only then realize that it should have started from a different place. With the --onto option you can transplant a branch to another location in the commit graph.

Figure 8.3: Moving a branch

In the example in Figure 8.3, the feature-a branch is transplanted to release1.

> # Branch "feature-a" is active
> git rebase master --onto release1 

The first parameter to the rebase command specifies the original branch (in this case, master). Git then determines all commits in the active branch (feature-a) that are not in the original branch (in this example, E and F). These commits will then be copied to the location indicated by the --onto option (in this case, release1).

Note: The origin for the rebase command does not have to be a branch. It can also be any commit.
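A sketch of such a transplant in a throwaway repository follows. The commit layout is an assumption modeled on the description above (Figure 8.3 itself is not reproduced here), and the helper commit is hypothetical. The call is written with --onto first, which is the order used in Git's own documentation.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"
# Hypothetical helper: one commit that adds one file named after the commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b release1                # release branch starts at A
commit R
git checkout -q master
commit B
git checkout -q -b feature-a               # feature branch starts at B
commit E
commit F

# Copy the commits of feature-a that are not in master onto release1
git rebase -q --onto release1 master

moved=$(git log --reverse --format=%s release1..feature-a | paste -sd, -)
base=$(git merge-base release1 feature-a)
echo "commits on top of release1: $moved"
```

Only E and F are copied. Commit B belongs to master, so its file B.txt is no longer part of feature-a after the transplant.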

What Happens to the Original Commits after Rebasing?

Commits are copied during rebasing. The originals (in this example, C and D) can still be accessed using their hashes. Normally, when no other branches have been created from these commits, a later garbage collection (using the gc command) will simply remove them from the repository.

Figure 8.4: The old and new branches after rebasing

Why Is It Problematic to Have the Original and Copy Commits in the Same Repository?

Duplicates make the repository confusing. They can easily cause misunderstandings as to which branches contain a given code change and which do not. Normally, git log HEAD..a-branch shows which commits in a-branch are not yet in the current branch. If there are duplicates, this output is misleading: a commit may be listed even though the current branch already contains the same change as a copy. This complicates reviews and quality assurance.

In addition, there may be difficulties later when the branch with the duplicate commit is merged with the branch with the original commit. In the best case, Git recognizes that the same change occurs more than once and applies it only once. In the worst case, if the duplicate commit was modified to resolve a conflict, Git does not detect this and tries to apply the change several times. This results in unexpected conflicts for the user.
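The confusion can be reproduced in a few lines. The sketch below creates a duplicate with cherry-pick and then asks git log which commits are missing. It also uses git cherry, a command that the text above does not cover, to mark commits whose change already exists in the current branch; the helper commit and the file names are hypothetical.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"
# Hypothetical helper: one commit that adds one file named after the commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b a-branch
commit X
git checkout -q master
commit B
git cherry-pick a-branch >/dev/null        # duplicate of X on master

# git log still lists X as missing from the current branch ...
listed=$(git log --format=%s HEAD..a-branch)
echo "reported as missing: $listed"
# ... although the change itself is already there
ls X.txt

# git cherry compares changesets: a leading "-" marks a commit
# whose change is already contained in the current branch
git cherry HEAD a-branch
```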

Once you have transferred a commit to a remote repository, you should not move the commit with the rebase command. Otherwise, other developers could continue to work with the originals and problems will occur when it is time to merge the changes again.

Cherry-Picking

There is another way to copy a commit, namely by using the cherry-pick command. You specify which commit you want to have, and Git will create a new commit with the same changeset and metadata in the current branch.

> git cherry-pick 23ec70f6b0 

Here are the things you should know about cherry-picking:

  • cherry-pick does not take the history into account. merge and rebase can still classify renamed and moved files correctly. cherry-pick cannot.
  • Cherry-picking is sometimes used to transfer small bug fixes to various release versions.
  • Another application is to transfer useful changes from a feature branch that you want to remove.
  • Warning: Cherry-picking can lead to the problems with duplicate commits described above.
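As an illustration of the bug-fix use case, the following sketch copies a fix from master to a release branch. The branch name release1, the commit names, and the helper commit are assumptions made for the example.

```shell
set -eu
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Demo"
git config user.email "demo@example.com"
# Hypothetical helper: one commit that adds one file named after the commit
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit A
git checkout -q -b release1
commit R                                   # work that exists only on the release
git checkout -q master
commit fix                                 # the bug fix lands on master first
fix=$(git rev-parse HEAD)

# Copy the fix to the release branch
git checkout -q release1
git cherry-pick "$fix" >/dev/null
copy=$(git rev-parse HEAD)

# Same author and message, different hash
git log -1 --format='%h %an %s' "$fix"
git log -1 --format='%h %an %s' "$copy"
```

The copy sits on top of R in release1, while the original stays on master, so the repository now contains the same change twice.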

Summary

  • Rebasing: Git can copy commits to other places in the commit graph. The changes and metadata (author, date) remain the same, but there will be a new commit hash. You can use the rebase command to rebuild the commit graph in many ways.
  • Just before the push: Normally you should use the rebase command only on commits that have not been transferred to other repositories. Otherwise, it could later lead to nasty merge conflicts.
  • Smoothing out the history: If you integrate changes from parallel development with the merge command, you will get a history with many branches and merges. If you use rebase instead of merge, you will get a linear history.
  • Conflicts during rebasing: Git replays the commits to be copied one at a time. If there is a conflict because the changes do not fit the workspace, the process will be interrupted. As with merge, the developer can resolve the conflict manually and continue the rebasing.
  • rebase --onto: With this option, you can move a branch to a completely different location in the commit graph.