Chapter 8
A Cleaner History with Rebasing
A commit history with many branches can be hard to read. Git makes it possible to straighten out the history. The most important tool for this is the rebase command, which can move the effect of a commit to another place in the commit graph.
The Principle: Copying of Commits
The rebasing principle is simple: Git takes a sequence of commits that you want to move and replays them on the target branch in exactly the same order. For each original commit, this produces a copy with the same changeset, author, date and comment.
Note: At first glance it looks as if Git moves commits when rebasing. In fact, the “shifted” commits are always new commits with different commit hashes. This matters if other branches have already branched off from the original commits.
Note: Because the new commits are recorded at different points in the commit graph, conflicts can naturally occur when the changes do not fit there. Such conflicts must then be resolved manually, just like merge conflicts.
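The copy principle can be tried out in a throwaway repository. The following is a minimal sketch, not part of the original example: the directory, file names and the author identity are assumptions made up for illustration. It records the hash and metadata of a commit before a rebase and compares them with the copy afterwards.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name regardless of local defaults
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo base > base.txt; git add base.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo feature > feature.txt; git add feature.txt; git commit -q -m "C"
git checkout -q master
echo more > more.txt; git add more.txt; git commit -q -m "B"

# Remember hash and metadata (author, author date, subject) of C.
git checkout -q feature-a
old_hash=$(git rev-parse HEAD)
old_meta=$(git log -1 --format='%an|%ad|%s')

git rebase -q master

# The tip of feature-a is now the copy of C.
new_hash=$(git rev-parse HEAD)
new_meta=$(git log -1 --format='%an|%ad|%s')

echo "original: $old_hash  $old_meta"
echo "copy:     $new_hash  $new_meta"
```

The two printed lines should show different hashes but identical author, date and subject.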
Avoiding the “Diamond Chain”
When several developers working on the same software incorporate changes frequently, a commit history that looks like a diamond chain will be created. With rebasing, you can create a linear history with equivalent content.
Figure 8.1: Diamond necklace
The example in Figure 8.2 shows how rebasing works. A branch called feature-a was branched off from the master branch and has two commits, C and D. In the meantime, master has been developed further and has one more commit, B.
Figure 8.2: Simple rebasing
Now you could merge the changes with git merge master, or you can keep the history straight by using the rebase command instead. This command takes a parameter: the branch whose latest changes will be brought into the active branch.
> # Branch "feature-a" is active
> git rebase master
Upon receiving this command, Git rebases the active branch (feature-a) onto master: it determines the commits in feature-a that are not in master (C and D), copies them in the same order onto the tip of master as C' and D', and then lets feature-a point to D'.
In many cases, you should not call the rebase command directly. Instead, you should use the --rebase option with the pull command, which rebases your local changes onto the changes fetched from the remote repository.
Note: Incidentally, the old commits C and D are still in the repository, even though they are no longer directly visible because the branch feature-a now points to D'. C and D can still be accessed using their hashes. Only after a garbage collection with the gc command will they disappear from the repository.
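The situation from Figure 8.2 can be reproduced with a few commands. This is a sketch under assumptions: the temporary directory, the file names and the identity settings are invented for the demonstration. It checks that the rebased branch contains no merge commit and that the old commit D can still be reached by its hash.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo c > c.txt; git add c.txt; git commit -q -m "C"
echo d > d.txt; git add d.txt; git commit -q -m "D"
old_d=$(git rev-parse HEAD)               # hash of the original D
git checkout -q master
echo b > b.txt; git add b.txt; git commit -q -m "B"

# Branch "feature-a" is active
git checkout -q feature-a
git rebase -q master

git log --graph --oneline feature-a       # one straight line, no diamonds
merges=$(git rev-list --merges --count feature-a)
subjects=$(git log --reverse --format=%s feature-a | paste -s -d ' ' -)
old_type=$(git cat-file -t "$old_d")      # the original D is still an object in the repository

echo "merge commits: $merges, history: $subjects, old D is a: $old_type"
```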
And When It Comes to Conflicts?
Just like the merge command, the rebase command may stop with conflicts when changes do not fit together. There is one important difference, however. A merge produces a single commit that combines the changes from both branches. A rebase recreates several commits one after another. If everything goes smoothly, the contents of the last commit will be the same as the result of the merge command, because Git uses the same algorithms to combine changes in both commands. But if the rebase command encounters a conflict, the process is interrupted and the affected files are decorated with conflict markers. You can clean up the files manually or with a merge tool and then add them to the staging area. After that, run the rebase command with the --continue option to proceed.
> git add foo.txt
> git add bar.txt
> git rebase --continue
You can also cancel the rebase using the --abort option. Alternatively, you can skip the conflicting commit using the --skip option. The commit is then simply omitted, i.e. its changes will not appear on the new branch.
Warning! Unlike an interrupted merge, an interrupted rebase may already have applied some of the copied commits.
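An interrupted rebase can be provoked on purpose in a scratch repository. The sketch below is illustrative only: the repository, the file contents and the chosen resolution are assumptions. Both branches change foo.txt, so the rebase stops. The file is then resolved, staged, and the rebase continued. GIT_EDITOR=true is set so that the commit message is accepted without opening an editor.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo "line from A" > foo.txt; git add foo.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo "line from feature-a" > foo.txt; git commit -q -am "C"
git checkout -q master
echo "line from master" > foo.txt; git commit -q -am "B"

git checkout -q feature-a
# The rebase exits with a non-zero status because B and C both changed foo.txt.
if git rebase master >/dev/null 2>&1; then
    echo "unexpected: rebase finished without a conflict"
else
    status_during=$(git status --porcelain foo.txt)   # "UU" marks an unmerged file
    echo "rebase stopped: $status_during"
    # Resolve by hand (here: simply write the desired content), then stage the file.
    echo "line from master and feature-a" > foo.txt
    git add foo.txt
    GIT_EDITOR=true git rebase --continue >/dev/null
fi

git log --oneline feature-a
```

Running git rebase --abort instead of the resolution steps would return feature-a to its state before the rebase.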
Transplanting A Branch
Sometimes you have already created a branch and made its first commits before realizing that it should start from a different place. With the --onto option you can transplant a branch to another location in the commit graph.
Figure 8.3: Moving a branch
In the example in Figure 8.3, the feature-a branch is transplanted to release1.
> # Branch "feature-a" is active
> git rebase master --onto release1
The first parameter to the rebase command specifies the original branch (in this case, master). Git then determines all commits in the active branch (feature-a) that are not in the original branch (in this example, E and F). These commits will then be copied to the location indicated by the --onto option (in this case, release1).
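The transplant can be replayed in a scratch repository. This is a sketch under assumptions: the commit R on release1, the file names and the identity are invented, and the command is written in the order used in the Git documentation (--onto first), which is equivalent to the form shown above. It first lists the commits that will be moved and then checks that B from master did not travel along.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "A"
git branch release1                                   # release1 starts at A
echo b > b.txt; git add b.txt; git commit -q -m "B"   # master moves on
git checkout -q -b feature-a                          # feature-a starts at B
echo e > e.txt; git add e.txt; git commit -q -m "E"
echo f > f.txt; git add f.txt; git commit -q -m "F"
git checkout -q release1
echo r > r.txt; git add r.txt; git commit -q -m "R"

# Branch "feature-a" is active
git checkout -q feature-a
to_move=$(git log --reverse --format=%s master..feature-a | paste -s -d ' ' -)
echo "commits to be moved: $to_move"

git rebase -q --onto release1 master

after=$(git log --reverse --format=%s feature-a | paste -s -d ' ' -)
echo "feature-a now contains: $after"
```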
Step by Step
Moving a branch
A branch has to be moved to another location in the commit graph.
1. If necessary, change to the branch to be moved
> git checkout the-branch
2. Determine the origin
Identify the origin branch, that is, the branch from which the branch to be moved has branched off. Git will move all commits that are not in the origin branch.
3. Check what will be moved
It is advisable to check in advance which commits will be affected, because a mistaken rebase can lead to a very confusing situation in the repository.
> git log origin..the-branch
4. Determine the target point
Choose the target branch onto which the branch should be rebased.
5. Perform the rebasing
> git rebase origin --onto target
Note: The origin for the rebase command does not have to be a branch. It can also be any commit.
What Happens to the Original Commits after Rebasing?
Commits are copied during rebasing. The originals (in this example, C and D) can still be accessed using their hashes. Normally, when no other branches have branched off from these commits, a later garbage collection (using the gc command) will simply remove them from the repository.
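If the hash of an original commit was not written down, the branch's reflog usually still knows it. The following sketch assumes a scratch repository with invented names and assumes that reflogs are enabled, which is the default for ordinary working repositories. It looks up the previous tip of feature-a as feature-a@{1} and attaches a new branch to it so that the originals stay reachable.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo c > c.txt; git add c.txt; git commit -q -m "C"
echo d > d.txt; git add d.txt; git commit -q -m "D"
old_tip=$(git rev-parse feature-a)        # original D, noted only for comparison
git checkout -q master
echo b > b.txt; git add b.txt; git commit -q -m "B"

git checkout -q feature-a
git rebase -q master

# feature-a@{1} is the value the branch had before its last update (the rebase).
prev=$(git rev-parse 'feature-a@{1}')
git log --oneline -2 "$prev"              # shows the original D and C

# Optional: a branch on the old tip keeps the originals from being collected.
git branch feature-a-old "$prev"
```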
Figure 8.4: The old and new branches after rebasing
Why Is It Problematic to Have the Original and Copy Commits in the Same Repository?
Duplicates make the repository confusing. They can easily cause misunderstandings as to which branches a given code change is included in and which branches it is not included in. Normally, git log HEAD..a-branch shows which commits in a-branch are not yet in the current branch. If there are duplicates, a commit may be listed even though the current branch already contains the same code change. This will complicate reviews and quality assurance.
In addition, there may be difficulties later when the branch with the duplicate commit is merged with the branch with the original commit. In the best case, Git recognizes that the same change occurred more than once and applies it only once. In the worst case, if the duplicate commit was modified to resolve a conflict, Git does not detect this and tries to apply the change several times. This results in unexpected conflicts for the user.
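Git offers some help in spotting unmodified duplicates. The sketch below goes beyond the text above and uses two other Git features, the --cherry-pick option of git log and the git cherry command, which compare commits by their changes instead of their hashes. The repository and all names in it are assumptions for illustration.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Author"
git config user.email "demo@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b a-branch
echo fix > fix.txt; git add fix.txt; git commit -q -m "X"
git checkout -q master
echo b > b.txt; git add b.txt; git commit -q -m "B"
git cherry-pick a-branch >/dev/null       # master now holds a duplicate of X

# The plain log still lists X, although master already contains the change.
plain=$(git log --format=%s HEAD..a-branch)

# --cherry-pick leaves out commits whose change also exists on the other side.
filtered=$(git log --format=%s --cherry-pick --right-only HEAD...a-branch)

# git cherry marks commits that have an equivalent in the first branch with "-".
marker=$(git cherry HEAD a-branch | cut -c1)

echo "plain: [$plain]  filtered: [$filtered]  marker: [$marker]"
```

This only works as long as the duplicate carries the same change. A duplicate that was edited to resolve a conflict is not recognized this way.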
Once you have transferred a commit to a remote repository, you should not move the commit with the rebase command. Otherwise, other developers could continue to work with the originals and problems will occur when it is time to merge the changes again.
Cherry-Picking
There is another way to copy a commit, namely by using the cherry-pick command. You specify which commit you want to have, and Git will create a new commit with the same changeset and metadata in the current branch.
> git cherry-pick 23ec70f6b0
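A small experiment shows what the copy looks like. This is a sketch with invented names: the author "Alice Example", the files and the commit messages are assumptions. Only one of two commits on feature-a is picked, and the copy on master is compared with the original.

```shell
# Throwaway repository; all names are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo Committer"
git config user.email "demo@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "A"
git checkout -q -b feature-a
echo fix > fix.txt; git add fix.txt
git commit -q --author="Alice Example <alice@example.com>" -m "Fix typo"
wanted=$(git rev-parse HEAD)              # the commit we want to copy
echo wip > wip.txt; git add wip.txt; git commit -q -m "Work in progress"

git checkout -q master
echo b > b.txt; git add b.txt; git commit -q -m "B"
git cherry-pick "$wanted" >/dev/null
copy=$(git rev-parse HEAD)

orig_meta=$(git log -1 --format='%an|%ad|%s' "$wanted")
copy_meta=$(git log -1 --format='%an|%ad|%s' "$copy")

echo "original: $wanted  $orig_meta"
echo "copy:     $copy  $copy_meta"
ls                                        # fix.txt is present, wip.txt is not
```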
Here are the things you should know about cherry-picking:
Summary