Chapter 7

Merging Branches

One of the most important Git operations is merging branches with the merge command. The underlying algorithms are complex, but the call is simple: you specify the name of the branch whose changes are to be integrated. Git then creates a new commit that contains the merged content.

Figure 7.1 shows an example: While some developers keep working on a branch named feature, another developer has just fixed an error on the master branch (commit E). Shortly thereafter, the feature is completed and is to be delivered as well. The next version on the master branch should contain both the fix and the new feature. The merge command brings the two branches together; the result is a merge commit (in this case F) that has two predecessors (D and E).

> # on the branch "master"
> git merge feature

Figure 7.1: Merging branches
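
The scenario can be tried out in a throwaway repository. The following is a minimal sketch, not the book's own example: the file names, commit messages and the user identity are assumptions made for illustration, and only the branch names and commit letters follow the description above.

```shell
# Throwaway repository; file names and identity are illustrative assumptions.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt && git commit -q -m "A: initial commit"

# Work on the feature branch.
git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt && git commit -q -m "D: add feature"

# Meanwhile, a fix lands on master.
git checkout -q master
echo "fix" > fix.txt
git add fix.txt && git commit -q -m "E: fix error"

# Merge feature into master; -m supplies the message so no editor is opened.
git merge -q -m "F: merge feature" feature

# "%P" prints the parent hashes of a commit; a merge commit has two of them.
parent_count=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "parents of merge commit: $parent_count"
```

After the merge, both fix.txt and feature.txt are present on master, and the newest commit lists two parents.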

What Happens during a Merge?

One of the objectives of Git is to make collaboration of distributed developers as easy as possible. Therefore, to a large extent the merge command merges branches automatically, without user interaction. But how is that possible?

Figure 7.2 shows two different versions of a file, in branch a and branch b, respectively. It is pretty easy to see which lines are different. But which variant is the right one? “Freitag” or “Montag”? “Git” or “Fit”? How should the merge algorithm decide?

The key often lies in the commit history. The trick is to find the last common ancestor. In somewhat simplified terms this is the point where the paths of the branches have separated. If you compare the original version with the variations in the branches, the picture will become clearer.

Figure 7.2: Two versions: Which one is correct?

Figure 7.3: 3-way view

In the example in Figure 7.3 you can see that in the first line “Freitagabend” was replaced by “Montagabend” in branch b. In branch a, the first line was not changed. This is a strong indication that you should take “Montagabend” when merging the two branches. In the same way, it is safe to conclude that you should take “Git” and not “Fit” in the last line. Figure 7.4 shows the result.

Figure 7.4: Merge result
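
The 3-way idea can be reproduced on plain files with the git merge-file command, which needs no repository. This is only a sketch: the file contents below are assumptions (the text of the figures is not reproduced here), and it is assumed that branch a corrected the last line while branch b changed the first one.

```shell
work=$(mktemp -d) && cd "$work" || exit 1

# Hypothetical contents: common ancestor, version in branch a, version in branch b.
printf 'Freitagabend\nline 2\nline 3\nline 4\nFit\n' > base.txt
printf 'Freitagabend\nline 2\nline 3\nline 4\nGit\n' > a.txt   # a changed the last line
printf 'Montagabend\nline 2\nline 3\nline 4\nFit\n'  > b.txt   # b changed the first line

# Arguments: current version, common ancestor, other version.
# -p prints the merged result instead of overwriting a.txt.
merged=$(git merge-file -p a.txt base.txt b.txt)
merge_status=$?

echo "$merged"
```

Because each line was changed on one side only, the merge succeeds without conflicts and the result starts with “Montagabend” and ends with “Git”.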

In practice, it is not always easy to find the common ancestor, because the commit graph may contain more than one candidate. For this reason, Git implements several merge algorithms. The default has long been the recursive algorithm (recent Git versions use its successor, ort). The classic 3-way and the “octopus” algorithms are also implemented. “Octopus” can bring together many branches simultaneously.

Conflicts

Git is very good at merging changes in program source code when several developers have made changes in multiple places in the software. This often works even if the affected files have been moved or renamed. Unfortunately, there are still conflicts that Git cannot resolve automatically.

  • Edit conflicts occur when two developers have changed the same lines of code differently and Git cannot decide which of the two is the correct change.
  • Content conflicts occur when two developers have changed different parts of the code in ways that do not fit together. This happens, for example, when one developer changes a function while another developer changes code that depends on that function at the same time.

Edit Conflicts

If Git is unable to resolve a conflict, it will display an error message.

> git merge one-branch 
Auto-merging foo.txt 
CONFLICT (content): Merge conflict in foo.txt 
Automatic merge failed; fix conflicts and then commit the result.

The following is what happens:

  1. Git has not created a commit. Normally Git creates a commit automatically after a merge. In the event of a conflict, you must first resolve the problem and then create a commit manually.
  2. In .git/MERGE_HEAD there is the commit hash of another branch.
  3. The files in the workspace reflect the merge result.
  4. Conflict-free merged changes are logged in the staging area, ready for the next commit.
  5. Conflict markers are inserted.
  6. The points of conflict are not yet registered for the next commit.

The status command now displays the files that were automatically merged in the section “Changes to be committed”. In the section “Unmerged paths” it shows the files that the user must manually edit.

> git status
# On branch master 
# You have unmerged paths. 
#   (fix conflicts and run "git commit") 
#
# Changes to be committed: 
# 
# modified:   blah.txt 
#
# Unmerged paths: 
#   (use "git add <file>..." to mark resolution) 
# 
# both modified:       foo.txt
#
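
The state described above can be inspected from a script as well. The sketch below provokes a conflict in a throwaway repository and then queries MERGE_HEAD, the unmerged paths and the staged files. The file contents and the user identity are assumptions; the file and branch names mirror the example output.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'to mountains\n' > foo.txt
printf 'one\n' > blah.txt
git add . && git commit -q -m "base"

# The other branch changes both files.
git checkout -q -b one-branch
printf 'for swimming\n' > foo.txt
printf 'two\n' > blah.txt
git commit -q -am "one-branch: change foo and blah"

# master changes the same line of foo.txt differently.
git checkout -q master
printf 'to the valley\n' > foo.txt
git commit -q -am "master: change foo"

# The merge stops with a conflict, so its exit status is non-zero.
git merge one-branch > /dev/null 2>&1
merge_status=$?

# MERGE_HEAD records the commit that is being merged in.
merge_head=$(git rev-parse MERGE_HEAD)
other_tip=$(git rev-parse one-branch)

# Unmerged paths are held in the index with stage numbers 1, 2 and 3.
unmerged=$(git ls-files -u | cut -f 2 | sort -u)

# Cleanly merged changes are already staged.
staged=$(git diff --cached --name-only --diff-filter=M)

echo "unmerged: $unmerged, staged: $staged"
```

Here foo.txt is reported as unmerged and contains conflict markers, while blah.txt was merged cleanly and is already in the staging area.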

Conflict Markers

A conflict marker shows both variants. First come the lines as they looked on the current branch (HEAD). Below them, after the separator, you can see how they look on the other branch (MERGE_HEAD, here one-branch):

In the early morning dew
<<<<<<< HEAD 
to the valley 
======= 
for swimming
>>>>>>> one-branch 
We’re going. Fallera!;

For historical reasons, the common ancestor is not displayed by default. However, you can configure the 3-way format:

> git config merge.conflictstyle diff3

An edit conflict will then be represented as follows:

In the early morning dew 
<<<<<<< HEAD
to the valley
||||||| merged common ancestors
to mountains
=======
for swimming
>>>>>>> one-branch
We’re going Fallera!;
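
The 3-way marker format can also be produced outside a repository with git merge-file and its --diff3 option. The following is a small sketch with assumed file names; the -L options only set the labels printed after the markers and are chosen here to resemble the output above.

```shell
work=$(mktemp -d) && cd "$work" || exit 1

# Hypothetical files: common ancestor, our version, their version.
printf 'In the early morning dew\nto mountains\nWe are going.\n'  > base.txt
printf 'In the early morning dew\nto the valley\nWe are going.\n' > ours.txt
printf 'In the early morning dew\nfor swimming\nWe are going.\n'  > theirs.txt

# --diff3 adds the ancestor section between the two variants.
# merge-file exits with the number of conflicts, so a non-zero status is expected.
result=$(git merge-file -p --diff3 -L HEAD -L ancestor -L one-branch \
         ours.txt base.txt theirs.txt)
conflicts=$?

echo "$result"
echo "conflicts: $conflicts"
```

The section between the ||||||| line and the ======= line shows the text of the common ancestor, which often makes it easier to judge what each side intended.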

Resolving Edit Conflicts

The most convenient way to resolve edit conflicts is usually a merge tool, such as kdiff3. You start the merge tool using the mergetool command.

> git mergetool

Here you can resolve the conflicts, save the changes and terminate the application. Afterward, the merged changes will be in the staging area and can be confirmed with a commit.

For binary files, there is no textual conflict marker. Here you have to look at the original versions. Three versions of the file play a role in the conflict: the version on the current branch (ours), the version on the other branch (theirs), and the last common ancestor of these two branches (ancestor). The show command can be used to retrieve these versions.

> git show :1:picture.png >ancestor.png
> git show :2:picture.png >ours.png
> git show :3:picture.png >theirs.png

Merge and diff tools usually also show changes in the whitespace. If, for example, a developer has replaced tabs with spaces, all lines will be marked, although the content has probably not changed. The tools usually come with an option to ignore whitespace changes. You should use this option.

It is even better of course if all developers use the same automatic source code formatter, which would then rule out formatting as a source of conflict.

Accidents happen! If you make a mistake during a merge or while trying to resolve a conflict, you should not continue. Instead, you should explicitly cancel the merge so that there is no trace of merging in the workspace, and so that Git will not mark the next commit as a merge commit.

A merge can be canceled using the reset command:

> git reset --merge 
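
The effect of canceling can be checked in a throwaway repository. The sketch below provokes a conflict, cancels the merge and then verifies that no merge is in progress anymore. The file contents and the user identity are assumptions made for illustration.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'to mountains\n' > foo.txt
git add foo.txt && git commit -q -m "base"

git checkout -q -b one-branch
printf 'for swimming\n' > foo.txt
git commit -q -am "one-branch: change foo"

git checkout -q master
printf 'to the valley\n' > foo.txt
git commit -q -am "master: change foo"

# This merge stops with a conflict in foo.txt.
git merge one-branch > /dev/null 2>&1

# Cancel the merge.
git reset -q --merge

# MERGE_HEAD is gone, so the next commit will not become a merge commit.
if git rev-parse -q --verify MERGE_HEAD > /dev/null; then
  in_merge=yes
else
  in_merge=no
fi

# The workspace is clean again and foo.txt has its pre-merge content.
dirty=$(git status --porcelain)
content=$(cat foo.txt)
echo "merge in progress: $in_merge, foo.txt: $content"
```

Newer Git versions also offer git merge --abort for the same purpose.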

What about the Content Conflict?

The real problem is the content conflict, because Git does not recognize it and certainly cannot resolve it automatically. The real danger arises when the merge command produces a seemingly valid merge commit even though there is a content conflict.

Attention! Even if all merged versions are correct and Git has reported no edit conflicts, a merge commit may be broken!

If you want to prevent content conflicts from messing with software versions, you should do more:

  • Protection through automated tests: If these are carried out regularly and you have good coverage, you will discover content conflicts quickly.
  • Assertions, pre- and post-conditions: The more assertions you check explicitly, the sooner you will recognize problems.
  • Clear interfaces, loose coupling: The cleaner the architecture is at this point, the less likely surprising side effects of code changes in different places will creep in.
  • Static type checking: If your programming language supports this, problems caused by signature changes will be detected at compile time.
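
A content conflict is easy to provoke. In the following sketch, one branch renames a function while the other adds a new caller of the old name in a different file. All file, function and branch names are hypothetical. Git merges the two branches without complaint, and only running the code reveals the problem.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Base version: a tiny library and one caller.
printf 'greet() { echo "hello"; }\n' > lib.sh
printf '. ./lib.sh\ngreet\n' > main.sh
git add . && git commit -q -m "base"

# Branch "rename" renames the function and updates the existing caller.
git checkout -q -b rename
printf 'say_hello() { echo "hello"; }\n' > lib.sh
printf '. ./lib.sh\nsay_hello\n' > main.sh
git commit -q -am "rename greet to say_hello"

# Meanwhile, master gains a new caller that still uses the old name.
git checkout -q master
printf '. ./lib.sh\ngreet\n' > extra.sh
git add extra.sh && git commit -q -m "add extra.sh"

# Different files were touched, so Git reports no conflict ...
git merge -q -m "merge rename" rename
merge_status=$?

# ... but the merged version is broken: extra.sh calls a function that is gone.
sh extra.sh > /dev/null 2>&1
test_status=$?

echo "merge exit status: $merge_status, test exit status: $test_status"
```

The last command plays the role of an automated test here: it is the only step that notices the broken merge commit.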

Incidentally, it is valid to specify multiple branches to be merged in the merge command. This is what we call an octopus merge.

Fast-Forward Merges

Often the following occurs: There are several branches, but work continues only on one branch. In the project in Figure 7.5, the developers have been developing on a-branch, and nothing has happened on b-branch. When you merge a-branch into b-branch, Git keeps it simple: It just moves the branch pointer forward. There will be no merge commit. This is called a fast-forward merge.

> git checkout b-branch 
> git merge a-branch
Updating 9d4caed..9332b08 
Fast-forward 
 foo.txt   |   2 +- 
1 files changed, 1 insertions(+), 1 deletions(-)

Figure 7.5: Fast-forward merge

The advantage of a fast-forward merge is that the history remains simple and linear. The disadvantage is that you cannot see in the history that a merge has occurred. Because of this disadvantage, some workflows in this book use the --no-ff option to force Git to create a merge commit (see Figure 7.6).

> git merge --no-ff a-branch 

Figure 7.6: No fast-forward merge
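
The difference between the two variants shows up in the number of parents of the resulting commit. The following sketch performs the same merge twice in a throwaway repository, first as a fast-forward and then with --no-ff. The file contents and the user identity are assumptions; the branch names follow the example.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "one" > foo.txt
git add foo.txt && git commit -q -m "base"

# Both branches start at the same commit; work happens only on a-branch.
git branch b-branch
git checkout -q -b a-branch
echo "two" > foo.txt
git commit -q -am "work on a-branch"

# Variant 1: fast-forward. b-branch simply moves to the tip of a-branch.
git checkout -q b-branch
git merge -q a-branch
ff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
if [ "$(git rev-parse HEAD)" = "$(git rev-parse a-branch)" ]; then
  same_tip=yes
else
  same_tip=no
fi

# Undo the fast-forward, then variant 2: force a merge commit.
git reset -q --hard HEAD~1
git merge -q --no-ff -m "merge a-branch" a-branch
noff_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

echo "fast-forward: $ff_parents parent(s), --no-ff: $noff_parents parent(s)"
```

After the fast-forward, both branches point to the same commit, which has a single parent. With --no-ff, b-branch receives a new commit with two parents.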

First-Parent History

A merge commit usually has two predecessors, although an octopus merge can have more. In the following example, the two preceding commits are ed1c70e and f1d55be.

> git log --merges
commit 7f3eae07c42df05f894fdd4754e38ab9e66a5051
Merge: ed1c70e f1d55be
Author: ...

The first specified commit in the example above (ed1c70e) is called the first parent. It is the commit that was HEAD when the merge was performed. This indicates where the merge has occurred.

If all developers are working on the same branch, then when and where merges are performed is more or less arbitrary. In this case, which commit is the first parent is of little interest.

On the other hand, when developing with feature branches, where one feature after another is integrated on the integration branch (in this example, master), the result on the integration branch is a sequence of merge commits (see Figure 7.7). The first parent is always the merge commit of the previous feature.

Figure 7.7: First-parent history

If you follow the first-parent chain all the way down to the root, you will get an overview of the feature integrations. This sequence is called the first-parent history. You can use the --first-parent option of the log command to display the feature integrations:

> git log --first-parent --oneline R1.0..master 
7f3eae0 Merge branch 'Feature-C' Finished (M4) 
ed1c70e Merge branch 'Feature-A' Finished (M3) 
eeb6ec2 Merge branch 'Feature-B' Finished (M2) 
8ce3213 Merge branch 'Feature-A' Partial delivery (M1)

The beauty of the first-parent history is that it provides a summarized presentation of the history. You can see what features have been integrated, without having to examine every single commit of the feature branches.

Attention! This only works if you do not perform any fast-forward merges on the integration branch. Otherwise, individual commits of the feature branches would be placed directly in the first-parent history of the master.

Attention! In addition, you should not perform internal merges on the integration branch (here: master). Instead, make sure that the features are all integrated in succession, so that you get a linear history of feature merges.
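
The summarizing effect can be measured in a small experiment. The sketch below integrates two hypothetical feature branches with --no-ff and then compares the total number of commits with the length of the first-parent history. The branch names, file names and the user identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt && git commit -q -m "base"

# Each feature gets two commits and is then integrated with --no-ff.
for f in Feature-A Feature-B; do
  git checkout -q -b "$f" master
  echo "$f step 1" > "$f.txt"
  git add "$f.txt" && git commit -q -m "$f: step 1"
  echo "$f step 2" >> "$f.txt"
  git commit -q -am "$f: step 2"
  git checkout -q master
  git merge -q --no-ff -m "Merge branch '$f'" "$f"
done

all_commits=$(git rev-list --count HEAD)
first_parent=$(git rev-list --count --first-parent HEAD)

git log --first-parent --oneline
echo "all commits: $all_commits, first-parent history: $first_parent"
```

The full history contains seven commits, whereas the first-parent history consists only of the two merge commits and the initial commit.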

Tricky Merge Conflicts

Most Git merges are done automatically with little or no manual assistance. If two branches have diverged considerably, however, tricky conflicts may arise.

In this section we only talk about merges involving two branches. If you encounter a problem related to an octopus merge, you should cancel the merge and try to address the issue branch by branch.

The important thing is to start by collecting information to understand what has happened on the branches. Here, the .. notation in the log command can help. For example, a..b denotes the commits on branch b that are not in branch a. The following command shows what “we” have done on the current branch that is not contained in the other branch.

> git log MERGE_HEAD..HEAD 

Conversely, you can display what “the others” have done.

> git log HEAD..MERGE_HEAD

A graphical representation of branching can also be useful.

> git log --graph --oneline --decorate HEAD MERGE_HEAD

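You can restrict the output of the log command to the commits that touch the conflicting files using the --merge option. (Do not confuse it with --merges, which restricts the output to merge commits.)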

> git log --merge 

Also, comparing the original version with the tips of the branches can be useful. This requires the merge base, i.e. the common ancestor of the branches in the merge.

> git merge-base HEAD MERGE_HEAD 
ed3b1832c48b359111d00bddb071c42ba6f38324 
> git diff --stat ed3b18 HEAD         # Our changes
> git diff --stat ed3b18 MERGE_HEAD   # Changes by others

You can also use the difftool command if you want to have a graphical tool instead of a text output.
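
The commands of this section can be combined into a small analysis script. The sketch below sets up a conflicting merge in a throwaway repository and then collects the merge base, the number of commits on each side and the commits that touch the conflicting file. The file names, contents and the user identity are assumptions.

```shell
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

printf 'to mountains\n' > foo.txt
git add foo.txt && git commit -q -m "base"
root=$(git rev-parse HEAD)

# The other branch: one unrelated commit and one that changes foo.txt.
git checkout -q -b one-branch
printf 'notes\n' > other.txt
git add other.txt && git commit -q -m "one-branch: add other.txt"
printf 'for swimming\n' > foo.txt
git commit -q -am "one-branch: change foo"

# Our branch: one commit that changes foo.txt differently.
git checkout -q master
printf 'to the valley\n' > foo.txt
git commit -q -am "master: change foo"

git merge one-branch > /dev/null 2>&1   # stops with a conflict in foo.txt

base=$(git merge-base HEAD MERGE_HEAD)            # last common ancestor
ours=$(git rev-list --count MERGE_HEAD..HEAD)     # what "we" have done
theirs=$(git rev-list --count HEAD..MERGE_HEAD)   # what "the others" have done

# Only commits from either side that touch the conflicting file.
conflict_commits=$(git log --merge --oneline | wc -l | tr -d ' ')

git diff --stat "$base" HEAD         # our changes since the merge base
git diff --stat "$base" MERGE_HEAD   # changes by others since the merge base
echo "ours: $ours, theirs: $theirs, touching the conflict: $conflict_commits"
```

In this setup the other branch contributes two commits, but only one of them touches foo.txt, so the output of log --merge is shorter than the full range HEAD..MERGE_HEAD.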

Now you can see which developers are involved in the conflict. It is best if you can bring everyone to the table, so that everyone can ensure that his or her changes are correctly included in the merge.

It will be harder if the others are not available, because you are usually not familiar with the other branch. Technically speaking, a merge is a symmetric operation. In our minds, however, we often have an asymmetric view and ask: “How do I include other people’s code in my own code?” Sometimes it helps to reverse the question: take the other version as the starting point and figure out how you would integrate your changes there.

Regardless, Somehow It Will Work

Pressed for time, you may be tempted to simply pick one or the other variant of the code shown by the merge tool. You should resist this temptation. If, after analyzing diffs and logs and consulting the “others”, you are still unsure how to resolve the conflict, you should cancel the merge. A few possible strategies are then:

  • Restructuring the branch: The cleanest solution probably consists of cleaning up one of the branches by refactoring and interactive rebasing. However, this is a lot of work.
  • Merging in small steps: If one of the two branches consists of fine-grained commits, you can merge it one commit at a time. The advantage of this approach is that conflicts are usually easier to resolve with smaller commits. It can be time-consuming if the number of commits is high, though. In any case, it is recommended that you create a local branch for this.
  • Discarding and cherry-picking: In some cases it is better to discard the changes of the weaker branch altogether. Individual improvements can still be carried over with the cherry-pick command.
  • Guessing and testing: If the affected functionality is well covered by tests, you can of course try to guess when resolving a conflict and then refine the result until all tests pass.

Summary

  • Merge: A merge is the merging of branches in the commit graph.
  • Merge commit: The result of the merge command is a merge commit.
  • 3-way merge: Git uses the commit graph to find the last common ancestor when merging. Git then takes the changes that have taken place on a branch since the ancestor, along with the changes that have been made on the other branch. As long as the changes are happening at different code points, Git will be able to create a merge commit automatically.
  • Conflict: A point in the code where Git cannot automatically merge, perhaps because the same line has been changed differently, is called a conflict.
  • Content conflict: Changes often take place at different locations, but still the contents do not match. Git cannot detect such content conflicts. A project should take its own precautions, such as automated testing, in order to protect itself from content conflicts.
  • Fast-forward merge: It is quite common that one of the branches in a merge is an ancestor of the other branch. In this case, Git simply moves the branch pointer forward. No merge commit is necessary.