The human gatekeeper workflow

This is one of the most common distributed workflows. In this workflow, collaborators have read-only access to the mainline branch, and can propose their own branches for merging into the mainline. The maintainer of the mainline is the gatekeeper, who reviews merge proposals and either accepts and merges the branch into the mainline or rejects the proposal with comments.

If a branch was rejected, its author can fix the problems and commit them in the same branch, and propose it again for merge. This cycle can continue for as long as necessary, until finally the branch can be accepted and merged into the mainline.

Overview

The general flow with two collaborators and a gatekeeper looks similar to the following:

[Figure: overview of the workflow with two collaborators and a gatekeeper]

The important points to be noted are as follows:

  • Collaborators do not have write access directly on the mainline branch
  • Collaborators can propose their branches for merging to the gatekeeper
  • It is up to the gatekeeper whether to accept or reject a merge proposal

Just as Jack and Mike branched from the mainline and worked in their local branches, so does the gatekeeper. In this example, the gatekeeper first accepted and merged Mike's branch, and then accepted and merged Jack's branch. At this point, the gatekeeper can push the resulting branch to the mainline, resulting in the following:

[Figure: the mainline after the gatekeeper pushes the merged branches]

At this point, since the mainline contains all the revisions of Mike and Jack, they can update their local branches by using bzr pull to make them exact mirrors of the mainline. Had they added new revisions after the time their branches were merged, bzr pull would not have worked, as the branches would have diverged.
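As a minimal sketch of this step, the helper below wraps bzr pull and reports whether the local branch could be turned into a mirror. The function name, its arguments, and the messages are illustrative assumptions, not part of Bazaar.

```shell
# Hypothetical helper (not part of Bazaar): mirror the mainline into a
# local branch after the gatekeeper has merged it.
# $1 = path of the local branch, $2 = location of the mainline.
mirror_mainline() (
    cd "$1" || exit 1
    # bzr pull fails when the branches have diverged, that is, when the
    # local branch has revisions that the mainline does not contain.
    if bzr pull "$2"; then
        echo "branch is now a mirror of $2"
    else
        echo "pull failed; the branches may have diverged" >&2
        exit 1
    fi
)
```

The body runs in a subshell, so the cd does not change the directory of the calling shell.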

Note

Ideally, the proposed branch should be a feature branch, with revisions that implement a single feature, bugfix, or specific improvement. To work on another feature, it is better to start afresh by creating a new branch from the current mainline rather than reusing the same one.

Setting guidelines to accept merge proposals

To reduce the turnaround of merge proposals and rejections, it is a good idea to keep a public list of the guidelines used when evaluating merge proposals. Collaborators then know in advance what to watch out for and can avoid common pitfalls, which increases the chance that their merge proposals will be accepted and makes the review process smoother and more efficient. The guidelines can be general practices such as:

  • The changes should not break anything
  • The changes should pass automated tests, such as unit tests or schema validation
  • The changes should conform to the general best practices of the relevant domain
  • The changes should be about a single feature, bugfix, or some specific improvement
  • The changes should be in line with the project strategy, not deviating from the main direction

Since the gatekeeper is a human, inevitably there may be some subjective criteria such as:

  • Coding style (in software development projects)
  • Writing style (in professional writing or translation)
  • The changes should be "readable"; the gatekeeper may reject anything that is not clear to understand

The gatekeeper's job is easiest if the branch proposed for merge passes all the guidelines and automated tests. In that case, it can be simply accepted and merged into the mainline.

If the proposed branch does not meet some of the guidelines, it is best to reject the proposal with appropriate comments listing what to fix. The gatekeeper should reject a branch as many times as necessary, until it meets all the guidelines.

If the proposed branch does not meet all the guidelines but represents a significant improvement, it might be tempting for the gatekeeper to fix the branch personally so that it passes the guidelines. In the short term, the improvement gets into the mainline faster, because the turnaround time between the gatekeeper and the collaborator is skipped. In the long term, however, this may well end up slower and more costly for the gatekeeper: if the gatekeeper does not consistently reject branches that violate the guidelines, the collaborator may never learn to play by the rules, and the gatekeeper will have to keep fixing the same mistakes over and over again.

The guidelines should be well understood by all the collaborators of the project to avoid frustration and unnecessary turnarounds. Collaborators who are too lazy to read and understand all the guidelines will eventually learn them after their merge proposals are rejected a few times.

The longer or more rigorous the list of criteria, the more difficult it is to join the project. This can be a very important point in open source software projects. In a new project, you probably don't want to impose too many rules at first, as that can discourage early contributors.

On the other hand, if there are not enough guidelines, it is likely to result either in a lot of rejected proposals, or a lot of extra work for the gatekeeper. The right balance depends upon the project and the team.

The role of the gatekeeper

At the minimum, the gatekeeper should enforce the common guidelines and best practices of the project, thereby ensuring continued high quality.

Merge proposals should not be accepted blindly, even if they have passed the common guidelines. It is crucial that the gatekeeper fully understands the changes introduced by a merge proposal, and its overall impact on the project. Therefore, naturally, the gatekeeper should be somebody with a firm grasp of the entire project and its future direction.

In addition to knowing the project through and through, the gatekeeper role typically involves a lot of interaction with the authors of merge proposals, for clarification or discussion on the new changes. Therefore, communication skills are also very important.

Creating a merge proposal

In order to propose a branch for merging, the collaborator must make the branch available (visible) to the gatekeeper in some way. There are several ways to do that:

  • Using a Bazaar hosting site
  • Sharing the branch URL with the gatekeeper
  • Creating and sending a merge directive

Using a Bazaar hosting site

Ideally, projects should use a Bazaar hosting site such as Launchpad.net, where members of the project can have their own workspaces to share branches with the gatekeeper and other collaborators. Using such a site can greatly simplify the process of submitting and evaluating merge proposals.

Launchpad is a collaboration and hosting platform for software projects. The merge proposal process works as follows:

  1. Upload the completed feature branch to your account on Launchpad using a push operation
  2. Use the web interface to visit your branch and propose it for merging into another branch, along with a description and other options that may help the gatekeeper and other reviewers. The merge proposal triggers an e-mail notification to the gatekeeper to bring attention to the new branch ready for merging

A Bazaar hosting site, such as Launchpad, can be very useful as the central hub of a project, where collaborators can find the mainline branches, or push their own branches to propose for merging. Launchpad has many other useful features, which we will cover in the next chapter.
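The first step of the process above, uploading the branch, can be sketched as follows. The function name and its three arguments are hypothetical placeholders. The location follows the lp:~USER/PROJECT/BRANCH form of the example branches used in this chapter. The second step, proposing the branch for merging, is done in the web interface and is not covered by the sketch.

```shell
# Hypothetical helper: publish the current branch to a Launchpad account.
# $1 = Launchpad user name, $2 = project name, $3 = branch name
# (all three are placeholders for your own values).
publish_to_launchpad() {
    # Builds a location of the form lp:~USER/PROJECT/BRANCH and pushes to it.
    bzr push "lp:~$1/$2/$3"
}
```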

Sharing the branch URL with the gatekeeper

The gatekeeper can review and merge from any branch that is accessible by a protocol supported by Bazaar. For example, the collaborator can publish a branch with bzr push to a website, FTP server, remote filesystem, SSH server, or anywhere that is accessible by the gatekeeper.

After making a branch available, the collaborator should send the URL of the branch to the gatekeeper, along with a brief summary of the changes.

For example, if you run the website http://example.com/, and the files of the website are served from the directory /var/www/example.com/, which you can access using SSH, then you can push your branch with the following command (USER stands for your SSH user name on the server):

$ bzr push bzr+ssh://USER@example.com/var/www/example.com/feat12

As a result, the branch will become visible at the URL http://example.com/feat12, and the gatekeeper can run Bazaar commands to inspect it and merge from it. For example:

$ cd /path/to/local/shared/repository
$ bzr branch http://example.com/feat12
$ bzr info feat12
$ bzr qlog feat12
$ cd mainline
$ bzr merge ../feat12

Sending a merge directive

If it is not possible to make a branch accessible to the gatekeeper via a URL, the best alternative is to generate a merge directive and send it by e-mail.

A merge directive is like a "mini-branch" packaged into a single file, which can be applied to other branches by using bzr merge or bzr pull. A merge directive contains only the necessary revisions to merge from a source branch to a submit branch. By default, the source branch is the current branch, and the submit branch is either a previously saved submit branch or the parent branch.

To demonstrate the use of merge directives, let's fetch two sample branches into a shared repository:

$ cd /sandbox
$ bzr init-repo using-merge-directives
Shared repository with trees (format: 2a)
Location:
  shared repository: using-merge-directives
$ bzr branch lp:~bzrbook/bzrbook-examples/hello-start trunk
Branched 6 revisions.
$ bzr branch lp:~bzrbook/bzrbook-examples/hello-fix-c fix-c
Branched 8 revisions.

Now, we have two branches—the trunk, and a feature branch that fixes a bug. Imagine that you have fixed a bug in your local branch, but you have no way to give access to this branch to the gatekeeper. In such a situation, your next best option is to create a merge directive and e-mail it to the gatekeeper.

Creating a merge directive

You can create a merge directive by using the bzr send command and specifying the destination branch, where the merge should be applied. You must specify either an e-mail address with the --mail-to option or a filename with the --output or -o option. If you specify an e-mail address, Bazaar will open the default e-mail application, pre-filled with the content of the merge directive. Alternatively, you can save the merge directive to a file and e-mail it later.
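The two ways of delivering a merge directive can be sketched in one small helper. The function name is hypothetical, and treating any argument that contains "@" as an e-mail address is a simplifying assumption of this sketch.

```shell
# Hypothetical helper: create a merge directive against a submit branch.
# $1 = submit (destination) branch
# $2 = an e-mail address or a file name; anything containing "@" is
#      treated as an address (an assumption of this sketch).
send_directive() {
    case "$2" in
        *@*) bzr send "$1" --mail-to "$2" ;;   # opens the e-mail client
        *)   bzr send "$1" --output "$2" ;;    # saves the directive to a file
    esac
}
```

For example, calling send_directive ../trunk fix-c.patch would save the directive to a file, while passing an address instead of the file name would hand it to the e-mail client.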

For example, we can create a merge directive from the example fix-c branch to the trunk, as follows:

$ cd /sandbox/using-merge-directives/fix-c
$ bzr send --output -
Bundling 2 revisions.
# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: janos@axiom-20130303203100-3uy33a4q96ux5u9c
# target_branch: ../trunk/
# testament_sha1: 1686e71d4453af6b4b086831179bf55faac7729b
# timestamp: 2013-04-04 06:20:56 +0200
#   examples/hello-start
# base_revision_id: janos@axiom-20130303141948-m5zhycy23bkvs2xv
#
# Begin patch
=== modified file 'hello.c'
--- hello.c     2013-03-03 14:14:35 +0000
+++ hello.c     2013-03-03 20:31:00 +0000
@@ -1,5 +1,5 @@
-#include "stdio.h"
+#include <stdio.h>
  
 int main() {
-    printf("Hello World!");
+    printf("Hello World!\n");
 }
# Begin bundle
IyBCYXphYXIgcmV2aXNpb24gYnVuZGxlIHY0CiMKQlpoOTFBWSZTWcYlJSIAApdfgEAQeGP//1LQ
...

The merge directive starts with a header, with important parameters describing the mini-branch, such as the storage format used by the revisions, the latest revision ID, and the base revision ID.

By default, the merge directive includes an optional patch, which can be helpful especially when the changes are small, like in this example, so that the recipient of the merge directive can get a quick idea of the changes just by reading the e-mail. With larger changes, this might not be all that useful as it is easier to read large changes using Bazaar Explorer's Diff view. In this case, it may be better to completely omit the patch using the --no-patch flag.
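One possible way to apply this rule of thumb is to omit the patch only when the diff is long. The helper below is a rough sketch under stated assumptions: the function name and the line threshold are hypothetical, and the size of the change is estimated with bzr diff --old, which compares the current working tree against the submit branch.

```shell
# Hypothetical helper: leave out the patch when the change is large.
# $1 = submit branch, $2 = output file, $3 = maximum number of diff lines
send_sized() {
    # Estimate the size of the change; the arithmetic expansion strips
    # any padding that wc adds around the number.
    diff_lines=$(( $(bzr diff --old "$1" | wc -l) ))
    if [ "$diff_lines" -gt "$3" ]; then
        bzr send "$1" --no-patch --output "$2"
    else
        bzr send "$1" --output "$2"
    fi
}
```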

When using the --mail-to option to e-mail the merge directive instead of saving it in a file, Bazaar will launch the e-mail client configured in the global mail_client setting. You can change this setting by using Bazaar Explorer, from the menu option Setting | Configuration | User Configuration or by launching bzr qconfig, or by editing the bazaar.conf file in your Bazaar configuration directory. The "default" value in this setting means Bazaar will use the preferred e-mail client configured in your system.

Merging from a merge directive

A merge directive can be used in the bzr merge and bzr pull operations as if it were a regular branch. To demonstrate this, let's create a merge directive from the fix-c branch to the trunk, and then try to merge it in the trunk:

$ cd /sandbox/using-merge-directives/fix-c
$ bzr send -o ../merge-directive.out ../trunk
$ cd ../trunk
$ bzr merge ../merge-directive.out
 M  hello.c
All changes applied successfully.
$ bzr diff
=== modified file 'hello.c'
--- hello.c     2013-03-03 14:14:35 +0000
+++ hello.c     2013-04-08 05:04:26 +0000
@@ -1,5 +1,5 @@
-#include "stdio.h"
+#include <stdio.h>
  
 int main() {
-    printf("Hello World!");
+    printf("Hello World!\n");
 }
$ bzr status -v
modified:
  hello.c
pending merges:
  Janos Gyerik 2013-03-03 use more modern include-style
    Janos Gyerik 2013-03-03 c impl should add newline

The result is exactly the same as when merging from a real branch—changes are applied, and the revision history will be correctly preserved.

Merge directives without revision content

If the source branch is visible by a public URL, or if it has a public mirror, then it can be a good idea to omit the bundle from the merge directive in order to make it lighter, since in this case, the recipient can find the revisions in the public URL. For this to work, the public URL of the source branch must be specified on the command line or in the branch configuration file .bzr/branch/branch.conf with the public_branch setting. Use the --no-bundle flag to create a merge directive without a bundle. For example:

$ cd /sandbox/using-merge-directives/fix-c
$ bzr send ../trunk/ -o- --no-bundle --no-patch lp:~bzrbook/bzrbook-examples/hello-fix-c
# Bazaar merge directive format 2 (Bazaar 0.90)                                
# revision_id: janos@axiom-20130303203100-3uy33a4q96ux5u9c
# target_branch: ../trunk/
# testament_sha1: 1686e71d4453af6b4b086831179bf55faac7729b
# timestamp: 2013-04-08 06:45:31 +0200
# source_branch: lp:~bzrbook/bzrbook-examples/hello-fix-c
# base_revision_id: janos@axiom-20130303141948-m5zhycy23bkvs2xv
#

In this case, the merge directive file is much smaller, and instead of a bundle at the end, the public URL of the branch is included in the header as source_branch. When running this command, Bazaar verifies that the public URL is indeed a Bazaar branch and that it contains the latest revision of the current branch; otherwise, the recipient would not be able to perform the merge.

Rejecting a merge proposal

The gatekeeper should carefully verify a merge proposal before accepting it, and put it through various tests. For example:

  • Try to merge from the branch and see if there are any conflicts. This could be a warning sign, though it may not necessarily mean that the author did something wrong.
  • Verify that the project is still working well after the merge.
  • Run automated or manual non-regression tests.
  • Look for inefficiencies that may cause problems and should be improved before merging.
  • Verify that the general guidelines of the project are followed correctly.
  • Needless to say, the changes should be in line with the long-term strategy of the project.

If there are any problems at any step, the gatekeeper may need assistance from the author to complete the merge. In order to ensure the continued high quality of the project, the gatekeeper must be wise, and should reject merge proposals that are not good enough.
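Some of the checks listed above can be combined into a trial merge. The following is a minimal sketch, assuming that it is run inside a clean local mirror of the mainline and that the project's test command exits with a non-zero status on failure. The function name and the accept/reject verdicts are hypothetical.

```shell
# Hypothetical helper: trial-merge a proposed branch, run the project's
# tests, and always restore the working tree afterwards.
# $1 = location of the proposed branch
# remaining arguments = the test command to run (may be omitted)
trial_merge() {
    proposed=$1
    shift
    verdict=accept
    # bzr merge exits with a non-zero status when there are conflicts.
    bzr merge "$proposed" || verdict=reject
    if [ "$verdict" = accept ]; then
        # Run the test command only when the merge itself was clean.
        "$@" || verdict=reject
    fi
    # Discard the uncommitted merge so that the mirror stays clean.
    bzr revert --no-backup
    echo "$verdict"
    [ "$verdict" = accept ]
}
```

A verdict of reject only means that the branch needs a closer look; conflicts alone do not prove that the author did something wrong.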

When rejecting a merge proposal, the gatekeeper should explain to the author which improvements are necessary to get the branch accepted. The author should continue working on the branch and commit more revisions that fix the issues pointed out by the gatekeeper. When ready, the author should propose the branch for merging again.

This cycle should continue as long as necessary, until the branch is approved by the gatekeeper. It is not fun for either party. Evaluating branches that have obvious problems that could have been avoided by following the guidelines is a waste of time for the gatekeeper, while getting rejected is frustrating for the branch author. It is important to remain patient, tolerant, and respectful during the process.

Although the gatekeeper can fix some problems directly, it is better to let the branch contributor do it, so that the contributor learns and stops making the same mistakes in the future.

By rejecting merge proposals, the gatekeeper has the power to enforce the best practices documented in the project, even if some collaborators are reluctant to follow them.

Accepting a merge proposal

As always, when merging from a remote branch, it is a good idea to first fetch the remote branch, ideally into a shared repository. For example:

$ cd /path/to/shared/repo
$ bzr branch BRANCH_URL

In this way, you can run various commands to inspect the branch without unnecessarily paying the network overhead in each operation. For example:

$ cd the_branch
$ bzr info
$ bzr qlog
$ bzr missing ../mainline
$ bzr diff --old ../mainline

If you notice issues with the branch at this point, you can point them out to the branch author and ask for more work on the branch. After the author updates the branch, you can run bzr pull to bring your local mirror up to date.

If the branch passes the initial tests, try to merge it into your local mirror of the mainline, after making sure that the mirror is clean and up-to-date. For example:

$ cd ../mainline
$ bzr status
$ bzr pull
$ bzr merge ../the_branch

If the merge results in conflicts, this may be a warning sign, but it does not necessarily mean that the branch author is at fault. Investigate and, if necessary, ask the branch author to help resolve the conflicts. You can also redo the merge with a different algorithm by using bzr remerge, or abort the merge completely with bzr revert.
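The two ways out of a conflicted merge can be sketched as follows. The function name and its retry/abort keywords are hypothetical. The weave merge type is only one of the alternative algorithms, chosen here for illustration.

```shell
# Hypothetical helper for a working tree with a conflicted, uncommitted merge.
# $1 = "retry" or "abort"; $2 = the conflicted file (for retry only)
handle_conflicts() {
    case "$1" in
        retry)
            # Redo the merge of one file with the weave algorithm.
            bzr remerge --merge-type weave "$2"
            ;;
        abort)
            # Throw away the uncommitted merge entirely.
            bzr revert
            ;;
        *)
            echo "usage: handle_conflicts retry FILE | abort" >&2
            return 2
            ;;
    esac
}
```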

After all conflicts are resolved, make sure to understand the meaning of the changes and verify carefully that the project is still working well, running automated or manual non-regression tests, and validating the common guidelines of the project.

If everything is in order, commit the merge with a short summary of the changes made in the branch, and push it to the central server to make it available to other team members:

$ bzr commit -m 'implemented feature X'
$ bzr push

Note

If doing bzr push for the first time, you may have to specify the parent location with bzr push :parent.

The gatekeeper must be wise and responsible, and therefore very careful when accepting changes in order to ensure the continued high quality of the project.

When working with a user-friendly Bazaar hosting site, such as Launchpad, the bzr push step should trigger an automatic e-mail to notify the author that the merge proposal was accepted and the branch was successfully merged.

Reusing a branch

Whenever possible, it is best to create a clean new branch from the mainline for each new feature, bugfix, or other specific improvement. When a feature is complete, propose the branch for merging and start a completely new branch from the latest version of the mainline in order to work on the next improvement.

However, sometimes this may not be practical, and it may be tempting to continue working in the same branch, even after it has already been merged into the mainline; for example, in situations similar to the following:

  • There are many configuration files in the working tree that are required when working on the project, but cannot be added to version control because they are specific to the local working environment of each collaborator
  • After the branch was proposed for merge and while waiting for the gatekeeper to accept or reject, you need to start working on the next feature that depends on the changes in the pending merge proposal
  • The working tree is quite large, and thus keeping multiple working trees will be a waste of disk space

The cleanest way of reusing a branch is to wait until the merge proposal is accepted and merged into the mainline, then synchronize the local branch with the mainline using a pull operation. The pull operation will copy all the missing revisions and convert the branch to a perfect mirror of the mainline, and you may continue to work on the next improvement or bugfix.

Note

Another way to re-use a branch is to merge from the mainline, but this leads to a messy history that's difficult to read, and the gatekeeper may also have issues with the criss-cross merges when the mainline and a collaborator branch are merged into each other repeatedly.

This effectively means working on the branch in lockstep with the gatekeeper:

  1. You begin new work from a clean state, synchronized with the mainline.
  2. When your improvement is completed, submit a merge request and wait for the gatekeeper to review the merge proposal and take action.
  3. The gatekeeper may reject the proposal and ask you to improve the branch.
  4. After the merge proposal is accepted and the branch is merged into the mainline, you can pull from the mainline to return the branch to the clean, synchronized state, and begin working on the next improvement.
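The last step can be guarded with a quick check that nothing was committed since the proposal. The following is a minimal sketch, assuming that bzr missing exits with a non-zero status when it finds revisions. The function name and the message are hypothetical.

```shell
# Hypothetical helper: return a reused branch to the clean, synchronized
# state. Run it inside the local branch.
# $1 = location of the mainline
resync_after_merge() {
    # With --mine-only, bzr missing looks only for local revisions that
    # the mainline lacks (assumption: non-zero exit status if any exist).
    if bzr missing --mine-only "$1" >/dev/null; then
        bzr pull "$1"
    else
        echo "local revisions are not in the mainline yet; not pulling" >&2
        return 1
    fi
}
```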

This is a clean way of reusing a branch for multiple improvements. Its limitation is that you have to work on improvements one by one and refrain from committing new revisions while a merge proposal is still pending, in practice working on the branch in lockstep with the gatekeeper.

It is also possible to reuse a working tree for multiple branches by using lightweight checkouts and switching branches. This is an advanced setup, which will be explained in Chapter 8, Using Advanced Features of Bazaar.
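As a brief preview, and only as a sketch, the two commands involved are bzr checkout --lightweight and bzr switch. The helper names and the directory layout in the comments are hypothetical.

```shell
# Hypothetical helpers: one working tree shared by several branches.

# Create a working tree that is a lightweight checkout of a branch.
# $1 = branch location, $2 = new working tree directory
make_shared_tree() {
    bzr checkout --lightweight "$1" "$2"
}

# Point an existing lightweight checkout at another branch.
# $1 = working tree directory, $2 = branch to switch to
switch_tree() {
    (cd "$1" && bzr switch "$2")
}
```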

Commander/Lieutenant model

As the project grows, it may become increasingly difficult for the gatekeeper to oversee all the changes going into the different parts of the project. When the project reaches a point where the gatekeeper's job becomes impossible, the workflow can be scaled up by adding more gatekeepers, and splitting their responsibilities over different parts or modules of the project.

In very large projects, there can be several gatekeepers who oversee different parts of the project. This is often called the Commander/Lieutenant or Dictator/Lieutenant model. In this model, there are two levels of gatekeepers—Lieutenants review the merge proposals of the collaborators within their defined perimeters, but instead of merging collaborator branches into the mainline, they merge them into their own branches. The Commander works mostly with Lieutenants, reviewing their merge proposals and merging them into the mainline. In other words, the Commander is the gatekeeper of Lieutenants.

At the level of the Commander, it may be practically impossible to understand in depth all the individual changes going into the project. Instead, the Commander must focus on the higher-level logic of the proposed changes, and trust the Lieutenants' judgment on lower-level details.
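At both levels of this model the mechanics are the same merge and commit, applied to different branches. The sketch below assumes a hypothetical layout in which all branches are sibling directories; the function name and the branch names in the usage note are illustrative only.

```shell
# Hypothetical helper used at both levels of the hierarchy.
# $1 = integration branch (a directory with a working tree)
# $2 = branch to merge, relative to $1
# $3 = commit message
integrate() (
    cd "$1" || exit 1
    # Commit only if the merge completed without conflicts.
    bzr merge "$2" && bzr commit -m "$3"
)
```

A Lieutenant might run integrate lieutenant-ui ../jack-feat "merged Jack's feature", and the Commander might later run integrate mainline ../lieutenant-ui "merged UI work". The review that precedes each merge is not shown.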

Switching from the peer-to-peer workflow

Switching from the peer-to-peer workflow to the human gatekeeper workflow requires the following changes in the working style:

  • Dedicate a mainline branch that is only updated by the gatekeeper
  • The new work should start from the mainline branch, not from the branch of another collaborator
  • Collaborators should avoid merging from each other directly
  • Collaborators should avoid re-using the same branch for multiple features, and always start the new work in a clean, new branch based upon the mainline branch
  • The mainline branch should have mostly merge commits only, no other changes
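The rule of starting every piece of work in a fresh branch of the mainline can be sketched as a small helper. The function name and the check for an existing directory are assumptions of this sketch.

```shell
# Hypothetical helper: start each new piece of work in a clean, new
# branch based upon the mainline.
# $1 = mainline location, $2 = directory of the new branch
start_feature() {
    if [ -e "$2" ]; then
        # Refuse to reuse an existing branch directory.
        echo "$2 already exists; choose a new branch name" >&2
        return 1
    fi
    bzr branch "$1" "$2"
}
```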

If the peer-to-peer example at the beginning of the chapter had been using the human gatekeeper workflow, the revision graph would have become something similar to the following:

[Figure: revision graph of the peer-to-peer example under the human gatekeeper workflow]

We can arrive at this graph by replacing the revisions that were merges from collaboratorB to collaboratorA with a merge from collaboratorB to the mainline, followed by a new branch from mainline to collaboratorA. Bazaar Explorer does a much better job at rendering such graphs:

[Figure: the same revision graph rendered by Bazaar Explorer]

This is not the best example, because too many of the branches here contain only a single revision. In reality, feature branches often have several revisions, and grouping them together gives a very useful, high-level overview of the larger steps in the evolution of the project.
