Commit History

Viewing Old Commits

The primary command to show the history of commits is git log. It has more options, parameters, bells, whistles, colorizers, selectors, formatters, and doodads than the fabled ls. But don’t worry. Just as with ls, you don’t need to learn all the details right away.

In its parameterless form, git log acts like git log HEAD, printing the log message associated with every commit in your history that is reachable from HEAD. Changes are shown starting with the HEAD commit and working back through the graph. They are likely to be in roughly reverse chronological order, but recall that Git follows the commit graph, not time, when traveling back over the history.

If you supply a commit à la git log commit, the log starts at the named commit and works backward. This form of the command is useful for viewing the history of a branch:

$ git log master

commit 1fbb58b4153e90eda08c2b022ee32d90729582e6
Merge: 58949bb... 76bb40c...
Author: Junio C Hamano <[email protected]>
Date:   Thu May 15 01:31:15 2008 -0700

    Merge git://repo.or.cz/git-gui

    * git://repo.or.cz/git-gui:
      git-gui: Delete branches with 'git branch -D' to clear config
      git-gui: Setup branch.remote,merge for shorthand git-pull
      git-gui: Update German translation
      git-gui: Don't use '$$cr master' with aspell earlier than 0.60
      git-gui: Report less precise object estimates for database compression

commit 58949bb18a1610d109e64e997c41696e0dfe97c3
Author: Chris Frey <[email protected]>
Date:   Wed May 14 19:22:18 2008 -0400

    Documentation/git-prune.txt: document unpacked logic

    Clarifies the git-prune manpage, documenting that it only
    prunes unpacked objects.

    Signed-off-by: Chris Frey <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>

commit c7ea453618e41e05a06f05e3ab63d555d0ddd7d9

...

The logs are authoritative, but rolling back through the entire commit history of your repository is likely not very practical or meaningful. Typically, a limited history is more informative. One technique to constrain history is to specify a commit range using the form since..until. Given a range, git log shows all commits following since, up to and including until; the since commit itself is not shown. Here’s an example:

$ git log --pretty=short --abbrev-commit master~12..master~10

commit 6d9878c...
Author: Jeff King <[email protected]>

    clone: bsd shell portability fix

commit 30684df...
Author: Jeff King <[email protected]>

    t5000: tar portability fix

Here, git log shows the commits between master~12 and master~10, or the 10th and 11th prior commits on the master branch. You’ll see more about ranges in Commit Ranges.

The previous example also introduces two formatting options, --pretty=short and --abbrev-commit. The former flag adjusts the amount of information about each commit and has several variations, including oneline, short, and full. The latter simply requests that hash IDs be abbreviated.

-p prints the patch, or changes, introduced by the commit:

$ git log -1 -p 4fe86488

commit 4fe86488e1a550aa058c081c7e67644dd0f7c98e
Author: Jon Loeliger <[email protected]>
Date:   Wed Apr 23 16:14:30 2008 -0500

    Add otherwise missing --strict option to unpack-objects summary.

    Signed-off-by: Jon Loeliger <[email protected]>
    Signed-off-by: Junio C Hamano <[email protected]>

diff --git a/Documentation/git-unpack-objects.txt b/Documentation/git-unpack-objects.txt
index 3697896..50947c5 100644
--- a/Documentation/git-unpack-objects.txt
+++ b/Documentation/git-unpack-objects.txt
@@ -8,7 +8,7 @@ git-unpack-objects - Unpack objects from a packed archive

 SYNOPSIS
 --------
-'git-unpack-objects' [-n] [-q] [-r] <pack-file
+'git-unpack-objects' [-n] [-q] [-r] [--strict] <pack-file

Notice the option -1 as well: it restricts the output to a single commit. You can also type -n to limit the output to at most n commits.
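To try the count limit and the formatting options together, here is a minimal sketch that runs as a script in a throwaway repository. The directory, the Demo identity, and the four commit messages are all hypothetical; only the git log options come from the text above.

```shell
set -e
# Scratch repository; every name below is made up for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for msg in first second third fourth; do
    git commit -q --allow-empty -m "$msg"
done

# -2 stops after two commits; oneline and --abbrev-commit shorten each entry.
git log -2 --pretty=oneline --abbrev-commit

# Keep only the subject lines so the result is easy to inspect.
newest_two=$(echo $(git log -2 --format=%s))
echo "$newest_two"
```

Because the history is linear, the two newest commits are listed newest first.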

The --stat option enumerates the files changed in a commit and tallies how many lines were modified in each file:

$ git log --pretty=short --stat master~12..master~10

commit 6d9878cc60ba97fc99aa92f40535644938cad907
Author: Jeff King <[email protected]>

    clone: bsd shell portability fix

 git-clone.sh |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

commit 30684dfaf8cf96e5afc01668acc01acc0ade59db
Author: Jeff King <[email protected]>

    t5000: tar portability fix

 t/t5000-tar-tree.sh |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

Tip

Compare the output of git log --stat with the output of git diff --stat. There is a fundamental difference in what’s displayed. The former produces a summary for each individual commit named in the range, whereas the latter prints a single summary of the total difference between two repository states named on the command line.
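The difference is easy to see in a scratch repository. The following sketch, meant to be run as a script, assumes a hypothetical file named notes.txt that grows by one line per commit; it counts how many stat entries each command prints for the same two commits.

```shell
set -e
# Scratch repository; the file name and commit messages are hypothetical.
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
for n in 1 2 3; do
    echo "line $n" >> notes.txt
    git add notes.txt
    git commit -q -m "add line $n"
done

# git log --stat: one summary for each commit in the range (two commits).
git log --stat --pretty=short HEAD~2..HEAD
per_commit=$(git log --stat --format=%s HEAD~2..HEAD | grep -c 'notes.txt')

# git diff --stat: a single summary of the total difference.
git diff --stat HEAD~2 HEAD
overall=$(git diff --stat HEAD~2 HEAD | grep -c 'notes.txt')

echo "log entries: $per_commit, diff entries: $overall"
```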

Another command to display objects from the object store is git show. You can use it to see a commit:

$ git show HEAD~2

or to see a specific blob object:

$ git show origin/master:Makefile

In the latter git show, the blob shown is the Makefile from the branch named origin/master.
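Both forms of git show can be tried in a scratch repository. This sketch, meant to be run as a script, assumes a hypothetical two-commit history with a tiny Makefile; the commit:path syntax is the same one used with origin/master:Makefile above.

```shell
set -e
# Scratch repository; the Makefile contents are made up for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
echo "all: build" > Makefile
git add Makefile
git commit -q -m "add Makefile"
echo "clean: ; rm -f build" >> Makefile
git commit -q -am "add clean target"

# Show a commit: its log message followed by the patch it introduced.
git show HEAD~1

# Show a blob: <commit>:<path> names the file as recorded in that commit.
old_makefile=$(git show HEAD~1:Makefile)
echo "$old_makefile"
```

The blob form prints the file as it existed in the named commit, not as it appears in the working directory.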

Commit Graphs

Object Store Pictures introduced some figures to help visualize the layout and connectivity of objects in Git’s data model. Such sketches are illuminating, especially if you are new to Git; however, even a small repository with just a handful of commits, merges, and patches becomes unwieldy to render in the same detail. For example, Figure 6-3 shows a more complete but still somewhat simplified commit graph. Imagine how it would appear if all commits and all data structures were rendered.

Yet one observation about commits can simplify the blueprint tremendously: each commit introduces a tree object that represents the entire repository. Therefore, a commit can be pictured as just a name.

Figure 6-3. Full commit graph

Figure 6-4 shows the same commit graph as Figure 6-3 without depicting the tree and blob objects. Usually for the purpose of discussion or reference, branch names are also shown in the commit graphs.

Figure 6-4. Simplified commit graph

In computer science, a graph is a collection of nodes and a set of edges between the nodes. There are several types of graphs with different properties. Git makes use of a special graph called a directed acyclic graph (DAG). A DAG has two important properties. First, the edges within the graph are all directed from one node to another. Second, starting at any node in the graph, there is no path along the directed edges that leads back to the starting node.

Git implements the history of commits within a repository as a DAG. In the commit graph, each node is a single commit, and every edge is directed from a child commit to its parent, forming an ancestor relationship. The graphs you saw in Figures 6-3 and 6-4 are both DAGs.

When speaking of the history of commits and discussing the relationship between commits in a graph, the individual commit nodes are often labeled as shown in Figure 6-5.

Figure 6-5. Labeled commit graph

In these diagrams, time is roughly left to right. A is the initial commit as it has no parent, and B occurred after A. Both E and C occurred after B, but no claim can be made about the relative timing between C and E; either could have occurred before the other. In fact, Git doesn’t really care about the time or timing (absolute or relative) of commits. The actual “wall clock” time of a commit can be misleading because a computer’s clock can be set incorrectly or inconsistently. Within a distributed development environment, the problem is exacerbated. Timestamps can’t be trusted. What is certain, though, is that if commit Y points to parent X, then X captures the repository state prior to the repository state of commit Y, regardless of what timestamps might be on the commits.

The commits E and C share a common parent, B. Thus, B is the origin of a branch. The master branch begins with commits A, B, C, and D. Meanwhile, the sequence of commits A, B, E, F, and G forms the branch named pr-17. The branch pr-17 points to commit G. (You can read more about branches in Chapter 7.)

Because it’s a merge, H has more than one parent commit: in this case, D and G. Even though H has two parents, it is present only on the master branch, because pr-17 still refers to G. (The merge operation is discussed in more detail in Chapter 9.)
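The labeled graph can be rebuilt and queried in a scratch repository. The sketch below, meant to be run as a script, uses empty commits whose messages are the letters from the text; the helper function c and the Demo identity are assumptions made for the example.

```shell
set -e
# Scratch repository; empty commits stand in for the labeled nodes.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the book's branch name
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c A; c B
git branch pr-17                  # both branches share A and B
c C; c D                          # master continues with C and D
git checkout -q pr-17
c E; c F; c G
git checkout -q master
git merge -q --no-ff -m H pr-17   # H records two parents

root=$(git log --format=%s --max-parents=0 master)
first_parent=$(git log -1 --format=%s master^1)
second_parent=$(git log -1 --format=%s master^2)
branch_tip=$(git log -1 --format=%s pr-17)
echo "root: $root, parents of H: $first_parent and $second_parent"
echo "pr-17 still points to: $branch_tip"
```

The merge moves only master; pr-17 keeps pointing to G, which is why H appears on master alone.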

In practice, the fine points of intervening commits are considered unimportant. Also, the implementation detail of a commit pointing back to its parent is often elided, as shown in Figure 6-6. Time is still vaguely left to right, there are two branches shown, and there is one identified merge commit (H), but the actual directed edges are simplified because they are implicitly understood.

Figure 6-6. Commit graph without arrows

This kind of commit graph is often used to talk about the operation of certain Git commands and how each might modify the commit history. The graphs are a fairly abstract representation of the actual commit history, in contrast to tools (such as gitk and git show-branch) that provide concrete representations of commit history graphs. In these tools, though, time is usually represented from bottom to top, oldest to most recent. Conceptually, it is the same information.

Using gitk to View the Commit Graph

A graph is, by its very nature, a visual aid for understanding a complicated structure and its relationships. The gitk command[14] can draw a picture of a repository DAG whenever you want.

Let’s look at our example website:

$ cd public_html
$ gitk

The gitk program can do a lot of things, but let’s just focus on the DAG. The graph output looks something like Figure 6-7.

Figure 6-7. Merge viewed with gitk

Here’s what you must know in order to understand the DAG of commits. First of all, each commit can have zero or more parents as follows:

  • Normal commits have exactly one parent, which is the previous commit in the history. When you make a change, your change is the difference between your new commit and its parent.

  • There is usually only one commit with zero parents: the initial commit, which appears at the bottom of the graph.

  • A merge commit, such as the one at the top of the graph, has more than one parent.

A commit with more than one child is the place where history began to diverge and formed a branch. In Figure 6-7, the commit Remove my poem is the branch point.

Tip

There is no permanent record of branch start points, but Git can algorithmically determine them using the git merge-base command.
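As a minimal sketch of that lookup, the script below builds a hypothetical history in which master and a branch named topic diverge after commit B, then asks git merge-base for the branch point. The branch name, commit letters, and helper function c are assumptions made for the example.

```shell
set -e
# Scratch repository; all names are made up for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c A; c B
git branch topic          # history diverges after B
c C                       # master moves on
git checkout -q topic
c E                       # topic moves on separately

# merge-base prints the ID of the best common ancestor.
base=$(git merge-base master topic)
base_subject=$(git log -1 --format=%s "$base")
echo "branch point: $base_subject"
```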

Commit Ranges

Many Git commands allow you to specify a commit range. In its simplest instantiation, a commit range is a shorthand for a series of commits. More complex forms allow you to include and exclude commits.

A range is denoted with a double-period (..), as in start..end, where start and end may be specified using the forms from Identifying Commits. Typically, a range is used to examine a branch or part of a branch.

Earlier in Viewing Old Commits, you saw how to use a commit range with git log. The example used the range master~12..master~10 to specify the 11th and 10th prior commits on the master branch. To visualize the range, consider the commit graph in Figure 6-8. Branch M is shown over a portion of its commit history that is linear.

Figure 6-8. Linear commit history

Recall that time flows left to right, so M~14 is the oldest commit shown, M~9 is the most recent commit shown, and A is the 11th prior commit.

The range M~12..M~10 represents two commits, the 11th and 10th prior commits, which are labeled A and B. The range does not include M~12. Why? It’s a matter of definition. A commit range, start..end, is defined as the set of commits reachable from end that are not reachable from start. In other words, the commit end is included while the commit start is excluded. Usually this is simplified to just the phrase “in end but not start.”

When you specify a commit, Y, to git log, you are actually requesting Git to show the log for all commits that are reachable from Y. You can exclude a specific commit, X, and all commits reachable from X with the expression ^X.

Combining the two forms, git log ^X Y is the same as git log X..Y and might be paraphrased as “Give me all commits that are reachable from Y, and don’t give me any commit leading up to and including X.”

The commit range X..Y is mathematically equivalent to ^X Y. You can also think of it as a set subtraction: use everything leading up to Y minus everything leading up to and including X.
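The equivalence can be checked on a linear history. In this sketch, meant to be run as a script, the repository holds fifteen hypothetical empty commits named c1 through c15, so master~10 is c5 and master~12 is c3; git rev-list prints the commit IDs that each spelling selects.

```shell
set -e
# Scratch repository with a linear history c1 .. c15 (names are made up).
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
i=1
while [ "$i" -le 15 ]; do
    git commit -q --allow-empty -m "c$i"
    i=$((i + 1))
done

# The two spellings select the same commits.
dotted=$(git rev-list master~12..master~10)
caret=$(git rev-list ^master~12 master~10)

# Only two commits survive: master~10 (c5) and master~11 (c4).
subjects=$(echo $(git log --format=%s master~12..master~10))
echo "$subjects"
```

The excluded endpoint, master~12 (c3), never appears in either listing.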

Returning to the commit series from the earlier example, here’s how M~12..M~10 specifies just two commits, A and B. Begin with everything leading up to M~10, as shown in the first line of Figure 6-9. Find everything leading up to and including M~12, as shown in the second line of the figure. And finally, subtract M~12 from M~10 to get the commits shown in the third line of the figure.

Figure 6-9. Interpreting ranges as set subtraction

When your repository history is a simple linear series of commits, it is pretty easy to understand how a range works. But when branches or merges are involved in the graph, things can become a bit tricky, and so it’s important to understand the rigorous definition.

Let’s look at a few more examples. In the case of a master branch with a linear history, as shown in Figure 6-10, the three sets B..E, ^B E, and the set of C, D, and E are equivalent.

Figure 6-10. Simple linear history

In Figure 6-11, the master branch at commit V was merged into the topic branch at B.

Figure 6-11. Master merged into topic

The range topic..master represents those commits in master but not in topic. Since each commit on the master branch prior to and including V (that is, ..., T, U, V) contributes to topic, those commits are excluded, leaving W, X, Y, and Z.

The inverse of the previous example is shown in Figure 6-12. Here, topic has been merged into master.

Figure 6-12. Topic merged into master

In this example, the range topic..master, again representing those commits in master but not in topic, is the set of commits on the master branch leading up to and then including V, W, X, Y, and Z.

However, we have to be a little careful and consider the full history of the topic branch. Consider the case where it originally started as a branch of master and then merged in again, as shown in Figure 6-13.

Figure 6-13. Branch and merge

In this case, topic..master contains only the commits W, X, Y, and Z. Remember, the range will exclude all commits that are reachable (going back or left over the graph) from topic (i.e., the commits D, C, B, A, and earlier) as well as V, U, and earlier from the other parent of B. The result is just W through Z.
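The branch-and-merge case can be reproduced with empty commits. This sketch, meant to be run as a script, assumes that topic forked from master at T (the text does not say where the fork happened) and that B is the merge of master's V into topic; the helper function c is also an assumption.

```shell
set -e
# Scratch repository; commit messages are the letters used in the text.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c T
git branch topic                   # assumption: topic forks at T
c U; c V
git checkout -q topic
c A
git merge -q --no-ff -m B master   # B merges master (at V) into topic
c C; c D
git checkout -q master
c W; c X; c Y; c Z

# Sort the subjects so the comparison does not depend on listing order.
in_master_only=$(echo $(git log --format=%s topic..master | LC_ALL=C sort))
in_topic_only=$(echo $(git log --format=%s master..topic | LC_ALL=C sort))
echo "topic..master: $in_master_only"
echo "master..topic: $in_topic_only"
```

Reversing the range gives the other side: master..topic lists only the commits made on topic, including the merge commit B.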

There are two other range permutations. If you leave either the start or the end commit out of the range, HEAD is assumed. Thus, ..end is equivalent to HEAD..end and start.. is equivalent to start..HEAD.
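A short sketch of the implied HEAD, meant to be run as a script: master holds three hypothetical commits A, B, and C, and a branch named old (an invented name) is checked out at A, so HEAD is two commits behind master.

```shell
set -e
# Scratch repository; branch and commit names are made up.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c A; c B; c C
git branch old master~2     # old points at A
git checkout -q old         # HEAD is now A

# A missing start means HEAD: both commands list B and C.
implicit=$(git rev-list ..master)
explicit=$(git rev-list HEAD..master)
ahead=$(echo $(git log --format=%s ..master))

# A missing end also means HEAD: nothing in HEAD is absent from master.
behind=$(git rev-list --count master..)
echo "in master but not HEAD: $ahead; in HEAD but not master: $behind"
```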

Finally, just as start..end can be thought of as representing a set subtraction operation, the notation A...B (using three periods) represents the symmetric difference between A and B, or the set of commits that are reachable from either A or B but not from both. Because of the function’s symmetry, neither commit can really be considered a start or end. In this sense, A and B are equal.

More formally, the set of revisions in the symmetric difference between A and B, A...B, is given by:

$ git rev-list A B --not $(git merge-base --all A B)

Let’s look at an example in Figure 6-14.

Figure 6-14. Symmetric difference

We can compute each piece of the symmetric difference definition:

master...dev = (master OR dev) AND NOT (merge-base --all master dev)

The commits that contribute to master are (I, H, ..., B, A, W, V, U). The commits that contribute to dev are (Z, Y, ..., U, C, B, A). The union of those two sets is (A, ..., I, U, ..., Z). The merge base between master and dev is commit W. In more complex cases, there might be multiple merge bases, but here, we have only one. The commits that contribute to W are (W, V, U, C, B, and A); these are also the commits that are common to both master and dev, so they need to be removed to form the symmetric difference: (I, H, Z, Y, X, G, F, E, D).

It may be helpful to think of the symmetric difference between two branches, A and B, as “Show everything in branch A or in branch B, but only back to the point where the two branches diverged.”
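The three-dot form can be compared with its spelled-out definition in a scratch repository. This sketch, meant to be run as a script, uses a simpler history than Figure 6-14: two hypothetical shared commits A and B, then C and D on master and X and Y on dev.

```shell
set -e
# Scratch repository; a simplified stand-in for the figure's history.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c A; c B
git branch dev
c C; c D                  # commits only on master
git checkout -q dev
c X; c Y                  # commits only on dev

# Sort the IDs so the comparison does not depend on listing order.
three_dot=$(git rev-list master...dev | LC_ALL=C sort)
spelled_out=$(git rev-list master dev \
    --not $(git merge-base --all master dev) | LC_ALL=C sort)

subjects=$(echo $(git log --format=%s master...dev | LC_ALL=C sort))
echo "master...dev: $subjects"
```

The shared commits A and B drop out because both are reachable from the merge base, B.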

Now that we have gone over what commit ranges are, how to write them, and how they work, it’s important to reveal that Git doesn’t actually support a true range operator. It is purely a notational convenience that A..B represents the underlying ^A B form. Git actually allows much more powerful commit set manipulation on its command line. Commands that accept a range are actually accepting an arbitrary sequence of included and excluded commits. For example, you could use git log ^dev ^topic ^bugfix master to select those commits in master but not in any of the dev, topic, or bugfix branches.
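Here is a minimal sketch of that multi-branch exclusion, meant to be run as a script. The history is hypothetical: dev, topic, and bugfix each fork from A and gain one commit, while master gains m1, a merge of dev, and m2. All names are invented for the example.

```shell
set -e
# Scratch repository; branch contents are made up for illustration.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com
c() { git commit -q --allow-empty -m "$1"; }   # hypothetical helper

c A
for b in dev topic bugfix; do
    git checkout -q -b "$b" master   # each branch forks from A
    c "${b}-work"
    git checkout -q master
done
c m1
git merge -q --no-ff -m merge-dev dev
c m2

# Everything reachable from master, minus anything reachable from
# dev, topic, or bugfix (A and dev-work are therefore excluded).
only_master=$(echo $(git log --format=%s ^dev ^topic ^bugfix master | LC_ALL=C sort))
echo "$only_master"
```

Although dev-work is part of master's history after the merge, it is filtered out because it is also reachable from dev.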

All of these examples may be a bit abstract, but the power of the range representation really comes to fruition when you consider that any branch name can be used as part of the range. As described in Tracking Branches, if one of your branches represents the commits from another repository, you can quickly discover the set of commits that are in your repository that are not in another repository!



[14] Yes, this is one of the few Git commands that is not considered a subcommand; thus, it is given as gitk and not git gitk.
