The primary command to show the history of commits is git log. It has more options, parameters, bells, whistles, colorizers, selectors, formatters, and doodads than the fabled ls. But don’t worry. Just as with ls, you don’t need to learn all the details right away.
In its parameterless form, git log acts like git log HEAD, printing the log message associated with every commit in your history that is reachable from HEAD. Changes are shown starting with the HEAD commit and working back through the graph. They are likely to be in roughly reverse chronological order, but recall that Git adheres to the commit graph, not time, when traveling back over the history.
If you supply a commit à la git log commit, the log starts at the named commit and works backward. This form of the command is useful for viewing the history of a branch:
$ git log master
commit 1fbb58b4153e90eda08c2b022ee32d90729582e6
Merge: 58949bb... 76bb40c...
Author: Junio C Hamano <[email protected]>
Date: Thu May 15 01:31:15 2008 -0700
Merge git://repo.or.cz/git-gui
* git://repo.or.cz/git-gui:
git-gui: Delete branches with 'git branch -D' to clear config
git-gui: Setup branch.remote,merge for shorthand git-pull
git-gui: Update German translation
git-gui: Don't use '$$cr master' with aspell earlier than 0.60
git-gui: Report less precise object estimates for database compression
commit 58949bb18a1610d109e64e997c41696e0dfe97c3
Author: Chris Frey <[email protected]>
Date: Wed May 14 19:22:18 2008 -0400
Documentation/git-prune.txt: document unpacked logic
Clarifies the git-prune manpage, documenting that it only
prunes unpacked objects.
Signed-off-by: Chris Frey <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
commit c7ea453618e41e05a06f05e3ab63d555d0ddd7d9
...
The logs are authoritative, but rolling back through the entire commit history of your repository is likely not very practical or meaningful. Typically, a limited history is more informative. One technique to constrain history is to specify a commit range using the form since..until. Given a range, git log shows all commits after since, up to and including until; the since commit itself is not shown. Here’s an example:
$ git log --pretty=short --abbrev-commit master~12..master~10
commit 6d9878c...
Author: Jeff King <[email protected]>
clone: bsd shell portability fix
commit 30684df...
Author: Jeff King <[email protected]>
t5000: tar portability fix
Here, git log shows the commits between master~12 and master~10, or the 10th and 11th prior commits on the master branch. You’ll see more about ranges in Commit Ranges.
The previous example also introduces two formatting options, --pretty=short and --abbrev-commit. The former flag adjusts the amount of information about each commit and has several variations, including oneline, short, and full. The latter simply requests that hash IDs be abbreviated.
The -p option prints the patch, or changes, introduced by the commit:
$ git log -1 -p 4fe86488
commit 4fe86488e1a550aa058c081c7e67644dd0f7c98e
Author: Jon Loeliger <[email protected]>
Date: Wed Apr 23 16:14:30 2008 -0500
Add otherwise missing --strict option to unpack-objects summary.
Signed-off-by: Jon Loeliger <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
diff --git a/Documentation/git-unpack-objects.txt b/Documentation/git-unpack-objects.txt
index 3697896..50947c5 100644
--- a/Documentation/git-unpack-objects.txt
+++ b/Documentation/git-unpack-objects.txt
@@ -8,7 +8,7 @@ git-unpack-objects - Unpack objects from a packed archive
SYNOPSIS
--------
-'git-unpack-objects' [-n] [-q] [-r] <pack-file
+'git-unpack-objects' [-n] [-q] [-r] [--strict] <pack-file
Notice the option -1 as well: it restricts the output to a single commit. You can also type -n to limit the output to at most n commits.
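As a small sketch of the limiting and formatting options, the following script builds a throwaway repository and asks for at most two commits in abbreviated one-line form. The temporary directory, the author identity, and the commit messages are placeholders invented for illustration; they do not come from the Git history shown in the examples.

```shell
set -e
# Scratch repository; the identity here is a placeholder, not from the text.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# Five empty commits give the log something to limit.
for n in 1 2 3 4 5; do
    git commit -q --allow-empty -m "Change $n"
done

# -n 2 caps the output at two commits; -2 is the same limit in short form.
git log -n 2 --pretty=oneline --abbrev-commit

# Count the lines actually printed and note the newest subject.
shown=$(( $(git log -n 2 --pretty=oneline --abbrev-commit | wc -l) ))
newest=$(git log -1 --pretty=%s)
```

Because --pretty=oneline prints exactly one line per commit, counting lines is a quick way to confirm that the limit took effect.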
The --stat option enumerates the files changed in a commit and tallies how many lines were modified in each file:
$ git log --pretty=short --stat master~12..master~10
commit 6d9878cc60ba97fc99aa92f40535644938cad907
Author: Jeff King <[email protected]>
clone: bsd shell portability fix
git-clone.sh | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
commit 30684dfaf8cf96e5afc01668acc01acc0ade59db
Author: Jeff King <[email protected]>
t5000: tar portability fix
t/t5000-tar-tree.sh | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
Compare the output of git log --stat with the output of git diff --stat. There is a fundamental difference in what’s displayed. The former produces a summary for each individual commit named in the range, whereas the latter prints a single summary of the total difference between two repository states named on the command line.
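The difference can be seen in a scratch repository. This is only a sketch: the file name notes.txt, the commit messages, and the author identity are assumptions made up for the demonstration.

```shell
set -e
# Scratch repository; identity and file name are illustrative placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo two >> notes.txt
git commit -q -a -m "Extend notes"
echo three >> notes.txt
git commit -q -a -m "Extend notes again"

# One summary per commit in the range (two commits here).
git log --pretty=short --stat master~2..master

# A single summary of the total difference between the two states.
git diff --stat master~2 master

# Count how many per-file stat lines each command produced.
log_summaries=$(git log --pretty=short --stat master~2..master | grep -c 'notes.txt')
diff_summaries=$(git diff --stat master~2 master | grep -c 'notes.txt')
```

The log reports notes.txt once for each of the two commits in the range, while the diff reports it once with the combined line count.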
Another command to display objects from the object store is git show. You can use it to see a commit:
$ git show HEAD~2
or to see a specific blob object:
$ git show origin/master:Makefile
In the latter git show, the blob shown is the Makefile from the branch named origin/master.
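A minimal sketch of both uses of git show follows. It runs against a scratch repository with no remote, so it names the local master branch rather than origin/master; the Makefile contents and the author identity are invented for illustration.

```shell
set -e
# Scratch repository; identity and file contents are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

echo 'all:' > Makefile
git add Makefile
git commit -q -m "Add Makefile"
echo 'clean:' >> Makefile
git commit -q -a -m "Add clean target"

# Show a commit (-s suppresses the patch so only the header line prints).
git show -s --pretty=oneline HEAD~1

# Show a blob: the Makefile as recorded in a particular commit or branch.
old=$(git show HEAD~1:Makefile)
new=$(git show master:Makefile)
printf '%s\n' "$new"
```

The commit:path form names the file as it existed in that commit, so the two variables hold different versions of the same path.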
Object Store Pictures introduced some figures to help visualize the layout and connectivity of objects in Git’s data model. Such sketches are illuminating, especially if you are new to Git; however, even a small repository with just a handful of commits, merges, and patches becomes unwieldy to render in the same detail. For example, Figure 6-3 shows a more complete but still somewhat simplified commit graph. Imagine how it would appear if all commits and all data structures were rendered.
Yet one observation about commits can simplify the blueprint tremendously: each commit introduces a tree object that represents the entire repository. Therefore, a commit can be pictured as just a name.
Figure 6-4 shows the same commit graph as Figure 6-3 without depicting the tree and blob objects. Usually for the purpose of discussion or reference, branch names are also shown in the commit graphs.
In computer science, a graph is a collection of nodes and a set of edges between the nodes. There are several types of graphs with different properties. Git makes use of a special graph called a directed acyclic graph (DAG). A DAG has two important properties. First, the edges within the graph are all directed from one node to another. Second, starting at any node in the graph, there is no path along the directed edges that leads back to the starting node.
Git implements the history of commits within a repository as a DAG. In the commit graph, each node is a single commit, and all edges are directed from one descendant node to another parent node, forming an ancestor relationship. The graphs you saw in Figures 6-3 and 6-4 are both DAGs.
When speaking of the history of commits and discussing the relationship between commits in a graph, the individual commit nodes are often labeled as shown in Figure 6-5.
In these diagrams, time is roughly left to right. A is the initial commit as it has no parent, and B occurred after A. Both E and C occurred after B, but no claim can be made about the relative timing between C and E; either could have occurred before the other. In fact, Git doesn’t really care about the time or timing (absolute or relative) of commits. The actual “wall clock” time of a commit can be misleading because a computer’s clock can be set incorrectly or inconsistently. Within a distributed development environment, the problem is exacerbated. Timestamps can’t be trusted. What is certain, though, is that if commit Y points to parent X, then X captures the repository state prior to the repository state of commit Y, regardless of what timestamps might be on the commits.
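To make the point about timestamps concrete, this sketch deliberately stamps a parent commit with a later date than its child and then checks that git log still walks from child to parent. The dates, the commit subjects, and the author identity are assumptions chosen for the demonstration.

```shell
set -e
# Scratch repository; identity and dates are illustrative placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# The parent X is stamped 2020; its child Y is stamped 2010.
GIT_AUTHOR_DATE="2020-01-01 12:00:00" GIT_COMMITTER_DATE="2020-01-01 12:00:00" \
    git commit -q --allow-empty -m "X: parent"
GIT_AUTHOR_DATE="2010-01-01 12:00:00" GIT_COMMITTER_DATE="2010-01-01 12:00:00" \
    git commit -q --allow-empty -m "Y: child"

# The walk follows parent pointers, so Y is listed before X
# even though Y carries the older timestamp.
git log --pretty='%h %ad %s' --date=short

first=$(git log -1 --pretty=%s)
last=$(git log --pretty=%s | tail -n 1)
```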
The commits E and C share a common parent, B. Thus, B is the origin of a branch. The master branch begins with commits A, B, C, and D. Meanwhile, the sequence of commits A, B, E, F, and G forms the branch named pr-17. The branch pr-17 points to commit G. (You can read more about branches in Chapter 7.)
Because it’s a merge, H has more than one commit parent: in this case, D and G. Even though H has two parents, it is only present on the master branch, because pr-17 still refers to G. (The merge operation is discussed in more detail in Chapter 9.)
In practice, the fine points of intervening commits are considered unimportant. Also, the implementation detail of a commit pointing back to its parent is often elided, as shown in Figure 6-6. Time is still vaguely left to right, there are two branches shown, and there is one identified merge commit (H), but the actual directed edges are simplified because they are implicitly understood.
This kind of commit graph is often used to talk about the operation of certain Git commands and how each might modify the commit history. The graphs are a fairly abstract representation of the actual commit history, in contrast to tools (such as gitk and git show-branch) that provide concrete representations of commit history graphs. In these tools, though, time is usually represented from bottom to top, oldest to most recent. Conceptually, it is the same information.
A graph, by its very nature, is a visual aid to help you visualize a complicated structure and relationship. The gitk command[14] can draw a picture of a repository DAG whenever you want.
Let’s look at our example website:
$ cd public_html
$ gitk
The gitk program can do a lot of things, but let’s just focus on the DAG. The graph output looks something like Figure 6-7.
Here’s what you must know in order to understand the DAG of commits. First of all, each commit can have zero or more parents as follows:
Normal commits have exactly one parent, which is the previous commit in the history. When you make a change, your change is the difference between your new commit and its parent.
There is usually only one commit with zero parents: the initial commit, which appears at the bottom of the graph.
A merge commit, such as the one at the top of the graph, has more than one parent.
A commit with more than one child is the place where history began to diverge and formed a branch. In Figure 6-7, the commit Remove my poem is the branch point.
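The parent counts described here can also be inspected from the command line. The following is a sketch against a scratch repository, not the public_html example; the branch name topic, the commit messages, and the author identity are assumptions made for illustration.

```shell
set -e
# Scratch repository; identity and branch names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "Initial commit"   # zero parents
git checkout -q -b topic
git commit -q --allow-empty -m "Work on topic"    # one parent
git checkout -q master
git commit -q --allow-empty -m "Work on master"   # one parent
git merge -q --no-ff -m "Merge topic" topic       # two parents

# Each line is a commit ID followed by the IDs of its parents.
git rev-list --parents HEAD

# Count the commits with no parent and the commits with several parents.
roots=$(git rev-list --count --max-parents=0 HEAD)
merges=$(git rev-list --count --merges HEAD)

# HEAD^2 names the second parent of the merge, which is the tip of topic.
second_parent=$(git rev-parse HEAD^2)
topic_tip=$(git rev-parse topic)
```

In this small history the initial commit is also the branch point, since both "Work on topic" and "Work on master" name it as their parent.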
Many Git commands allow you to specify a commit range. In its simplest instantiation, a commit range is a shorthand for a series of commits. More complex forms allow you to “include” and “exclude” commits.
A range is denoted with a double-period (..), as in start..end, where start and end may be specified using the forms from Identifying Commits. Typically, a range is used to examine a branch or part of a branch.
Earlier in Viewing Old Commits, you saw how to use a commit range with git log. The example used the range master~12..master~10 to specify the 11th and 10th prior commits on the master branch. To visualize the range, consider the commit graph in Figure 6-8. Branch M is shown over a portion of its commit history that is linear.
Recall that time flows left to right, so M~14 is the oldest commit shown, M~9 is the most recent commit shown, and A is the 11th prior commit.
The range M~12..M~10 represents two commits, the 11th and 10th prior commits, which are labeled A and B. The range does not include M~12. Why? It’s a matter of definition. A commit range, start..end, is defined as the set of commits reachable from end that are not reachable from start. In other words, “the commit end is included” while “the commit start is excluded.” Usually this is simplified to just the phrase “in end but not start.”
When you specify a commit, Y, to git log, you are actually requesting Git to show the log for all commits that are reachable from Y. You can exclude a specific commit, X, and all commits reachable from X with the expression ^X.
Combining the two forms, git log ^X Y is the same as git log X..Y and might be paraphrased as “Give me all commits that are reachable from Y, and don’t give me any commit leading up to and including X.”
The commit range X..Y is mathematically equivalent to ^X Y. You can also think of it as a set subtraction: use everything leading up to Y minus everything leading up to and including X.
Returning to the commit series from the earlier example, here’s how M~12..M~10 specifies just two commits, A and B. Begin with everything leading up to M~10, as shown in the first line of Figure 6-9. Find everything leading up to and including M~12, as shown in the second line of the figure. And finally, subtract M~12 from M~10 to get the commits shown in the third line of the figure.
When your repository history is a simple linear series of commits, it is pretty easy to understand how a range works. But when branches or merges are involved in the graph, things can become a bit tricky, and so it’s important to understand the rigorous definition.
Let’s look at a few more examples. In the case of a master branch with a linear history, as shown in Figure 6-10, the three sets B..E, ^B E, and the set of C, D, and E are equivalent.
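The equivalence for a linear history can be checked with a short script. This is a sketch under stated assumptions: the commits are empty, each one is tagged with its letter so that it can be named the way the figure names it, and the author identity is a placeholder.

```shell
set -e
# Scratch repository; identity and tag names are illustrative placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# Linear history A <- B <- C <- D <- E, each commit tagged with its letter.
for name in A B C D E; do
    git commit -q --allow-empty -m "$name"
    git tag "$name"
done

# The range form and the explicit exclude form select the same commits.
range=$(git rev-list B..E)
explicit=$(git rev-list ^B E)

# Subjects of the selected commits, newest first, joined on one line.
subjects=$(echo $(git log --pretty=%s B..E))
echo "$subjects"
```

B itself is absent from the result because it is reachable from B, which is exactly what the exclusion removes.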
In Figure 6-11, the master branch at commit V was merged into the topic branch at B.
The range topic..master represents those commits in master, but not in topic. Since each commit on the master branch prior to and including V (..., T, U, V) contributes to topic, those commits are excluded, leaving W, X, Y, and Z.
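A reduced version of this situation can be reproduced in a scratch repository. The sketch below is an assumption-laden miniature, not the figure itself: it uses fewer commits, a hypothetical helper named make_commit, letter tags for naming, and a placeholder identity.

```shell
set -e
# Scratch repository; identity, helper, and tag names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: make an empty commit and tag it with its subject.
make_commit() {
    git commit -q --allow-empty -m "$1"
    git tag "$1"
}

make_commit U
git checkout -q -b topic
make_commit A
git checkout -q master
make_commit V
git checkout -q topic
git merge -q --no-ff -m "B" master   # B merges master (at V) into topic
git tag B
make_commit C
git checkout -q master
make_commit W
make_commit X

# V and U are reachable from topic through the merge B, so they drop out.
result=$(echo $(git log --pretty=%s topic..master))
echo "$result"
```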
The inverse of the previous example is shown in Figure 6-12. Here, topic has been merged into master.
In this example, the range topic..master, again representing those commits in master but not in topic, is the set of commits on the master branch leading up to and then including V, W, X, Y, and Z.
However, we have to be a little careful and consider the full history of the topic branch. Consider the case where it originally started as a branch of master and then merged in again, as shown in Figure 6-13.
In this case, topic..master contains only the commits W, X, Y, and Z. Remember, the range will exclude all commits that are reachable (going back or left over the graph) from topic (i.e., the commits D, C, B, A, and earlier) as well as V, U, and earlier from the other parent of B. The result is just W through Z.
There are two other range permutations. If you leave either the start or end commits out of range, HEAD is assumed. Thus, ..end is equivalent to HEAD..end and start.. is equivalent to start..HEAD.
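The implied HEAD can be confirmed by comparing the short and long spellings. In this sketch the branch name dev, the commit messages, and the author identity are assumptions made for illustration; HEAD is master when the comparisons run.

```shell
set -e
# Scratch repository; identity and branch names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "Base"
git checkout -q -b dev
git commit -q --allow-empty -m "Dev work"
git checkout -q master
git commit -q --allow-empty -m "Master work"

# With the start omitted, HEAD is used as the start.
implicit_start=$(git rev-list ..dev)
explicit_start=$(git rev-list HEAD..dev)

# With the end omitted, HEAD is used as the end.
implicit_end=$(git rev-list dev..)
explicit_end=$(git rev-list dev..HEAD)

only_dev=$(git log --pretty=%s ..dev)
only_head=$(git log --pretty=%s dev..)
```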
Finally, just as start..end can be thought of as representing a set subtraction operation, the notation A...B (using three periods) represents the symmetric difference between A and B, or the set of commits that are reachable from either A or B but not from both. Because of the function’s symmetry, neither commit can really be considered a “start” or “end.” In this sense, A and B are “equal.”
More formally, the set of revisions in the symmetric difference between A and B, A...B, is given by:
$ git rev-list A B --not $(git merge-base --all A B)
Let’s look at an example in Figure 6-14.
We can compute each piece of the symmetric difference definition:
master...dev = (master OR dev) AND NOT (merge-base --all master dev)
The commits that contribute to master are (I, H, ..., B, A, W, V, U). The commits that contribute to dev are (Z, Y, ..., U, C, B, A). The union of those two sets is (A, ..., I, U, ..., Z). The merge base between master and dev is commit W. In more complex cases, there might be multiple merge bases, but here, we have only one. The commits that contribute to W are (W, V, U, C, B, and A); these are also the commits that are common to both master and dev, so they need to be removed to form the symmetric difference: (I, H, Z, Y, X, G, F, E, D).
It may be helpful if you can think of the symmetric difference between two branches, A and B, as “Show everything in branch A or in branch B, but only back to the point where the two branches diverged.”
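The three-period notation and its formal definition can be compared directly. This sketch uses a much simpler history than Figure 6-14: two branches that diverge after commit B and are never merged. The hypothetical helper make_commit, the letter tags, and the author identity are assumptions for illustration.

```shell
set -e
# Scratch repository; identity, helper, and tag names are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

# Hypothetical helper: make an empty commit and tag it with its subject.
make_commit() {
    git commit -q --allow-empty -m "$1"
    git tag "$1"
}

make_commit A
make_commit B            # B will be the merge base
git checkout -q -b dev
make_commit X
make_commit Y
git checkout -q master
make_commit C
make_commit D

# Three-period form versus the definition given in the text.
symmetric=$(git rev-list master...dev | sort)
by_definition=$(git rev-list master dev --not $(git merge-base --all master dev) | sort)

# Commits on either side of the divergence: C and D, plus X and Y.
count=$(git rev-list --count master...dev)
git log --pretty=%s master...dev
```

Sorting both lists removes any dependence on output order, so the comparison tests only set membership.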
Now that we have gone over what commit ranges are, how to write them, and how they work, it’s important to reveal that Git doesn’t actually support a true range operator. It is purely a notational convenience that A..B represents the underlying ^A B form. Git actually allows much more powerful commit set manipulation on its command line. Commands that accept a range are actually accepting an arbitrary sequence of “included” and “excluded” commits. For example, you could use git log ^dev ^topic ^bugfix master to select those commits in master but not in any of the dev, topic, or bugfix branches.
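Here is a minimal sketch of several exclusions at once. The branch layout is an assumption chosen to keep the script short: dev and bugfix stop at the first commit, topic stops at the second, and only master has the third. The commit messages and the author identity are likewise placeholders.

```shell
set -e
# Scratch repository; identity and branch layout are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "base"
git branch dev
git branch bugfix
git commit -q --allow-empty -m "shared with topic"
git branch topic
git commit -q --allow-empty -m "only on master"

# Include master; exclude everything reachable from the other three.
remaining=$(git log --pretty=%s ^dev ^topic ^bugfix master)
echo "$remaining"

# --not negates the commits named after it, giving the same selection.
with_carets=$(git rev-list ^dev ^topic ^bugfix master)
with_not=$(git rev-list master --not dev topic bugfix)
```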
All of these examples may be a bit abstract, but the power of the range representation really comes to fruition when you consider that any branch name can be used as part of the range. As described in Tracking Branches, if one of your branches represents the commits from another repository, you can quickly discover the set of commits that are in your repository that are not in another repository!
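As a sketch of that idea, the script below clones a scratch repository, adds one commit to the clone, and uses a range to list what the clone has that its origin does not. The directory names upstream and work, the commit messages, and the author identity are assumptions made for illustration.

```shell
set -e
# Two scratch repositories; paths and identity are placeholders.
base=$(mktemp -d)
git init -q "$base/upstream"
cd "$base/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Shared history"

# The clone gets origin/master, which records where the other repository is.
git clone -q "$base/upstream" "$base/work"
cd "$base/work"
git config user.name "Example Author"
git config user.email "author@example.com"
git commit -q --allow-empty -m "Local only"

# Commits in this repository that are not in the other one.
git log --pretty=oneline origin/master..master
local_only=$(git log --pretty=%s origin/master..master)

# The reverse range is empty: upstream has nothing the clone lacks.
upstream_only=$(git rev-list --count master..origin/master)
```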
[14] Yes, this is one of the few Git commands that is not considered a “subcommand”; thus, it is given as gitk and not git gitk.