Chapter 20. Tips, Tricks, and Techniques

With a plethora of commands and options, Git provides a rich resource for performing varied and powerful changes to a repository. Sometimes, though, the actual means for accomplishing some particular task are a bit elusive. Sometimes, the purpose of a particular command and option isn’t really clear or becomes lost in a technical description.

This chapter provides a collection of various tips, tricks, and techniques that highlight Git’s ability to do interesting transformations.

Interactive Rebase with a Dirty Working Directory

Frequently, when developing a multicommit change sequence on a local branch, I realize that I need to make an additional modification to some commit I’ve already made earlier in the sequence. Rather than scribbling a note about it on the side and coming back to it later, I will immediately edit and introduce that change directly into a new commit with a reminder note in the commit log entry that it should be squashed into a previous commit.

When I eventually get around to cleaning up my commit sequence, and want to use git rebase -i, I am often midstride and find myself with a dirty working directory. In this case, Git will refuse to do the rebase.

    $ git show-branch --more=10
    [master] Tinker bar
    [master^] Squash into 'More foo and bar'
    [master~2] Modify bar
    [master~3] More foo and bar
    [master~4] Initial foo and bar.

    $ git rebase -i master~4
    Cannot rebase: You have unstaged changes.
    Please commit or stash them.

As suggested, clean out your dirty working directory with git stash first!

    $ git stash
    Saved working directory and index state WIP on master: ed6e906 Tinker bar
    HEAD is now at ed6e906 Tinker bar

    $ git rebase -i master~4

    # In the editor, move master^ next to master~3
    # and mark it for squashing.
    pick 1a4be28 More foo and bar
    squash 6195b3d Squash into 'More foo and bar'
    pick 488b893 Modify bar
    pick ed6e906 Tinker bar

    [detached HEAD e3c46b8] More foo and bar with additional stuff.
     2 files changed, 2 insertions(+), 1 deletions(-)
    Successfully rebased and updated refs/heads/master.

Naturally, you will want to recover your working directory changes now:

    $ git stash pop
    # On branch master
    # Changes not staged for commit:
    #   (use "git add <file>..." to update what will be committed)
    #   (use "git checkout -- <file>..." to discard changes in working directory)
    #
    #    modified:   foo
    #
    no changes added to commit (use "git add" and/or "git commit -a")
    Dropped refs/stash@{0} (71b4655668e49ce88686fc9eda8432430b276470)
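
For repeated use, the stash, rebase, and pop sequence above can be scripted. The following is only a minimal sketch, not the transcript's exact method: it assumes a Git recent enough to support git commit --fixup and git rebase --autosquash (which reorder the todo list so no editor session is needed), and it builds a throwaway repository whose file names and commit messages are hypothetical.

```shell
set -e
# Throwaway identity and repository so the sketch is self-contained.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "foo 1" > foo; git add foo; git commit -q -m "Initial foo"
echo "bar 1" > bar; git add bar; git commit -q -m "More bar"
echo "baz 1" > baz; git add baz; git commit -q -m "Add baz"

# A late correction to "More bar", recorded as a fixup commit
# instead of a hand-written "squash me" reminder.
echo "bar 2" >> bar
git commit -q -a --fixup=HEAD~1

# Uncommitted work in progress makes the working directory dirty.
echo "foo wip" >> foo

# Same three steps as in the text: stash, rebase, pop.
git stash
# GIT_SEQUENCE_EDITOR=true accepts the autosquashed todo list unchanged.
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash HEAD~3
git stash pop

git log --oneline
```

The fixup commit is folded into "More bar" and the uncommitted edit to foo comes back as an unstaged change. If the stashed change touches lines rewritten by the rebase, git stash pop can still report a conflict that has to be resolved by hand.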

Remove Left-Over Editor Files

Because the git filter-branch command really drives a shell operation, the command given to either the --index-filter or the --tree-filter option can use normal shell wildcard matching. That can be handy when you accidentally add, say, temporary editor files on first creating your repository.

    $ git filter-branch --tree-filter 'rm -f *~' -- --all

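That single command removes every file matching the *~ pattern from each commit reachable from any ref, which is what the trailing -- --all selects. Because the tree filter runs in the root of each checked-out tree, the *~ glob matches only files in the top-level directory.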
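
When the stray files are scattered through subdirectories, one option is to let find do the matching instead of the shell glob. This is a sketch under assumptions: the repository, file names, and identity below are made up for the demonstration, and FILTER_BRANCH_SQUELCH_WARNING is set only to silence the advisory that newer Git releases print before running filter-branch.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1

# Throwaway repository with editor backups at two depths.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir src
echo "code"  > src/main.c
echo "stale" > 'src/main.c~'
echo "notes" > README
echo "stale" > 'README~'
# -f in case a personal ignore file already excludes *~
git add -f .
git commit -q -m "Initial import, editor backups included"

# Delete backup files at any depth from every commit on every ref.
git filter-branch --tree-filter 'find . -name "*~" -exec rm -f {} +' -- --all

# Only the real files remain in the rewritten history.
git ls-tree -r --name-only HEAD
```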

Garbage Collection

In The git fsck Command, I expanded on the concept of reachability that was first introduced in Chapter 4. In those sections, I explained how the Git object store and its commit graph might leave unreferenced or dangling objects within the object store. I also gave a few examples of how some commands might leave these unreferenced objects in your repository.

Having dangling commits or unreachable objects is not necessarily bad. You may have moved away from a particular commit intentionally or added a file blob and then changed it again before actually committing it. The problem, however, is that over a long period, manipulating the repository can be messy and leave many unreferenced objects in your object store.

Historically, within the computer science industry, such unreferenced objects are cleaned up by an algorithm called “garbage collection.” It is the job of the git gc command to perform periodic garbage collection and keep your repository object stores neat and tidy.

Make that neat, tidy, and small: Git’s garbage collection has one other very important task, which is optimizing the size of the repository by locating unpacked objects (loose objects) and creating pack files for them.

So when does garbage collection happen, and how often? Is it automatic or is it something that needs to be done manually? When it runs does it remove everything it can? Pack everything it can?

All good questions and, as usual, the answers are all, “It depends.”

For starters, Git runs garbage collection automatically at strategic times. At other times, you should run git gc directly by hand.

Git runs garbage collection automatically:

If there are too many loose objects in the repository
When a push to a remote repository happens
After some commands that might introduce many loose objects
When some commands such as git reflog expire explicitly request it

And finally, garbage collection occurs when you explicitly request it using the git gc command. But when should that be? There’s no solid answer to this question, but there is some good advice and best practice.

You should consider running git gc manually in a few situations:

If you have just completed a git filter-branch. Recall that filter-branch rewrites many commits, introduces new ones, and leaves the old ones on a ref that should be removed when you are satisfied with the results. All those dead objects (that are no longer referenced since you just removed the one ref pointing to them) should be removed via garbage collection.
After some commands that might introduce many loose objects. This might be a large rebase effort, for example.

And on the flip side, when should you be wary of garbage collection?

If there are orphaned refs that you might want to recover
In the context of git rerere^[44] and you do not need to save the resolutions forever
If you are counting on anything other than a tag or branch to keep a commit alive, because only tags and branches are sufficient to cause Git to retain a commit permanently
If you retrieved commits only through FETCH_HEAD (URL-direct retrievals via git fetch), because they are immediately subject to garbage collection

Git doesn’t spontaneously jump to life and carry out garbage collection of its own free will, not even automatically. Instead, what happens is that certain commands that you run cause Git to then consider running garbage collection and packing. But just because you run those commands and Git runs git gc doesn’t mean that Git acts on this trigger. Instead, Git takes that opportunity to inspect a whole series of configuration parameters that guide the inner workings of both the removal of unreferenced objects and the creation of pack files. Some of the more important git config parameters include:

gc.auto: The number of loose objects allowed to exist in a repository before garbage collection causes them to be packed. The default is 6700.
gc.autopacklimit: The number of pack files that may exist in a repository before pack files are themselves repacked into larger, more efficient pack files. The default is 50.
gc.pruneexpire: The period of time unreachable objects may linger in an object store. The default is two weeks.
gc.reflogexpire: The git reflog expire command will remove reflog entries older than this time period. The default is 90 days.
gc.reflogexpireunreachable: The git reflog expire command will remove reflog entries older than this time period only if they are unreachable from the current branch. The default is 30 days.

Most of the garbage collection config parameters have a value that means either “do it now” or “never do it.”
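
As a small illustration of those special values, the sketch below sets a few of the parameters in a scratch repository. The choice of values is only an example, not a recommendation: 0 for gc.auto turns automatic collection off, while never and now are the two extremes accepted by the expiry settings.

```shell
set -e
# Scratch repository so nothing global is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# "Never do it": a value of 0 disables automatic garbage collection.
git config gc.auto 0

# Expiry settings accept "never" (keep forever) and "now" (no grace period).
git config gc.pruneExpire never
git config gc.reflogExpire never

# Show what was configured for this repository.
git config --get-regexp '^gc\.'

# "Do it now" as a one-off, overriding gc.pruneExpire for this run only.
git gc --quiet --prune=now
```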

Split a Repository

You can use Git’s filter-branch to split a repository or to extract subdirectories. In this case, we mean splitting a repository while maintaining the history that led to this point; the approach described here preserves the appropriate development and commit history. (If you don’t care about the development and commit history and want to split a repository, just clone the repository and remove the parts from each that you don’t want!)

For example, let’s say you had a repository with four top-level directories named part1, part2, part3, and part4, and you wanted to split the top-level directory part4 into its own repository.

For starters, you should work in a clone of the original repository and remove all of the origin remote references. This will ensure that you don’t destroy the original repository, nor will you think you can push or fetch changes from your original via a lingering remote reference.

Then, use the --subdirectory-filter option like this:

    $ git filter-branch --subdirectory-filter part4 HEAD

However, there are likely some extenuating circumstances that will cause you to want to extend that command to allow for incidental and tricky situations. Do you have tags and want them reflected in the new part4 repository too? If so, add the --tag-name-filter cat option. Might a commit end up empty due to its inapplicability to this subsection of the original repository? Almost certainly, so add the --prune-empty option too. Are you interested in only the one current branch indicated by HEAD? Almost certainly not. Instead, you might want to cover all branches from the original repository. In that case, you’ll want to use -- --all in place of the final HEAD parameter.

The revised command now looks like this:

    $ git filter-branch --tag-name-filter cat \
        --prune-empty --subdirectory-filter part4 -- --all

Naturally, you will want to verify the contents are as expected and then expire your reflog, remove the original refs, and do garbage collection on the new repository.

Finally, you might (or might not) need to return to your original repository and perform a different git filter-branch to remove part4 from it, too!
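
Put together, the whole split might look like the following sketch. It is an illustration under assumptions rather than a recipe: the directory names follow the part1/part4 example above, the file names, tag, and identity are invented, and the cleanup at the end is one way of expiring the reflog, removing the original refs, and collecting garbage.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
export FILTER_BRANCH_SQUELCH_WARNING=1   # quiets the advisory in newer Git

work=$(mktemp -d)

# A stand-in for the original repository.
git init -q "$work/orig"
cd "$work/orig"
mkdir part1 part4
echo "a" > part1/a.txt
echo "d" > part4/d.txt
git add .
git commit -q -m "Add part1 and part4"
git tag v1.0
echo "a2" >> part1/a.txt
git commit -q -a -m "Touch only part1"

# Work in a clone, and cut the tie back to the original.
cd "$work"
git clone -q orig part4-only
cd part4-only
git remote rm origin

# Keep only part4, carry tags along, and drop commits that become empty.
git filter-branch --tag-name-filter cat \
    --prune-empty --subdirectory-filter part4 -- --all

# After verifying the result: remove the saved refs, expire the reflog,
# and garbage collect.
git for-each-ref --format='%(refname)' refs/original |
while read -r ref; do
    git update-ref -d "$ref"
done
git reflog expire --expire=now --all
git gc --quiet --prune=now

git log --oneline --decorate
```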

Tips for Recovering Commits

Time is the enemy of lost commits. Eventually, Git’s garbage collection will run and clean out any dangling or unreferenced commits and blobs. Garbage collection will eventually retire reflog refs as well. At that point, lost commits are lost and git fsck will no longer be able to find them. If you know you are slow to realize a commit has been lost, you may want to adjust the default timeouts for reflog expiration and retiring unreferenced commits during garbage collection.

    # default is 90 days
    $ git config --global gc.reflogExpire "6 months"

    # default is 30 days
    $ git config --global gc.reflogExpireUnreachable "60 days"

    # default is 2 weeks
    $ git config --global gc.pruneExpire "1 month"

Sometimes, using a graphical tool such as gitk or viewing a log graph can help find and establish necessary context for interpreting and understanding the reflog and other dangling or orphaned commits.

Here are two aliases that you might add to your global .gitconfig:

    $ git config --global alias.orphank \
        '!gitk --all `git reflog | cut -c1-7`&'
    $ git config --global alias.orphanl \
        '!git log --pretty=oneline --abbrev-commit --graph --decorate `git reflog | cut -c1-7`'

Subversion Conversion Tips

General Advice

Maintaining an SVN repository and a Git repository in parallel is a lot of work, especially if subsequent new commits to the SVN repository are allowed. Make absolutely sure that you need to do this before you commit to this workflow. By far the easiest approach is to do the SVN to Git conversion once, making the SVN repository inaccessible when the conversion has been completed.

Plan on doing all of your importing, converting, and cleaning up once up front before ever publishing the first Git version of your repository. There are several steps in a well-planned conversion that you really should do before anyone else has a chance to clone the first version of your Git repository. For example, all of your global changes, such as directory renaming, author and email address cleanup, large file removal, branch fiddling, tag construction, etc., will be significantly more difficult for both you and your downstream consumers if they happen after they have cloned the conversion repository.

Do you really want to remove all the SVN commit identifiers from your Git commit logs? Just because recipes exist to do so and someone shows you how, doesn’t mean you should. It’s your call.

After doing a conversion, the metadata in the .git directory for the SVN conversion is lost upon cloning or pushing to a Git repository. Make sure you are done.

If you can, ensure that you have a good author and email mapping file prior to doing your import. Having to fix them up later with git filter-branch is just extra pain.
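
For reference, the mapping file that git svn accepts through its --authors-file option has one line per SVN login. The sketch below writes a small example file and sanity-checks its shape before a long import; the logins, names, and addresses are hypothetical, and the final clone command is shown only as a comment because it needs a real SVN URL.

```shell
set -e
workdir=$(mktemp -d)
cd "$workdir"

# One line per SVN account:  svn-login = Full Name <email>
cat > authors.txt <<'EOF'
jdoe = Jane Doe <jane.doe@example.com>
bsmith = Bob Smith <bob.smith@example.com>
EOF

# Cheap check: print (and fail on) any line that lacks the expected shape.
if grep -v -E '^[^ =]+ = .+ <[^<>]+>$' authors.txt; then
    echo "malformed author entries listed above" >&2
    exit 1
fi
echo "authors.txt looks well formed"

# The import itself would then be something like:
#   git svn clone --authors-file=authors.txt <svn-url>
```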

If creating and maintaining parallel SVN and Git repositories seems complicated, and you find you still must use both, using GitHub’s Subversion Bridge (see Subversion Bridge) is an easy alternative that meets this requirement.

Remove a Trunk After an SVN Import

Often, after creating a new repository from an SVN import, you are left with a top-level directory such as trunk that you don’t really want in your Git repository.

    $ cd OldSVNStuff

    $ ls -R .
    .:
    trunk

    ./trunk:
    Recipes  Stuff  Things

    ./trunk/Recipes:
    Chicken_Pot_Pie  Ice_Cream

    ./trunk/Stuff:
    Note_to_self

    ./trunk/Things:
    Movie_List

There is no real reason to keep trunk. You can use Git’s filter-branch to remove it:

    $ git filter-branch --subdirectory-filter trunk HEAD
    Rewrite b6b4781ee814cbb6fc6a01a91c8d0654ec78fbe1 (1/1)
    Ref 'refs/heads/master' was rewritten

    $ ls
    Recipes  Stuff  Things

Everything under trunk will be hoisted up one level and the directory trunk will be eliminated.

Removing SVN Commit IDs

First, run git filter-branch --msg-filter using a sed script to match and delete the SVN commit IDs from your Git log messages.

    # From the git-filter-branch manual page
    $ git filter-branch --msg-filter 'sed -e "/^git-svn-id:/d"'

Toss the reflog or else it will have lingering references:

    $ git reflog expire --verbose --expire=0 --all

Remember that after a git filter-branch command, Git leaves the old, original branch refs in refs/original/. You should remove them and take the garbage out with prejudice:

    # Careful...
    $ rm -rf .git/refs/original

    $ git reflog expire --verbose --expire=0 --all
    $ git gc --prune=0
    $ git repack -ad

Alternatively, clone away from it:

    $ cd /tmp/somewhere/else/
    $ git clone file:///home/jdl/stuff/converted.git

Remember to use a file:/// URL, because a normal, direct file reference will hard link the files rather than copy them; that won’t be effective.

Manipulating Branches from Two Repositories

I am occasionally asked the question, “How do I compare two branches from different repositories?” It is sometimes asked with slight variations as well: “How do I tell whether my commits from my repository have been merged into a branch in some other repository?” Or sometimes something like, “What does the devel branch in this remote repository have that isn’t in my repository?”

These are all fundamentally the same question in that they aim to resolve or compare branches from two different repositories. Developers are sometimes thrown off by the fact that the branches they wish to compare are in two or more different repositories, and that those repositories might also be remote or located on another server.

In order for these questions to make sense at all, the developer must know that, at some point back in time during the earlier development of these repositories, they must have had some common ancestor and were derived from a common basis. Without such a relationship, it makes little to no sense to even ask how two branches might compare to each other. That means that Git should be able to discover the commit graph and branch history of both repositories and be able to relate them.

The key technique for solving all these questions, then, is to realize that Git can compare branches only within one local repository. Thus, you need to have all the branches from all the repositories colocated in one repository. Usually, this is a simple matter of adding a new remote for each of the other repositories containing a needed branch, and then fetching from it.

Once the branches are all in one repository, use any of the usual diff or comparison commands on those branches as needed.
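
A minimal sketch of that technique follows. The two repositories, the remote name theirs, and the branch name devel are all invented for the example; in practice the remote would be a URL rather than a sibling directory.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# Two repositories that share a common ancestor.
git init -q mine
(cd mine && echo "base" > file && git add file && git commit -q -m "Common base")
git clone -q mine theirs
(cd theirs && git checkout -q -b devel &&
    echo "extra" >> file && git commit -q -a -m "Work only on devel")

# Colocate the other repository's branches in this one.
cd mine
git remote add theirs ../theirs
git fetch -q theirs

# What does their devel branch have that my current branch lacks?
git log --oneline HEAD..theirs/devel

# Which of my commits are not yet in their devel branch? (none here)
git log --oneline theirs/devel..HEAD

# And the content difference between the two tips.
git diff --stat HEAD theirs/devel
```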

Recovering from an Upstream Rebase

Sometimes, when working in a distributed environment where you don’t necessarily control the upstream repository from which you derived your current development clone, the upstream version of the branch on which you have developed your work will undergo a non–fast-forward change or a rebase. That change destroys the basis of your branch, and prevents you from directly sending your changes upstream.

Unfortunately, Git doesn’t provide a way for an upstream repository maintainer to state how its branches will be treated. That is, there is no flag that says “this branch will be rebased at will,” or “don’t expect this branch to fast-forward.” You, the downstream developer, just have to know, intuit its intended behavior, or ask the upstream maintainer. For the most part, other than that, branches are expected to fast-forward and not be rebased.

Sure, that can be bad. I’ve explained before how changing published history is bad. Nevertheless, it happens sometimes. Furthermore, there are some very good development models that even encourage the occasional rebasing of a branch during the normal course of development. (For an example, see how the pu, or proposed updates branches, of the Git repository itself are handled.)

So when it happens, what do you do? How do you recover so that your work can be sent upstream again?

First, ask yourself whether the rebased branch is really the right branch on which you should have been basing your work in the first place. Branches are often intended to be read only. For example, maybe a collection of branches are being gathered and merged together for testing purposes into a read only branch, but are otherwise available individually and should form the basis of development work. In this case, you likely shouldn’t have been developing on the merged collection branch. (The Linux next branches tend to operate like this.)

Depending on the extent of the rebase that occurred upstream, you may get off easily and be able to recover with a simple git pull --rebase. Give it a try; if it works, you win. But I wouldn’t count on it. You should be prepared to recover an ensuing mess with a judicious use of reflog.

The real, more reliable approach is to methodically transfer your developed and orphaned commit sequence from your now defunct branch to the new upstream branch. The basic sequence is to:

Rename your old upstream branch. It is important to do this before you fetch because it allows a clean fetch of the new upstream history. Try something like: git branch save-origin-master origin/master.
Fetch from upstream to recover the current upstream content. A simple git fetch should be sufficient.
Rebase your commits from the renamed branch onto the new upstream branch using commands like cherry-pick or rebase. This should be the command: git rebase --onto origin/master save-origin-master master.
Clean up and remove the temporary branch. Try using the command git branch -D save-origin-master.

It seems easy enough, but the key can often be in locating the point back in the history of the upstream branch where the original history and the new history begin to diverge. It’s possible that everything between that point and your first commit isn’t needed at all; that is, the rewritten commit history changes nothing that intersects with your work. In this case, you win because a rebase should happen readily. On the other hand, it is also possible that the rewritten history touches the same ground that you were developing. In this case, you likely have a tough rebase road ahead of you and will need to fully understand the semantic meanings of the original and changed histories in order to figure out how to resolve your desired development changes.
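
The four-step sequence can be rehearsed safely in a scratch setup such as the one sketched below. Everything here is an assumption made for the demonstration: the upstream rewrite is simulated with a simple git commit --amend, and it is the easy case in which the rewritten history does not touch the downstream work.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
cd "$work"

# A stand-in upstream with a master branch.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo "1" > base; git add base; git commit -q -m "Base"
echo "2" > up;   git add up;   git commit -q -m "Upstream work"
cd ..

# Downstream clone with one local commit on top.
git clone -q upstream mine
cd mine
echo "mine" > local; git add local; git commit -q -m "My work"

# Upstream rewrites its published history.
(cd ../upstream && git commit -q --amend -m "Upstream work, rewritten")

# 1. Save the old upstream position before fetching.
git branch save-origin-master origin/master
# 2. Fetch the rewritten upstream history.
git fetch -q origin
# 3. Replay only my commits onto the new upstream branch.
git rebase -q --onto origin/master save-origin-master master
# 4. Remove the temporary branch.
git branch -D save-origin-master

git log --oneline
```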

Make Your Own Git Command

Here’s a neat little trick to make your own Git command that looks like every other git command.

First, write your command or script using a name that begins with the prefix git-. Place it in your ~/bin directory or some other place that is found on your shell PATH.

Suppose you wanted a script that checked to see if you were in the top level of your Git repository. Let’s call it git-top-check, like this:

    #!/bin/sh
    # git-top-check -- Is this the top level directory of a Git repo?

    if [ -d ".git" ]; then
        echo "This is a top level Git development repository."
        exit 0
    fi

    echo "This is not a top level Git development repository."
    exit 1

If you now place that script in the file ~/bin/git-top-check and make it executable, you can use it like this:

    $ cd ~/Repos/git
    $ git top-check
    This is a top level Git development repository.

    $ cd /etc
    $ git top-check
    This is not a top level Git development repository.
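
The -d ".git" test is deliberately simple. As a variation, and only as a sketch, the same question can be put to Git itself with git rev-parse, which also copes with layouts where .git is a file rather than a directory. The function name is_git_top and the scratch repository are hypothetical.

```shell
# Succeeds only when the current directory is the top of a Git working tree.
is_git_top() {
    # Not inside any working tree at all? Then it cannot be the top.
    [ "$(git rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ] || return 1
    # --show-cdup prints the relative path up to the top; empty means "here".
    [ -z "$(git rev-parse --show-cdup)" ]
}

# Throwaway repository to exercise the check.
repo=$(mktemp -d)
git init -q "$repo"
mkdir "$repo/sub"

(cd "$repo" && is_git_top) && echo "top level: yes"
(cd "$repo/sub" && is_git_top) || echo "subdirectory: no"
```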

Quick Overview of Changes

If you need to keep a repository up to date by continually fetching from an upstream source, you may find yourself frequently asking a question similar to, “So, what changed in the last week?”

The answer to your wonderment might be the git whatchanged command. Like many commands, it accepts a plethora of options centered around git rev-parse for selecting commits, and formatting options typical of, say, git log such as the --pretty= options.

Notably, you might want the --since= option.

    # The Git source repository
    $ cd ~/Repos/git
    $ git whatchanged --since="three days ago" --oneline
    745950c p4000: use -3000 when promising -3000
    :100755 100755 d6e505c... 7e00c9d... M  t/perf/p4000-diff-algorithms.sh
    42e52e3 Update draft release notes to 1.7.10
    :100644 100644 ae446e0... a8fd0ac... M  Documentation/RelNotes/1.7.10.txt
    561ae06 perf: export some important test-lib variables
    :100755 100755 f8dd536... cf8e1ef... M  t/perf/p0000-perf-lib-sanity.sh
    :100644 100644 bcc0131... 5580c22... M  t/perf/perf-lib.sh
    1cbc324 perf: load test-lib-functions from the correct directory
    :100755 100755 2ca4aac... f8dd536... M  t/perf/p0000-perf-lib-sanity.sh
    :100644 100644 2a5e1f3... bcc0131... M  t/perf/perf-lib.sh

That’s dense. But we did ask for --oneline! So the commit log has been summarized in single lines like this:

    561ae06 perf: export some important test-lib variables

And each of those is followed by the list of files that changed with each commit:

    :100755 100755 f8dd536... cf8e1ef... M  t/perf/p0000-perf-lib-sanity.sh
    :100644 100644 bcc0131... 5580c22... M  t/perf/perf-lib.sh

That’s file mode bits, before and after the commit, the SHA1s of each blob before and after the commit, a status letter (M here means modified content or mode bits), and finally the path of the blob that changed.

Although the previous example defaulted the branch reference to master, you could pick anything of interest, or explicitly request the set of changes that were just fetched:

    $ git whatchanged ORIG_HEAD..HEAD

You can also limit the output to the set of changes that affect a named file:

    $ cd /usr/src/linux
    $ git pull

    $ git whatchanged ORIG_HEAD..HEAD --oneline Makefile
    fde7d90 Linux 3.3-rc7
    :100644 100644 66d13c9... 56d4817... M  Makefile
    192cfd5 Linux 3.3-rc6
    :100644 100644 b61a963... 66d13c9... M  Makefile

The workhorse behind this output is git diff-tree. Grab yourself a caffeinated beverage prior to reading that manual page.
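
Git's own documentation describes git whatchanged as essentially git log with raw diff output and merges skipped, and steers new users toward git log. A rough equivalent of the examples above, sketched in a throwaway repository with an invented Makefile, would therefore be:

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "CFLAGS = -O2" > Makefile
git add Makefile
git commit -q -m "Add Makefile"
echo "all:" >> Makefile
git commit -q -a -m "Add default target"

# Raw per-file listing for recent commits, limited to one path.
git log --raw --no-merges --oneline --since="three days ago" -- Makefile
```

Each commit's summary line is followed by the same colon-prefixed mode, blob, and status lines shown earlier.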

Cleaning Up

Everyone enjoys a clean and tidy directory structure now and then! To help you achieve repository directory nirvana, the git clean command may be used to remove untracked files from your working tree.

Why bother? Perhaps cleaning is part of an iterative build process that reuses the same directory for repeated builds but needs to have generated files cleaned out each time. (Think make clean.)

By default, git clean just removes all files that are not under version control from the current directory and down through your directory structure. Untracked directories are considered slightly more valuable than plain files and are left in place unless you supply the -d option.

Furthermore, for the purposes of this command, Git uses a slightly more conservative concept of under version control. Specifically, the manual page uses the phrase “files that are unknown to Git” for a good reason: even files that are mentioned in the .gitignore and .git/info/exclude files are actually known to Git. They represent files that are not version controlled, but Git does know about them. And because those files are called out in the .gitignore files, they must have some known (to you) behavior that shouldn’t be disturbed by Git. So Git won’t clean out the ignored files unless you explicitly request it with the -x option.

Naturally, the -X option causes the inverse behavior: namely, only files explicitly ignored by Git are removed. So choose the files that are important to you carefully.

If you are skittish, do a --dry-run first.
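
The differences among these options are easiest to see side by side. The sketch below uses a scratch repository with invented file names; -n is the short form of --dry-run, and the final command needs -f because Git's default configuration (clean.requireForce) refuses to delete anything without it.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "*.log" > .gitignore
echo "int main(void) { return 0; }" > main.c
git add .gitignore main.c
git commit -q -m "Initial"

echo "scratch" > scratch.txt                 # untracked file
echo "output"  > build.log                   # ignored file
mkdir tmpdir; echo "x" > tmpdir/junk.txt     # untracked directory

git clean -n          # untracked files only
git clean -n -d       # untracked files and directories
git clean -n -d -x    # the above plus ignored files
git clean -n -X       # ignored files only

# For real this time: untracked files and directories, ignored files kept.
git clean -f -d
ls -A
```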

Using git-grep to Search a Repository

You may recall from Using Pickaxe that I introduced the pickaxe option (spelled -Sstring) for the git log command, and then in git diff with Path Limiting, I showed it in use with the git diff command. It searches back through a branch’s history of commit changes for commits that introduce or remove occurrences of a given string or regular expression.

Another command that can be used to search a repository is git grep. Rather than searching each commit’s changes to a branch, the git grep command searches the content of files within a repository. Because git grep is really a generic Swiss Army knife with a multitude of options, it is more accurate to say that git grep searches for text patterns in tracked blobs (i.e., files) of the work tree, blobs cached in the index, or blobs in specified trees. By default, it just searches the tracked files of the working tree.

Thus, pickaxe can be used to search a series of commit differences, whereas git grep can be used to search the repository tree at a specific point in that history.

Want to do some ego surfing in a repository? Sure you do. Let’s go get the Git source repository and find out!^[45]

    $ cd /tmp
    $ git clone git://github.com/gitster/git.git

    Cloning into 'git'...
    remote: Counting objects: 129630, done.
    remote: Compressing objects: 100% (42078/42078), done.
    Receiving objects: 100% (129630/129630), 28.51 MiB | 1.20 MiB/s, done.
    remote: Total 129630 (delta 95231), reused 119366 (delta 85847)
    Resolving deltas: 100% (95231/95231), done.

    $ cd git

    $ git grep -i loeliger
    Documentation/gitcore-tutorial.txt:Here is an ASCII art by Jon Loeliger
    Documentation/revisions.txt:Here is an illustration, by Jon Loeliger.
    Documentation/user-manual.txt:Here is an ASCII art by Jon Loeliger

    $ git grep jdl
    Documentation/technical/pack-heuristics.txt:  <jdl> What is a "thin" pack?

Ever wonder where the documentation for the git-grep command itself is located? Which files in the git.git repository even mention git-grep by name? Here’s how you can find out:

    # Still in the /tmp/git repository

    $ git grep -l git-grep
    .gitignore
    Documentation/RelNotes/1.5.3.6.txt
    Documentation/RelNotes/1.5.3.8.txt
    Documentation/RelNotes/1.6.3.txt
    Documentation/git-grep.txt
    Documentation/gitweb.conf.txt
    Documentation/pt_BR/gittutorial.txt
    Makefile
    command-list.txt
    configure.ac
    gitweb/gitweb.perl
    t/README
    t/perf/p7810-grep.sh

A few things to note here: git-grep supports many of the normal command line options to the traditional grep tool, such as -i for case insensitive searches, -l for a list of just the matching file names, -w for word matching, etc. Using the -- separator option, you can limit the paths or directories that Git will search. To limit the search to occurrences within the Documentation/ directory, do something like this:

    # Still in the /tmp/git repository

    $ git grep -l git-grep -- Documentation
    Documentation/RelNotes/1.5.3.6.txt
    Documentation/RelNotes/1.5.3.8.txt
    Documentation/RelNotes/1.6.3.txt
    Documentation/git-grep.txt
    Documentation/gitweb.conf.txt
    Documentation/pt_BR/gittutorial.txt

Using the --untracked option, you can also search for patterns in untracked (but not ignored) files that have neither been added to the cache nor committed as part of the repository history. This option may come in handy if you are developing some feature and have started adding new files but haven’t yet committed them. A default git grep wouldn’t search there, even though your past experience with the traditional grep command might lead you to believe that all files in your working directory (and possibly its subdirectories) would otherwise be searched.

So why even bother introducing the git grep in the first place? Isn’t the traditional shell tool sufficient? Yes and no.

There are several benefits to building the git grep command directly into the Git toolset. First, speed and simplicity. Git doesn’t have to completely check out a branch in order to do the search; it can operate directly on the objects from the object store. You don’t have to write some script to check out a commit from way back in time, then search those files, then restore your original checked out state. Second, Git can offer enhanced features and options by being an integrated tool. Notably, it offers searches that are limited to tracked files, untracked files, files cached in the index, ignored or excluded files, variations on searching snapshots from the repository history, and repository-specific pathspec limiters.
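
To make the first benefit concrete, here is a small sketch that searches an earlier commit without checking it out, alongside the --untracked and --cached variants. The repository contents and the search word needle are made up for the example.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "needle: first wording" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
echo "reworded, keyword gone" > notes.txt
git commit -q -a -m "Reword notes"
echo "needle: new idea" > draft.txt     # new file, never added

# Default: tracked files in the working tree. Nothing matches any more.
git grep -n needle || echo "(no match in tracked files)"

# Search the previous commit directly from the object store.
git grep -n needle HEAD~1

# Include untracked (but not ignored) files.
git grep -n --untracked needle

# Search what is staged in the index instead of the working tree.
git grep -n --cached needle || echo "(no match in the index)"
```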

Updating and Deleting refs

Way back in refs and symrefs, I introduced the concept of a ref and mentioned Git also had several symbolic refs that it maintained. By now, you should be familiar with branches as refs, how they are maintained under the .git directory, and that the symbolic refs are also maintained there. Somewhere in there a bunch of SHA1 values exist, get updated, shuffled around, deleted, and referenced by other refs.

Occasionally, it is nice or even necessary to directly change or delete a ref. If you know exactly what you are doing, you could manipulate all of those files by hand. But if you don’t do it correctly, it is easy to mess things up.

To ensure that the basic ref manipulations are done properly, Git supplies the command git update-ref. This command understands all of the nuances of refs, symbolic refs, branches, SHA1 values, logging changes, the reflog, etc. If you need to directly change a ref’s value, you should use a command like:

    $ git update-ref someref SHA1

where someref is the name of a branch or ref to be updated to the new value, SHA1. Furthermore, if you want to delete a ref, the proper way to do so is:

    $ git update-ref -d someref

Of course, the normal branch operations might be more appropriate, but if you find yourself directly changing a ref, using git update-ref ensures that all of the bookkeeping for Git’s infrastructure is done properly, too.
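
A short sketch of both operations in a scratch repository follows. The ref name refs/heads/scratch is hypothetical; note that it is spelled out in full, because git update-ref does not expand a bare name such as scratch into refs/heads/scratch the way git branch does.

```shell
set -e
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "one" > file;  git add file; git commit -q -m "First"
echo "two" >> file; git commit -q -a -m "Second"

first=$(git rev-parse HEAD~1)
second=$(git rev-parse HEAD)

# Create the ref; -m supplies the message recorded in its reflog.
git update-ref -m "sketch: start at first commit" refs/heads/scratch "$first"

# Move it to a new value.
git update-ref refs/heads/scratch "$second"
moved=$(git rev-parse refs/heads/scratch)
echo "scratch now at $moved"

# Delete it again.
git update-ref -d refs/heads/scratch
```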

Following Files that Moved

If, over the history of a file, it is moved from one place to another within your repository directory structure, Git will usually only trace back over its history using its current name.

To see the complete history of the file, even across moves, use the --follow option as well. For example, the following command shows the commit log for a file currently named file, but includes the log for its prior names as well:

    $ git log --follow file

Add the --name-only option to have Git also state the name of that file as it changes:

    $ git log --follow --name-only file

In the following example, file a is first added in the directory foo and then moved to directory bar:

    $ git init
    $ mkdir foo
    $ touch foo/a
    $ git add foo/a
    $ git commit -m "First a in foo" foo/a
    $ mkdir bar
    $ git mv foo/a bar/a
    $ git commit -m "Move foo/a to bar/a"

At this point, a simple git log bar/a will show only the commit that created file bar/a, but adding option --follow will trace back through its name changes, too:

    $ git log --oneline bar/a
    6a4115b Move foo/a to bar/a

    $ git log --oneline --follow bar/a
    6a4115b Move foo/a to bar/a
    1862781 First a in foo

If you want to use its original name, you have to work harder because only the current name of the file, bar/a, can be referenced normally. Adding the -- separator and then any of its current or former names will work. And adding --all will produce a comprehensive search as if all refs were searched, too.

    $ git log --oneline foo/a
    fatal: ambiguous argument 'foo/a': unknown revision or path not in the
           working tree.
    Use '--' to separate paths from revisions

    $ git log --oneline -- foo/a
    6a4115b Move foo/a to bar/a
    1862781 First a in foo
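The same history can be reproduced as a script to see what --name-only adds. This is a sketch under assumptions: it gives the file some content (the text is arbitrary) and prints commit subjects rather than hashes, since hashes differ from one run to the next.

```shell
set -e
# Throwaway repository mirroring the foo/a -> bar/a example.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir foo
echo "some content" > foo/a
git add foo/a
git commit -q -m "First a in foo"
mkdir bar
git mv foo/a bar/a
git commit -q -m "Move foo/a to bar/a"

# Without --follow, history stops at the commit that introduced bar/a.
plain=$(git log --format=%s -- bar/a)
# With --follow, history continues across the rename.
followed=$(git log --follow --format=%s -- bar/a)
# --name-only lists the name the file had in each commit.
names=$(git log --follow --name-only --format= -- bar/a)

printf '%s\n' "$plain" "--" "$followed" "--" "$names"
```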

Keep, But Don’t Track, This File

A common developer problem, described here by Bart Massey, arises with Makefiles and other configuration files: the version that the developer works with locally may be customized in ways that are not intended to be visible upstream. For example, I commonly change my Makefile CFLAGS from -Wall -g -O2 to -Wall -g -pg during development. Of course, I also change the Makefile in ways that should be visible upstream, such as adding new targets.

I could maintain a separate local development branch, which differs only in the Makefile. Whenever I make a change, I could merge back to master and push upstream. I’d have to do an interactive merge in order to omit my custom CFLAGS (while maybe merging other changes). This seems hard and error prone.

Another solution would be to implement some form of Makefile snippet that provided local overrides for certain variable settings. But this approach is highly specific where an otherwise general problem remains.

It turns out that git update-index --assume-unchanged Makefile will leave the Makefile in the repository, but will cause Git to assume that subsequent changes to the working copy are not to be tracked. Thus, I can commit the version with the CFLAGS I want published, mark the Makefile with --assume-unchanged, and edit the CFLAGS to correspond to my development version. Now, subsequent pushes and commits will ignore the Makefile. Indeed, git add Makefile will report an error when the Makefile is marked --assume-unchanged.

When I want to make a published change to my Makefile, I can proceed via:

    $ git update-index --no-assume-unchanged Makefile
    $ git add -p Makefile
    #  [add the Makefile changes I want published]
    $ git commit
    $ git update-index --assume-unchanged Makefile
    $ git push

This workflow does require that I remember to perform the previous steps when I want a Makefile change published. But that is relatively infrequent. Further, initially forgetting carries a low price tag: I can always do it later.
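Because a marked file no longer shows up in git status, it is easy to forget that the bit is set. The sketch below, run in a throwaway repository with a one-line Makefile invented for the purpose, shows the effect of the bit and one way to list marked files: git ls-files -v prints a lowercase tag letter for entries marked assume-unchanged.

```shell
set -e
# Throwaway repository; the Makefile content is illustrative only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

printf 'CFLAGS = -Wall -g -O2\n' > Makefile
git add Makefile
git commit -q -m "Add Makefile"

# Mark the file, then make a local-only change.
git update-index --assume-unchanged Makefile
printf 'CFLAGS = -Wall -g -pg\n' > Makefile

hidden=$(git status --porcelain)   # the local edit is not reported
marked=$(git ls-files -v)          # lowercase "h" flags the marked entry

# Clear the bit and the edit becomes visible again.
git update-index --no-assume-unchanged Makefile
visible=$(git status --porcelain)

printf 'while marked: [%s]\n' "$hidden"
printf 'ls-files -v:  [%s]\n' "$marked"
printf 'after unmark: [%s]\n' "$visible"
```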

Have You Been Here Before?

Ever have that feeling you’ve worked through a complex merge or rebase over and over again? Are you getting tired of it yet? Do you wish there was some way to automate it?

I thought so. And so did the Git developers!

Git has a feature named rerere that automates the chore of solving the same merge or rebase conflicts repeatedly. The seemingly alliterative name is a shortening of reuse recorded resolution. In a long development cycle, a branch may hold a line of development through many iterations before it is finally merged into the mainline, and along the way it may have to be rebased or moved through the same set of conflicts and resolutions many times.

To enable and use the git rerere command, you must first set the Boolean rerere.enabled option to true.

    $ git config --global rerere.enabled true

Once enabled, this feature records the right and left side of a merge conflict in the .git/rr-cache directory and, if resolved, also records the manual resolution to that conflict. If the same conflict is seen again, the automatic resolution engages and preemptively solves the conflict.

When rerere is enabled and participates in a merge, it will prevent autocommitting of the merge, giving the opportunity to review the automatic conflict resolution before making it a part of the commit history.

Rerere has only one prominent shortcoming: the nonportability of the .git/rr-cache directory. Conflict and resolution recording happens on a per-clone basis and is not transmitted in push or pull operations.
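To watch the record-and-replay cycle end to end, here is a sketch in a throwaway repository. The branch name topic, the file f, and its contents are all made up, and rerere is enabled for this repository only rather than globally. The script provokes a conflict, resolves it by hand, undoes the merge, and then merges again.

```shell
set -e
# Throwaway repository; branch, file, and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git config rerere.enabled true

echo base > f
git add f
git commit -q -m "base"
mainline=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo topic > f
git commit -q -a -m "topic change"
git checkout -q "$mainline"
echo mainline > f
git commit -q -a -m "mainline change"

# First merge conflicts; rerere records the conflict.
git merge topic || true
echo resolved > f                 # resolve by hand
git add f
git commit -q -m "Merge topic"    # rerere records the resolution

# Throw the merge away and try again.
git reset -q --hard HEAD^
git merge topic || true           # still stops, but f is already resolved
replayed=$(cat f)
echo "working copy of f now reads: $replayed"
```

The second merge still stops short of committing, which matches the behavior described above: the replayed resolution sits in the working tree for review before it becomes part of the history.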

^[44]No, that’s not a typo. See Have You Been Here Before?.

^[45]I both elided an obsolete name reference, and shortened the actual output lines for this example. Oh, and apparently I’m a closet Git artist!
