Now you have the basis for some sophisticated sharing with Git. Without loss of generality, and to make examples easy to run on your own system, this section shows multiple repositories on one physical machine. In real life, they’d probably be located on different hosts across the Internet. Other forms of remote URL specification may be used since the same mechanisms apply as well to repositories on physically disparate machines.
Let’s explore a common use case for Git. For the sake of illustration, let’s set up a repository that all developers consider authoritative, although technically it’s no different from other repositories. In other words, authority lies in how everyone agrees to treat the repository, not in some technical or security measure.
This agreed-on authoritative copy is often placed in a special directory known as a depot. (Avoid using the terms “master” or “repository” when referring to the depot, because those idioms mean something else in Git.)
There are often good reasons for setting up a depot. For instance, your organization may already back up the filesystems of some large server reliably and professionally, so a depot placed there is backed up as well. You want to encourage your coworkers to check everything into the main copy within the depot in order to avoid catastrophic losses. The depot will be the remote origin for all developers.
The following sections show how to place an initial repository in the depot, clone development repositories out of the depot, do development work within them, and then sync them with the depot.
To illustrate parallel development on this repository, a second developer will clone it, work with his repository, and then push his changes back into the depot for all to use.
You can place your authoritative depot anywhere on your filesystem; for this example, let’s use /tmp/Depot. No actual development work should be done directly in the /tmp/Depot directory or in any of its repositories. Instead, individual work should be performed in a local clone.
The first step is to populate /tmp/Depot with an initial repository. Assuming you want to work on website content that is already established as a Git repository in ~/public_html, make a copy of the ~/public_html repository and place it in /tmp/Depot/public_html.git:
# Assume that ~/public_html is already a Git repository
$ cd /tmp/Depot/
$ git clone --bare ~/public_html public_html.git
Initialized empty Git repository in /tmp/Depot/public_html.git/
This clone command copies the Git remote repository from ~/public_html into the current working directory, /tmp/Depot. The last argument gives the repository a new name, public_html.git. By convention, bare repositories are named with a .git suffix. This is not a requirement, but it is considered a best practice.
The original development repository has a full set of project files checked out at the top level, and the object store and all of the configuration files are located in the .git subdirectory:
$ cd ~/public_html/
$ ls -aF
./   fuzzy.txt  index.html  techinfo.txt
../  .git/      poem.html
$ ls -aF .git
./              config       hooks/  objects/
../             description  index   ORIG_HEAD
branches/       FETCH_HEAD   info/   packed-refs
COMMIT_EDITMSG  HEAD         logs/   refs/
Because a bare repository has no working directory, its files have a simpler layout:
$ cd /tmp/Depot/
$ ls -aF public_html.git
./   branches/  description  hooks/  objects/      refs/
../  config     HEAD         info/   packed-refs
You can now treat this bare /tmp/Depot/public_html.git repository as the authoritative version.
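The depot setup above can be tried without touching your real files. The following is a minimal sketch, assuming a throwaway directory from mktemp stands in for both ~/public_html and /tmp/Depot; the identity values and the file contents are placeholders, not part of the original example.

```shell
#!/bin/sh
# Sketch: create a small repository, then place a bare clone of it in a depot.
set -e

# Placeholder identity so the commit works on a machine with no Git config.
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)                      # stands in for $HOME and /tmp

# The initial development repository (stand-in for ~/public_html).
git init -q "$work/public_html"
cd "$work/public_html"
git symbolic-ref HEAD refs/heads/master   # use the branch name from the text
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"

# The depot holds a bare clone, named with the conventional .git suffix.
mkdir "$work/Depot"
cd "$work/Depot"
git clone -q --bare "$work/public_html" public_html.git

is_bare=$(git -C public_html.git rev-parse --is-bare-repository)
echo "bare: $is_bare"
```

A bare clone has HEAD, objects, and refs at its top level rather than inside a .git subdirectory, which is why the listing of public_html.git is so much shorter.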
Because you used the --bare option during this clone operation, Git did not introduce the normal, default origin remote.
Here’s the configuration in the new, bare repository:
# In /tmp/Depot/public_html.git
$ cat config
[core]
repositoryformatversion = 0
filemode = true
bare = true
Right now, you have two repositories that are virtually identical, except the initial repository has a working directory and the bare clone does not.
Moreover, because the ~/public_html repository in your home directory was created using git init and not via a clone, it lacks an origin. In fact, it has no remote configured at all.
It is easy enough to add one, though. And it’s needed if the goal is to perform more development in your initial repository and then push that development to the newly established, authoritative repository in the depot. In a sense, you must manually convert your initial repository into a derived clone.
A developer who clones from the depot will have an origin remote created automatically. In fact, if you were to turn around now and clone off the depot, you would see it set up for you automatically, too.
The command for manipulating remotes is git remote. This operation introduces a few new settings in the .git/config file:
$ cd ~/public_html
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
$ git remote add origin /tmp/Depot/public_html
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
url = /tmp/Depot/public_html
fetch = +refs/heads/*:refs/remotes/origin/*
Here, git remote added a new remote section called origin to our configuration. The name origin isn’t magical or special. You could have used any other name, but the remote that points back to the basis repository is named origin by convention.
The remote establishes a link from your current repository to the remote repository found, in this case, at /tmp/Depot/public_html.git. The url value records the path as it was typed, /tmp/Depot/public_html; for a local path, Git also tries the name with a .git suffix, so both spellings reach the same bare repository. Now, the name origin can be used as a shorthand reference for the remote repository found in the depot. Note that a default fetch refspec that follows the branch name mapping conventions has also been added.
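Instead of reading .git/config by eye, you can ask Git for the individual settings. This is a minimal sketch, assuming a temporary directory in place of ~/public_html and /tmp/Depot and a placeholder identity; it adds the remote and reads back the two values that git remote add wrote.

```shell
#!/bin/sh
# Sketch: add an origin remote by hand and inspect what was recorded.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/public_html"
cd "$work/public_html"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"

mkdir "$work/Depot"
git clone -q --bare "$work/public_html" "$work/Depot/public_html.git"

# A repository made with git init has no remotes until you add one.
git remote add origin "$work/Depot/public_html.git"

url=$(git config --get remote.origin.url)
fetch=$(git config --get remote.origin.fetch)
echo "url   = $url"
echo "fetch = $fetch"
```

The fetch value is the default refspec: every branch under refs/heads/ in the remote is mapped to a tracking branch under refs/remotes/origin/ in your repository.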
Let’s complete the process of setting up the origin remote by establishing new tracking branches in the original repository to represent the branches from the remote repository. First, you can see that there is only one branch, as expected, called master:
# List all branches
$ git branch -a
* master
Now, use git remote update:
$ git remote update
Updating origin
From /tmp/Depot/public_html
* [new branch] master -> origin/master
$ git branch -a
* master
origin/master
Git introduced a new branch called origin/master into the repository. It is a tracking branch within the origin remote. Nobody does development in this branch. Instead, its purpose is to hold and track the commits made in the remote origin repository’s master branch. You could consider it your local repository’s proxy for commits made in the remote; eventually you can use it to bring those commits into your repository.
The phrase Updating origin, produced by the git remote update, doesn’t mean that the remote repository was updated. Rather, it means that the local repository’s notion of the origin has been updated based on information brought in from the remote repository.
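To see that the update touches only your side, you can compare commit IDs before and after. This is a minimal sketch under the same assumptions as before (temporary directories and a placeholder identity, neither of which comes from the original example).

```shell
#!/bin/sh
# Sketch: git remote update changes the local tracking branch, not the depot.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/public_html"
cd "$work/public_html"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"

mkdir "$work/Depot"
depot="$work/Depot/public_html.git"
git clone -q --bare "$work/public_html" "$depot"
git remote add origin "$depot"

depot_before=$(git -C "$depot" rev-parse master)
git remote update >/dev/null 2>&1        # fetch from every configured remote
depot_after=$(git -C "$depot" rev-parse master)

tracking=$(git rev-parse origin/master)  # the newly created tracking branch
echo "depot master:  $depot_after"
echo "origin/master: $tracking"
```

The depot's master is the same before and after; what changed is that your repository now has origin/master pointing at that commit.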
The git remote update command caused every remote within this repository to be updated by checking for and then fetching any new commits from each repository named in a remote. Instead of generically updating all remotes, you can fetch from a single remote right away by supplying the -f option when the remote is initially added:
git remote add -f origin repository
Now you’re done linking your repository to the remote repository in your depot.
Let’s do some development work in the repository and add another poem, fuzzy.txt:
$ cd ~/public_html
$ git show-branch -a
[master] Merge branch 'master' of ../my_website
$ cat fuzzy.txt
Fuzzy Wuzzy was a bear
Fuzzy Wuzzy had no hair
Fuzzy Wuzzy wasn't very fuzzy,
Was he?
$ git add fuzzy.txt
$ git commit
Created commit 6f16880: Add a hairy poem.
 1 files changed, 4 insertions(+), 0 deletions(-)
 create mode 100644 fuzzy.txt
$ git show-branch -a
* [master] Add a hairy poem.
! [origin/master] Merge branch 'master' of ../my_website
--
* [master] Add a hairy poem.
-- [origin/master] Merge branch 'master' of ../my_website
At this point, your repository has one more commit than the repository in /tmp/Depot. Perhaps more interesting is that your repository has two branches, one (master) with the new commit on it and the other (origin/master) that is tracking the remote repository.
Any change that you commit is completely local to your repository; it is not yet present in the remote repository. A convenient way to get your commit into the remote repository is to use the git push command:
$ git push origin
Counting objects: 4, done.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 400 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To /tmp/Depot/public_html
0d4ce8a..6f16880 master -> master
All that output means that Git has taken your master branch changes, bundled them up, and sent them to the remote repository named origin. Git has also performed one more step here: it has taken those same changes and added them to the origin/master branch in your repository as well. In effect, Git has caused the changes that were originally on your master branch to be sent to the remote repository and then has requested that they be brought back onto the origin/master tracking branch as well.
Git doesn’t actually round-trip the changes. After all, the commits are already in your repository. Git is smart enough to instead simply fast-forward the tracking branch.
Now both local branches, master and origin/master, reflect the same commit within your repository:
$ git show-branch -a
* [master] Add a hairy poem.
! [origin/master] Add a hairy poem.
--
*+ [master] Add a hairy poem.
You can also probe the remote repository and verify that it, too, has been updated. If your remote repository is on a local filesystem, as it is here, you can easily check by going to the depot directory:
$ cd /tmp/Depot/public_html.git
$ git show-branch
[master] Add a hairy poem.
When the remote repository is on a physically different machine, a plumbing command can be used to determine the branch information of the remote repository:
# Go to the actual remote repo and query it
$ git ls-remote origin
6f168803f6f1b987dffd5fff77531dcadf7f4b68 HEAD
6f168803f6f1b987dffd5fff77531dcadf7f4b68 refs/heads/master
You can then show that those commit IDs match your current, local branches using something like git rev-parse HEAD or git show commit-id.
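The comparison can also be scripted. This is a minimal sketch, assuming temporary directories and a placeholder identity; it names the branch explicitly in git push origin master so that the result does not depend on your push defaults, then compares the local commit, the tracking branch, and what git ls-remote reports.

```shell
#!/bin/sh
# Sketch: push a commit, then confirm all three views agree on the commit ID.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/public_html"
cd "$work/public_html"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"

mkdir "$work/Depot"
git clone -q --bare "$work/public_html" "$work/Depot/public_html.git"
git remote add origin "$work/Depot/public_html.git"
git fetch -q origin

# New local work, then publish it to the depot.
echo "Fuzzy Wuzzy was a bear" > fuzzy.txt
git add fuzzy.txt
git commit -q -m "Add a hairy poem."
git push -q origin master

local_id=$(git rev-parse HEAD)
tracking_id=$(git rev-parse origin/master)
# ls-remote prints "<commit-id><TAB><ref>"; keep only the ID.
remote_id=$(git ls-remote origin refs/heads/master | cut -f1)

echo "local:    $local_id"
echo "tracking: $tracking_id"
echo "remote:   $remote_id"
```

All three lines should show the same ID: the push sent the commit to the depot and moved origin/master forward in your repository at the same time.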
Once you have established an authoritative repository, it is easy to add a new developer to a project simply by letting her clone the repository and begin working.
Let’s introduce Bob to the project by giving him his own cloned repository in which to now work:
$ cd /tmp/bob
$ git clone /tmp/Depot/public_html.git
Initialized empty Git repository in /tmp/bob/public_html/.git/
$ ls
public_html
$ cd public_html
$ ls
fuzzy.txt  index.html  poem.html  techinfo.txt
$ git branch
* master
$ git log -1
commit 6f168803f6f1b987dffd5fff77531dcadf7f4b68
Author: Jon Loeliger <[email protected]>
Date: Sun Sep 14 21:04:44 2008 -0500

    Add a hairy poem.
Immediately, you can see from ls that the clone has a working directory populated with all the files under version control. That is, Bob’s clone is a development repository and not a bare repository. Good. Bob will be doing some development, too.
From the git log output, you can see that the most recent commit is available in Bob’s repository. Additionally, since Bob’s repository was cloned from a parent repository, it has a default remote called origin. Bob can find out more information about the origin remote within his repository:
$ git remote show origin
* remote origin
URL: /tmp/Depot/public_html.git
Remote branch merged with 'git pull' while on branch master
master
Tracked remote branch
master
The complete contents of the configuration file after a default clone show how it contains the origin remote:
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[remote "origin"]
url = /tmp/Depot/public_html.git
fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
remote = origin
merge = refs/heads/master
In addition to having the origin remote in his repository, Bob also has a few branches. He can list all of the branches in his repository by using git branch -a:
$ git branch -a
* master
origin/HEAD
origin/master
The master branch is Bob’s main development branch. It is the normal, local topic branch. The origin/master branch is a tracking branch to follow the commits from the master branch of the origin repository. The origin/HEAD branch indicates through a symbolic name which branch the remote considers the active branch. Finally, the asterisk next to the master branch name indicates that it is the current, checked-out branch in his repository.
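What a default clone sets up for Bob can be read back one setting at a time. This is a minimal sketch, assuming temporary directories (including a stand-in for /tmp/bob) and a placeholder identity.

```shell
#!/bin/sh
# Sketch: inspect the remote and branch settings created by a default clone.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"
mkdir "$work/Depot" "$work/bob"
git clone -q --bare "$work/seed" "$work/Depot/public_html.git"

# Bob's clone: the directory name drops the .git suffix.
cd "$work/bob"
git clone -q "$work/Depot/public_html.git"
cd public_html

remote=$(git config --get branch.master.remote)
merge=$(git config --get branch.master.merge)
# origin/HEAD is a symbolic ref naming the remote's active branch.
head_target=$(git symbolic-ref refs/remotes/origin/HEAD)

echo "branch.master.remote = $remote"
echo "branch.master.merge  = $merge"
echo "origin/HEAD -> $head_target"
```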
Let’s have Bob make a commit that alters the hairy poem and then push that to the main depot repository. Bob thinks the last line of the poem should be “Wuzzy?”, makes this change, and commits it:
$ git diff
diff --git a/fuzzy.txt b/fuzzy.txt
index 0d601fa..608ab5b 100644
--- a/fuzzy.txt
+++ b/fuzzy.txt
@@ -1,4 +1,4 @@
 Fuzzy Wuzzy was a bear
 Fuzzy Wuzzy had no hair
 Fuzzy Wuzzy wasn't very fuzzy,
-Was he?
+Wuzzy?
$ git commit fuzzy.txt
Created commit 3958f68: Make the name pun complete!
 1 files changed, 1 insertions(+), 1 deletions(-)
To complete his development cycle, Bob pushes his changes to the depot, using git push as before:
$ git push
Counting objects: 5, done.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 377 bytes, done.
Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
To /tmp/Depot/public_html.git
6f16880..3958f68 master -> master
Let’s suppose that Bob goes on vacation and, in the meantime, you make further changes and push them to the depot repository. Let’s assume you did this after getting Bob’s latest changes.
Your commit looks like this:
$ cd ~/public_html
$ git diff
diff --git a/index.html b/index.html
index 40b00ff..063ac92 100644
--- a/index.html
+++ b/index.html
@@ -1,5 +1,7 @@
 <html>
 <body>
 My website is alive!
+<br/>
+Read a <a href="fuzzy.txt">hairy</a> poem!
 </body>
 <html>
$ git commit -m"Add a hairy poem link." index.html
Created commit 55c15c8: Add a hairy poem link.
 1 files changed, 2 insertions(+), 0 deletions(-)
Using the default push refspec, push your commit upstream:
$ git push
Counting objects: 5, done.
Compressing objects: 100% (3/3), done.
Unpacking objects: 100% (3/3), done.
Writing objects: 100% (3/3), 348 bytes, done.
Total 3 (delta 1), reused 0 (delta 0)
To /tmp/Depot/public_html
3958f68..55c15c8 master -> master
Now, when Bob returns he’ll want to refresh his clone of the repository. The primary command for doing this is git pull:
$ git pull
remote: Counting objects: 5, done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 3 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (3/3), done.
From /tmp/Depot/public_html
3958f68..55c15c8 master -> origin/master
Updating 3958f68..55c15c8
Fast forward
index.html | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
The fully specified git pull command allows both the repository and multiple refspecs to be specified:
git pull options repository refspecs
If the repository is not specified on the command line, either as a Git URL or indirectly through a remote name, the default remote origin is used. If you don’t specify a refspec on the command line, the fetch refspec of the remote is used. If you specify a repository (directly or using a remote) but no refspec, Git fetches the HEAD ref of the remote.
The git pull operation is fundamentally two steps, each implemented by a separate Git command. Namely, git pull implies git fetch, followed by either git merge or git rebase. By default, the second step is merge because this is almost always the desired behavior.
Before using the git pull --rebase mechanism, you should fully understand the history-altering effects of a rebase operation, as described in Chapter 10, and the implications for other users, as described in Chapter 12.
Because pull also performs the second merge or rebase step, git push and git pull are not considered opposites. Instead, git push and git fetch are considered opposites. Both push and fetch are responsible for transferring data between repositories, but in opposite directions.
Sometimes, you may want to execute the git fetch and git merge as two separate operations. For example, you may want to fetch updates into your repository to inspect them but not necessarily merge immediately. In this case, you can simply perform the fetch and then perform other operations on the tracking branch, such as git log, git diff, or even gitk. Later, when you are ready (if ever!), you may perform the merge at your convenience.
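Here is what the split workflow might look like in practice. This is a minimal sketch, assuming temporary directories, a placeholder identity, and two hypothetical clones named you and bob; Bob fetches, counts the incoming commits on the tracking branch, and only then merges.

```shell
#!/bin/sh
# Sketch: fetch first, inspect the tracking branch, merge later.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"
mkdir "$work/Depot"
git clone -q --bare "$work/seed" "$work/Depot/public_html.git"
git clone -q "$work/Depot/public_html.git" "$work/you"
git clone -q "$work/Depot/public_html.git" "$work/bob"

# You publish a new commit while Bob is away.
cd "$work/you"
echo "Read a hairy poem!" >> index.html
git commit -q -m "Add a hairy poem link." index.html
git push -q origin master

# Bob fetches only; his master branch is untouched so far.
cd "$work/bob"
git fetch -q origin
incoming=$(git rev-list --count master..origin/master)
echo "commits waiting on origin/master: $incoming"
git log --oneline master..origin/master

# When ready, Bob merges the tracking branch (a fast-forward here).
git merge -q --ff-only origin/master
remaining=$(git rev-list --count master..origin/master)
echo "commits still waiting: $remaining"
```

The range master..origin/master selects commits that are on the tracking branch but not yet on your own branch, which is exactly what a later merge would bring in.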
Even if you never separate the fetch and merge, you may do complex operations that require you to know what’s happening at each step. So let’s look at each one in detail.
In the first fetch step, Git locates the remote repository. Since the command line did not specify a direct repository URL or a direct remote name, it assumes the default remote name, origin. The information for that remote is in the configuration file:
[remote "origin"]
url = /tmp/Depot/public_html.git
fetch = +refs/heads/*:refs/remotes/origin/*
Git now knows to use the URL /tmp/Depot/public_html.git as the source repository.
Next, Git performs a protocol negotiation with the source repository to determine what new commits are in the remote repository and are absent from your repository, based on the desire to fetch all of the refs/heads/* refs as given in the fetch refspec.
You don’t have to fetch all of the topic branches from the remote repository using the refs/heads/* wildcard form. If you want only a particular branch or two, list them explicitly:
[remote "newdev"]
url = /tmp/Depot/public_html.git
fetch = +refs/heads/dev:refs/remotes/origin/dev
fetch = +refs/heads/stable:refs/remotes/origin/stable
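The same kind of configuration can be written with git config rather than a text editor. This is a minimal sketch, assuming temporary directories, a placeholder identity, and a depot that happens to have dev and stable branches. Unlike the listing above, this sketch stores the tracking branches under refs/remotes/newdev/ so that they match the remote's name; that mapping is a choice made for the illustration.

```shell
#!/bin/sh
# Sketch: fetch only two named branches by listing explicit refspecs.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"
git branch dev
git branch stable
mkdir "$work/Depot"
git clone -q --bare "$work/seed" "$work/Depot/public_html.git"

git init -q "$work/devrepo"
cd "$work/devrepo"
git remote add newdev "$work/Depot/public_html.git"
# Replace the default wildcard refspec, then add a second explicit one.
git config remote.newdev.fetch "+refs/heads/dev:refs/remotes/newdev/dev"
git config --add remote.newdev.fetch "+refs/heads/stable:refs/remotes/newdev/stable"
git fetch -q newdev

# List the tracking branches that the fetch created.
tracked=$(git for-each-ref --format='%(refname)' refs/remotes/newdev)
echo "$tracked"
```

The depot's master branch is not among the tracking branches, because no refspec names it.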
The output prefixed by remote: reflects the negotiation, compression, and transfer protocol, and it lets you know that new commits are coming into your repository.
Git places the new commits in your repository on an appropriate tracking branch and then tells you what mapping it uses to determine where the new commits belong:
From /tmp/Depot/public_html
3958f68..55c15c8 master -> origin/master
Those lines indicate that Git looked at the remote repository /tmp/Depot/public_html, took its master branch, brought its contents back to your repository, and placed them on your origin/master branch.
This process is the heart of branch tracking.
The corresponding commit IDs are also listed, just in case you want to inspect the changes directly. With that, the fetch step is finished.
In the second step of the pull operation, Git performs either a merge (the default) or a rebase. In this example, Git merges the contents of the tracking branch, origin/master, into your master branch using a special type of merge called a fast-forward.
But how did Git know to merge those particular branches? The answer comes from the configuration file:
[branch "master"]
remote = origin
merge = refs/heads/master
Paraphrased, this gives Git two key pieces of information:
When master is the current, checked-out branch, use origin as the default remote from which to fetch updates during a fetch (or pull). Further, during the merge step of git pull, use refs/heads/master from the remote as the default branch to merge into this, the master branch.
For readers paying close attention to detail, the first part of that paraphrase is the actual mechanism by which Git determines that origin should be the remote used during this parameterless git pull command.
The value of the merge field in the branch section of the configuration file (branch.*.merge) is treated like the remote part of a refspec, and it must match one of the source refs just fetched during the git pull command. It’s a little convoluted, but think of this as a hint conveyed from the fetch step to the merge step of a pull command.
Because the merge configuration value applies only during git pull, a manual application of git merge at this point must name the merge source branch on the command line. The branch is likely a tracking branch name, such as this:
# Or, fully specified: refs/remotes/origin/master
$ git merge origin/master
Updating 3958f68..55c15c8
Fast forward
index.html | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
And with that, the merge step is also done.
There are slight semantic differences between the merging behavior of branches when multiple refspecs are given on the command line and when they are found in a remote entry. The former causes an octopus merge, whereas the latter does not. Read the git pull manual page carefully!
In the same way that your master branch can be thought of as “extending” the development brought in on the origin/master branch, you can create a new branch based on any remote tracking branch and use it to “extend” that line of development.
If you create a new branch based on a remote tracking branch, Git automatically adds a branch entry to indicate that the tracking branch should be merged into your new branch:
# Create mydev based on origin/master
$ git branch mydev origin/master
Branch mydev set up to track remote branch refs/remotes/origin/master.
The preceding command causes Git to add the following configuration values for you:
[branch "mydev"]
remote = origin
merge = refs/heads/master
As usual, you may also use git config or a text editor to manipulate the branch entries in the configuration file.
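You can confirm the branch entries from the command line. This is a minimal sketch, assuming temporary directories and a placeholder identity; it passes --track explicitly so that the outcome does not depend on how branch.autoSetupMerge is set on your machine.

```shell
#!/bin/sh
# Sketch: create a branch from a tracking branch and read its configuration.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"
mkdir "$work/Depot"
git clone -q --bare "$work/seed" "$work/Depot/public_html.git"
git clone -q "$work/Depot/public_html.git" "$work/clone"
cd "$work/clone"

# Create mydev based on origin/master and record the tracking relationship.
git branch --track mydev origin/master >/dev/null

remote=$(git config --get branch.mydev.remote)
merge=$(git config --get branch.mydev.merge)
echo "branch.mydev.remote = $remote"
echo "branch.mydev.merge  = $merge"
```

Note that the merge value names the branch as it is known in the remote (refs/heads/master), not the local tracking branch.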
With the merge value established, your development branch is configured to readily accommodate your commits from this repository and to merge in changes from the corresponding tracking branch.
If you choose to rebase rather than merge, Git will instead forward-port the changes on your topic branch to the newly fetched HEAD of the corresponding remote tracking branch. The operation is the same as that shown in Figures 10-12 and 10-13.
To make rebase the normal operation for a branch, set the rebase configuration variable to true:
[branch "mydev"]
remote = origin
merge = refs/heads/master
rebase = true
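To see the effect of the rebase setting, compare the history after a pull. This is a minimal sketch, assuming temporary directories, a placeholder identity, and two hypothetical clones named you and bob; it sets the variable on Bob's master branch instead of mydev to keep the setup short.

```shell
#!/bin/sh
# Sketch: with branch.<name>.rebase = true, git pull rebases instead of merging.
set -e
export GIT_AUTHOR_NAME="Example Dev"
export GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev"
export GIT_COMMITTER_EMAIL="dev@example.com"

work=$(mktemp -d)
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "My website is alive!" > index.html
git add index.html
git commit -q -m "Initial contents"
mkdir "$work/Depot"
git clone -q --bare "$work/seed" "$work/Depot/public_html.git"
git clone -q "$work/Depot/public_html.git" "$work/you"
git clone -q "$work/Depot/public_html.git" "$work/bob"

# You publish a commit to the depot.
cd "$work/you"
echo "Read a hairy poem!" >> index.html
git commit -q -m "Add a hairy poem link." index.html
git push -q origin master

# Bob has an unpublished commit of his own, in a different file.
cd "$work/bob"
git config branch.master.rebase true
echo "Wuzzy?" > fuzzy.txt
git add fuzzy.txt
git commit -q -m "Make the name pun complete!"

git pull -q >/dev/null 2>&1     # fetch, then rebase Bob's commit

merges=$(git rev-list --merges --count HEAD)
parent=$(git rev-parse HEAD^)
upstream=$(git rev-parse origin/master)
echo "merge commits in history: $merges"
echo "Bob's commit sits on:     $parent"
echo "origin/master is:         $upstream"
```

Bob's commit ends up directly on top of the newly fetched origin/master, and no merge commit is created.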