Appendix B: Embedded Git in Your Applications

Search in book...
Toggle Font Controls
Create new playlist

Name your new playlist

Playlist description (optional)
Sign In

Email address

Password

Forgot Password?

or

Continue with Facebook

Continue with Google
Sign Up

Full Name

Email address

Confirm Email Address

Password

or

Continue with Facebook

Continue with Google

APPENDIX B

Embedded Git in Your Applications

If your application is for developers, chances are good that it could benefit from integration with source control. Even non-developer applications, such as document editors, could potentially benefit from version-control features and Git’s model works very well for many different scenarios.

If you need to integrate Git with your application, you have essentially three choices: spawning a shell and using the Git command-line tool, Libgit2, and JGit.

Command-line Git

One option is to spawn a shell process and use the Git command-line tool to do the work. This has the benefit of being canonical, and all Git’s features are supported. This also happens to be fairly easy, as most runtime environments have a relatively simple facility for invoking a process with command-line arguments. However, this approach does have some downsides.

One is that all the output is in plain text. This means that you’ll have to parse Git’s occasionally changing output format to read progress and result information, which can be inefficient and error-prone.

Another is the lack of error recovery. If a repository is corrupted somehow, or the user has a malformed configuration value, Git will simply refuse to perform many operations.

Yet another is process management. Git requires you to maintain a shell environment on a separate process, which can add unwanted complexity. Trying to coordinate many of these processes (especially when potentially accessing the same repository from several processes) can be quite a challenge.

Libgit2

Another option at your disposal is to use Libgit2. Libgit2 is a dependency-free implementation of Git, with a focus on having a nice API for use within other programs. You can find it at http://libgit2.github.com.

First, let’s take a look at what the C API looks like. Here’s a whirlwind tour:

// Open a repository
git_repository *repo;
int error = git_repository_open(&repo, "/path/to/repository");

// Dereference HEAD to a commit
git_object *head_commit;
error = git_revparse_single(&head_commit, repo, "HEAD^{commit}");
git_commit *commit = (git_commit*)head_commit;

// Print some of the commit's properties
printf("%s", git_commit_message(commit));
const git_signature *author = git_commit_author(commit);
printf("%s <%s>
", author->name, author->email);
const git_oid *tree_id = git_commit_tree_id(commit);

// Cleanup
git_commit_free(commit);
git_object_free(head_commit);
git_repository_free(repo);

The first couple of lines open a Git repository. The git_repository type represents a handle to a repository with a cache in memory. This is the simplest method, for when you know the exact path to a repository’s working directory or .git folder. There’s also the git_repository_open_ext, which includes options for searching, git_clone and friends for making a local clone of a remote repository, and git_repository_init for creating an entirely new repository.

The second chunk of code uses rev-parse syntax (see “Branch References” in Chapter 7 for more on this) to get the commit that HEAD eventually points to. The type returned is a git_object pointer, which represents something that exists in the Git object database for a repository. git_object is actually a “parent” type for several different kinds of objects; the memory layout for each of the “child” types is the same as for git_object, so you can safely cast to the right one. In this case, git_object_type(commit) would return GIT_OBJ_COMMIT, so it’s safe to cast to a git_commit pointer.

The next chunk shows how to access the commit’s properties. The last line here uses a git_oid type; this is Libgit2’s representation for a SHA-1 hash.

From this sample, a couple of patterns have started to emerge:

If you declare a pointer and pass a reference to it into a Libgit2 call, that call will probably return an integer error code. A 0 value indicates success; anything less is an error.
If Libgit2 populates a pointer for you, you’re responsible for freeing it.
If Libgit2 returns a const pointer from a call, you don’t have to free it, but it will become invalid when the object it belongs to is freed.
Writing C is a bit painful.

That last one means it isn’t very probable that you’ll be writing C when using Libgit2. Fortunately, there are a number of language-specific bindings available that make it fairly easy to work with Git repositories from your specific language and environment. Let’s take a look at the preceding example written using the Ruby bindings for Libgit2, which are named Rugged, and can be found at https://github.com/libgit2/rugged.

repo = Rugged::Repository.new('path/to/repository')
commit = repo.head.target
puts commit.message
puts "#{commit.author[:name]} <#{commit.author[:email]}>"
tree = commit.tree

As you can see, the code is much less cluttered. Firstly, Rugged uses exceptions; it can raise things like ConfigError or ObjectError to signal error conditions. Secondly, there’s no explicit freeing of resources, because Ruby is garbage-collected. Let’s take a look at a slightly more complicated example: crafting a commit from scratch:

blob_id = repo.write("Blob contents", :blob) (1)

index = repo.index
index.read_tree(repo.head.target.tree)
index.add(:path => 'newfile.txt', :oid => blob_id) (2)

sig = {
    :email => "[email protected]",
    :name => "Bob User",
    :time => Time.now,
}

commit_id = Rugged::Commit.create(repo,
    :tree => index.write_tree(repo), (3)
    :author => sig,
    :committer => sig, (4)
    :message => "Add newfile.txt", (5)
    :parents => repo.empty? ? [] : [repo.head.target].compact, (6)
    :update_ref => 'HEAD', (7)
)

commit = repo.lookup(commit_id) (8)

Create a new blob, which contains the contents of a new file.
Populate the index with the head commit’s tree, and add the new file at the path newfile.txt.
This creates a new tree in the ODB, and uses it for the new commit.
We use the same signature for both the author and committer fields.
The commit message.
When creating a commit, you have to specify the new commit’s parents. This uses the tip of HEAD for the single parent.
Rugged (and Libgit2) can optionally update a reference when making a commit.
The return value is the SHA-1 hash of a new commit object, which you can then use to get a commit object.

The Ruby code is nice and clean, but because Libgit2 is doing the heavy lifting, this code will run pretty fast, too. If you’re not a rubyist, we touch on some other bindings in the section "Other Bindings."

Advanced Functionality

Libgit2 has a couple of capabilities that are outside the scope of core Git. One example is pluggability: Libgit2 allows you to provide custom “backends” for several types of operation, so you can store things in a different way than stock Git does. Libgit2 allows custom backends for configuration, ref storage, and the object database, among other things.

Let’s take a look at how this works. The code that follows is borrowed from the set of backend examples provided by the Libgit2 team (which can be found at https://github.com/libgit2/libgit2-backends). Here’s how a custom backend for the object database is set up:

git_odb *odb;
int error = git_odb_new(&odb); (1)

git_repository *repo;
error = git_repository_wrap_odb(&repo, odb); (2)

git_odb_backend *my_backend;
error = git_odb_backend_mine(&my_backend, /*...*/); (3)

error = git_odb_add_backendodb, my_backend, 1); (4)

Note Errors are captured, but not handled. We hope your code is better than ours.

Initialize an empty object database (ODB) “frontend,” which will act as a handle to the real ODB.
Construct a git_repository around the empty ODB.
Initialize a custom ODB backend
Set the repository to use the custom backend for its ODB.

But what is this git_odb_backend_mine thing? Well, that’s your own ODB implementation, and you can do whatever you want in there, so long as you fill in the git_odb_backend structure properly. Here’s what it could look like:

typedef struct {
    git_odb_backend parent;

    // Some other stuff
    void *custom_context;
} my_backend_struct;

int git_odb_backend_mine(git_odb_backend **backend_out, /*...*/)
{
    my_backend_struct *backend;

    backend = calloc(1, sizeof (my_backend_struct));

    backend->custom_context = ...;

    backend->parent.read = &my_backend__read;
    backend->parent.read_prefix = &my_backend__read_prefix;
    backend->parent.read_header = &my_backend__read_header;
    // ...

    *backend_out = (git_odb_backend *) backend;

    return GIT_SUCCESS;
}

The subtlest constraint here is that my_backend_struct's first member must be a git_odb_backend structure; this ensures that the memory layout is what the Libgit2 code expects it to be. The rest of it is arbitrary; this structure can be as large or small as you need it to be.

The initialization function allocates some memory for the structure, sets up the custom context, and then fills in the members of the parent structure that it supports. Take a look at the include/git2/sys/odb_backend.h file in the Libgit2 source for a complete set of call signatures; your particular use case will help determine which of these you’ll want to support.

Other Bindings

Libgit2 has bindings for many languages. Here we show a small example using a few of the more complete bindings packages as of this writing; libraries exist for many other languages, including C++, Go, Node.js, Erlang, and the JVM, all in various stages of maturity. The official collection of bindings can be found by browsing the repositories at https://github.com/libgit2. The code we’ll write will return the commit message from the commit eventually pointed to by HEAD (sort of like git log -1).

LibGit2Sharp

If you’re writing a .NET or Mono application, LibGit2Sharp (https://github.com/libgit2/libgit2sharp) is what you’re looking for. The bindings are written in C#, and great care has been taken to wrap the raw Libgit2 calls with native-feeling CLR APIs. Here’s what our example program looks like:

new Repository(@"C:path	o
epo").Head.Tip.Message;

For desktop Windows applications, there’s even a NuGet package that will help you get started quickly.

objective-git

If your application is running on an Apple platform, you’re likely using Objective-C as your implementation language. Objective-Git (https://github.com/libgit2/objective-git) is the name of the Libgit2 bindings for that environment. The example program looks like this:

GTRepository *repo =
    [[GTRepository alloc] initWithURL:[NSURL fileURLWithPath: @"/path/to/repo"] error:NULL];
NSString *msg = [[[repo headReferenceWithError:NULL] resolvedTarget] message];

Objective-git is fully interoperable with Swift, so don’t fear if you’ve left Objective-C behind.

pygit2

The bindings for Libgit2 in Python are called Pygit2, and can be found at http://www.pygit2.org/. Our example program:

pygit2.Repository("/path/to/repo") # open repository
    .head.resolve()                # get a direct ref
    .get_object().message          # get commit, read message

Table of Contents for Appendix B: Embedded Git in Your Applications

Create new playlist

Sign In

Sign Up

Table of Contents for
Appendix B: Embedded Git in Your Applications