Chapter 14

Project Setup

Once you have decided to use Git, the first step is to decide which files and directories belong in a Git repository. It is important to decide whether the project should be versioned in one repository or in several. Since Git can only create a branch or a tag for the whole repository, the decision depends largely on the release units of your project.

After the project division is done, you must create a repository for each module and fill it. Empty directories and files that are not to be versioned need special treatment.

When working in a team, you must define one repository per module as the central repository. All developers use this central repository to fetch the current status and to publish their changes.

You have to decide how all developers in the team can access the central repository. Git supports access via a shared network drive, through a web server, via Git's own network protocol, and via Secure Shell (SSH).

Which protocol you choose depends on the existing infrastructure, local distribution, and the requirements relating to the administration of rights.

This workflow describes the following.

  • how a project directory is converted into a repository,
  • how empty directories are versioned,
  • how to deal with the end-of-line problem,
  • which access options are available to a central repository and how this is set up, and
  • how team members access the central repository.

Overview

This workflow is made up of two parts. In the first step, a repository for a project directory is created. In the second step, a central repository is made available to all developers.

Figure 14.1 shows how a project named projecta is transferred to a repository. Pay special attention to the empty directory named EmptyDir, because empty directories are not usually versioned by Git. You can force Git to version an empty directory by creating a file in it, such as .gitignore.

Likewise, during the initial commit you should make sure you do not version files that are not supposed to be in the repository, such as build results or temporary files. In the example you can see this in the TempDir directory. Backup files are stored in it even though this directory should not be versioned. To exclude this directory in future commits, you should create a .gitignore file in the root directory of the project and specify directories to be ignored in this file.

In the second step the new repository will be made available to other developers. In this case, Git supports different protocol variants:

  • file: Access via a shared network drive
  • git: Git's own server service with network communication
  • http: Access through a web server
  • ssh: Access via a secure shell infrastructure

In Git, multiple access points for the same repository can be deployed in parallel. For instance, it is common to configure HTTP access for anonymous reading and SSH access for writing.

Requirements

  • Shared server: For collaboration with Git, one of the following must already exist: a shared network drive, a server computer that can start a service, a web server with CGI support or an SSH infrastructure.
  • Assignment of rights at the project level: Git only recognizes read and write rights for the entire repository, i.e. it cannot assign fine-grained rights to individual directories.

Compact Workflow: Setting Up A Project

A project directory is imported into a new repository. This repository is provided as a central repository for the development team.

Figure 14.1: Workflow overview

Process and Implementation

The following procedures use the simple sample project projecta in Figure 14.1.

Create A New Repository from the Project Directory

This section shows how you can create a bare repository for an existing project. A bare repository is a prerequisite for sharing the repository with the team later.

The starting point is a directory in the file system that is gradually transformed into the finished bare repository.

Step 1: Prepare Empty Directories

Git is basically a content tracker: it can efficiently manage versions of files of different types. By contrast, directories are considered only structuring units and are versioned only in conjunction with files.

Empty directories are not relevant to Git and cannot be added to a commit with the add command.

As long as the development environment is not dependent on the empty directories, you can just ignore them. Sometimes you cannot delete these empty directories because some development environments and tools assume the existence of these directories and they will complain if the empty directories are missing.

By adding a file to an empty directory, you can force Git to version the empty directory. Theoretically, you can add any file as long as it does not mean anything to the development environment. Adding a .gitignore or .gitkeep file should be okay.

As an example, you should do this to the EmptyDir directory in Figure 14.1. Using the Unix touch command, you can create an empty file in EmptyDir.

> cd projecta/EmptyDir

> touch .gitignore

Files whose names start with a dot are hidden files in Unix and ignored by many development environments.
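The effect of the placeholder file can be checked in a scratch directory. The following is a minimal sketch, assuming a throwaway project created under a temporary directory; the directory names mirror Figure 14.1, and the .gitkeep file name is only a convention, not something Git treats specially.

```shell
#!/bin/sh
set -eu

# Throwaway project in a temporary directory (hypothetical location).
work=$(mktemp -d)
mkdir -p "$work/projecta/EmptyDir"
cd "$work/projecta"
git init -q

# With only an empty directory present, Git reports nothing to track.
before=$(git status --porcelain)

# A placeholder file makes the directory visible to Git.
touch EmptyDir/.gitkeep
git add EmptyDir/.gitkeep
tracked=$(git ls-files)

echo "status before placeholder: '$before'"
echo "tracked after placeholder: $tracked"
```

The status output stays empty as long as EmptyDir contains no file, which is why the placeholder is needed at all.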

Temporary files (such as build results) are often added to an empty directory. To prevent these files from being included by mistake in a commit, you can create a .gitignore file in the directory and insert a line with an asterisk (“*”) to indicate to Git that all files in the directory must be ignored and should not show up as “untracked” when the status command is called.

The following Unix echo command creates a new .gitignore file and writes “*” into it.

> echo "*" > .gitignore 
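One detail is worth checking here: the pattern “*” also matches the .gitignore file itself, so a plain add command will skip it and the directory will still not be versioned. A commonly used variant, sketched below under the assumption of a throwaway project in a temporary directory, adds a second line that exempts the .gitignore file. The file name build.tmp is a made-up stand-in for a build result.

```shell
#!/bin/sh
set -eu

# Throwaway project in a temporary directory (hypothetical location).
work=$(mktemp -d)
mkdir -p "$work/projecta/EmptyDir"
cd "$work/projecta"
git init -q

# Simulate a build result landing in the otherwise empty directory.
echo "temporary" > EmptyDir/build.tmp

# Ignore everything in the directory except the .gitignore file itself.
printf '*\n!.gitignore\n' > EmptyDir/.gitignore

git add .
tracked=$(git ls-files)
echo "$tracked"
```

With this variant, the directory is kept in the repository through its .gitignore file, while everything else placed in it stays out of commits.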

Step 2: Ignore unnecessary files and directories

Development and build tools often create temporary files, such as class files in Java. These files should not be versioned. To prevent temporary files from being versioned, create a file named .gitignore and list all unwanted files and directories in it.

A .gitignore file can be created in any directory. Its entries apply to that directory and all of its subdirectories.

The following is the content of a .gitignore file in the example in Figure 14.1. Each line specifies a pattern for a file name that should be ignored. In this case, the TempDir directory and all files with .bak extension should be ignored.

# Content of .gitignore
/TempDir
*.bak

To easily keep track of what files are ignored, create a .gitignore file only in the root of the project. Even deeper subdirectories can be excluded from that file. The only exceptions are the .gitignore files in empty directories where the files are only there to force Git to version the directories.
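Whether the patterns behave as intended can be verified with the check-ignore command before the first commit. The sketch below assumes a throwaway copy of the example project; the file names old.txt, Main.java and Main.java.bak are invented for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway project in a temporary directory (hypothetical location).
work=$(mktemp -d)
mkdir -p "$work/projecta/TempDir" "$work/projecta/src"
cd "$work/projecta"
git init -q

# Same patterns as in the example above.
printf '/TempDir\n*.bak\n' > .gitignore

echo "backup" > TempDir/old.txt
echo "backup" > src/Main.java.bak
echo "class Main {}" > src/Main.java

# check-ignore prints every given path that matches an ignore pattern.
ignored=$(git check-ignore TempDir/old.txt src/Main.java.bak src/Main.java || true)
echo "$ignored"
```

Only the paths under TempDir and the .bak file should be listed; src/Main.java is not matched by any pattern and remains a candidate for the commit.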

Step 3: Creating a repository

Now that the project files to be imported have been prepared in the previous steps, you can create the repository.

> cd projecta 

> git init

Step 4: Define treatment of line endings

Prior to actually importing files, you need to decide how to deal with line endings in text files.

Problems with line endings always occur when you develop simultaneously on different operating systems or when you use text files in different operating systems.

Windows uses CRLF (Carriage Return and Line Feed) to encode line breaks. Unix systems and Mac computers use LF (Line Feed) for line breaks. Most text editors can handle the line breaks of other platforms, so this part of the problem is largely solved.

However, a text editor may adjust the line breaks to its own platform, with or without the user’s knowledge. This in turn means that Git recognizes a line as changed even though its content did not change. You can well imagine how many merge conflicts arise from it.

Git provides a solution to the problem by standardizing line breaks in the repository as LF. When standardization is enabled, Git converts all line endings to LF on every commit and, if desired, converts them back to the platform-dependent default on checkout.

There are three different ways of dealing with line endings:

  • core.autocrlf false: Line endings are left alone. Git stores line endings in the repository as they are present in the files. When the files are checked out, line endings also remain unchanged.
  • core.autocrlf true: Line endings are standardized (to LF) on commit and converted to CRLF on checkout.
  • core.autocrlf input: Line endings are standardized (to LF) on commit but are not converted on checkout.

Since you usually cannot prevent a repository from being used in the future on other platforms, it makes sense to work from the outset with standardized line breaks.

Therefore, before the first import, core.autocrlf is set to true on Windows systems and to input on Unix systems. Note that setting core.autocrlf to true or input can be problematic if Git identifies a file as a text file when it is in fact a binary file. Use the .gitattributes file to override auto-detection.

Here is how to set core.autocrlf to input.

> git config --global core.autocrlf input 
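Overriding the auto-detection with a .gitattributes file could look like the following sketch. The rules shown (automatic detection for everything, PNG and JAR files marked as binary) are assumptions chosen for illustration, as are the file names logo.png and README.txt; the check-attr command reports which value of the text attribute applies to a path.

```shell
#!/bin/sh
set -eu

# Throwaway repository in a temporary directory (hypothetical location).
work=$(mktemp -d)
cd "$work"
git init -q

# Illustrative rules: detect text files automatically,
# but never convert line endings in PNG or JAR files.
cat > .gitattributes <<'EOF'
* text=auto
*.png binary
*.jar binary
EOF

# The paths do not need to exist for check-attr to evaluate the rules.
png_attr=$(git check-attr text -- logo.png)
txt_attr=$(git check-attr text -- README.txt)
echo "$png_attr"
echo "$txt_attr"
```

Because the .gitattributes file is versioned together with the project, these rules travel with the repository, unlike the core.autocrlf setting, which each developer configures locally.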

Step 5: Import files

Next, add all files for the first commit using the add command. All existing files, including the added .gitignore files, are then staged for the commit, and the ignored files are left out.

Before you issue the add command, it is recommended to use the status command once more to check which files are reported as “untracked”. It is easy to overlook a temporary file or directory and add it to the repository unintentionally.

> git status 

> git add . 

The import is completed with the commit command.

> git commit -m "init" 

Step 6: Create a bare repository

So far, we have created a normal repository with a workspace for the new project. To work in a team with a central repository using pull and push commands, the repository needs to be converted to a bare repository without a workspace. A bare repository consists only of the contents of the .git directory.

The conversion is done using the clone command and the --bare parameter. Bare repositories typically have the ending .git, to distinguish them from normal repositories.

> git clone --bare projecta projecta.git 

The --bare option causes the clone to have no workspace and to include only the objects in the repository. The projecta parameter is the name of the repository to be cloned. The projecta.git parameter is the name of the bare repository to be created.
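The result of the conversion can be verified with the rev-parse command. The following sketch replays steps 3, 5 and 6 in a temporary directory; the README.txt file and the identity set through environment variables are placeholders that only make the script self-contained.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that the commit works in a fresh environment.
export GIT_AUTHOR_NAME="Example Developer" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Throwaway project in a temporary directory (hypothetical location).
work=$(mktemp -d)
mkdir "$work/projecta"
cd "$work/projecta"
git init -q
echo "hello" > README.txt
git add .
git commit -q -m "init"

# Convert the normal repository into a bare one.
cd "$work"
git clone -q --bare projecta projecta.git

is_bare=$(git -C projecta.git rev-parse --is-bare-repository)
commits=$(git -C projecta.git rev-list --count HEAD)
echo "bare: $is_bare, commits: $commits"
```

The bare repository contains the same history as the original but has no .git subdirectory and no checked-out files.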

Sharing A Repository via File Access

This section describes how a bare repository can be shared using a shared network drive.

Step 1: Copy the bare repository

After a bare repository with the project files has been created, it can be easily stored on a network drive that is accessible to all.

> cp -R projecta.git /shared/gitrepos/. 

In this example, we assume that the /shared/gitrepos directory is a network drive.

Step 2: Clone the central repository

When cloning a repository that has been shared over a network drive, simply specify the path to the central bare repository.

> git clone /shared/gitrepos/projecta.git 

The path can also be specified using the file:// prefix.

> git clone file:///shared/gitrepos/projecta.git
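The complete round trip over file access can be tried out locally. In the sketch below, a temporary directory stands in for the network drive, and the clone directory name alice as well as the identity values are invented for illustration.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that the commits work in a fresh environment.
export GIT_AUTHOR_NAME="Example Developer" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
shared="$work/shared/gitrepos"   # stands in for the network drive
mkdir -p "$shared"

# Seed the central bare repository with an initial commit.
git init -q "$work/projecta"
cd "$work/projecta"
echo "hello" > README.txt
git add .
git commit -q -m "init"
git clone -q --bare "$work/projecta" "$shared/projecta.git"

# A team member clones via the file:// prefix, commits and pushes.
git clone -q "file://$shared/projecta.git" "$work/alice"
cd "$work/alice"
echo "change" >> README.txt
git commit -q -am "Update README"
git push -q origin HEAD

central_head=$(git -C "$shared/projecta.git" rev-parse HEAD)
local_head=$(git rev-parse HEAD)
echo "central: $central_head"
echo "local:   $local_head"
```

After the push, both commit IDs should be identical, which shows that the central repository has received the change.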

Step 3: Manage the read and write access

Read and write access to the repository is managed through the file system. Each team member who is to have read access needs read permission on the bare repository directory. The same applies to write permission.

Advantage and disadvantage

The advantage is that many corporate environments already have shared network drives as part of a shared file system, which makes this the easiest option for a central repository.

The disadvantage is that this option is difficult to set up if you work in a different location than the central repository. Also, data access is not the most efficient, because the remote Git commands (push, fetch and pull) must always work directly on remote data. In the following three server variants, by contrast, Git can run the remote commands on the server and only needs to transmit the result to the local machine.

Sharing A Repository Using the Git Daemon

The standard Git installation includes a built-in server service that provides access to the repository via a simple network protocol.

Note that the Git daemon is available on Windows only as of Git version 1.7.4.

Step 1: Enable the bare repository for the Git daemon

For the Git daemon to export a repository, a file named git-daemon-export-ok must be created in the root directory of the bare repository. The file can be empty and is only there to tell Git that it is okay to serve the project without authentication.

> cd projecta.git

> touch git-daemon-export-ok
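If several bare repositories are stored side by side, each of them needs its own marker file. A minimal sketch, assuming that all repositories to be exported lie directly under one base directory and end in .git; the temporary directory and the second repository projectb.git are hypothetical.

```shell
#!/bin/sh
set -eu

base=$(mktemp -d)   # stands in for /shared/gitrepos
git init -q --bare "$base/projecta.git"
git init -q --bare "$base/projectb.git"   # hypothetical second repository

# Mark every bare repository under the base directory for export.
for repo in "$base"/*.git; do
    touch "$repo/git-daemon-export-ok"
done

marked=$(ls "$base"/*.git/git-daemon-export-ok | wc -l | tr -d ' ')
echo "repositories marked for export: $marked"
```

Repositories without the marker file remain invisible to the daemon, so the loop should only cover directories that are really meant to be public.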

Step 2: Start the git daemon

You start the Git daemon by using the daemon command.

> git daemon 

Afterward, you can access all the repositories on the current computer that are approved for export. For this purpose, the full path to the repository must be specified in the Git URL.

Here is an example URL:

git://server-42/shared/gitrepos/projecta.git 

The prefix git: indicates that the Git daemon must be used as the protocol. This is followed by the computer name (server-42) and the path to the directory (/shared/gitrepos/projecta.git) that is the location of the repository.

In order to make the URL not so dependent on a specific directory, it is often useful to specify a base path. This can be done using the --base-path parameter.

> git daemon --base-path=/shared/gitrepos

Now you can access the repository through git://server-42/projecta.git.

By default, the daemon command only exports a repository for reading. To enable write access to a repository, use the --enable=receive-pack parameter.

> git daemon --base-path=/shared/gitrepos --enable=receive-pack 

The Git daemon can also be configured as a service in the operating system. For more details, see the documentation for the daemon command.

Step 3: Clone the central repository

When you clone a repository which is released via the daemon, just type in the URL to the central bare repository.

> git clone git://server-42/projecta.git

Step 4: Manage read and write access rights

The read and write access rights cannot be defined separately for individual developers in this variant. That is, each repository that was released for export can be read by anyone who has access to the computer.

If the Git daemon is started with write access enabled, anyone can also change all the exported repositories.

Advantage and disadvantage

Advantage: The Git daemon provides the most efficient and fastest data transfer to and from the central repository.

Disadvantage: The Git daemon cannot authenticate users, so it cannot be used in environments where read and write access to repositories must be restricted.

Another disadvantage: In distributed teams, the firewall can be a problem, since the Git daemon requires its own open port (9418 by default).

Sharing A Repository via HTTP

The standard Git installation provides a CGI script that allows access to repositories through a web server. The CGI script is available as of Git version 1.6.6. Before that, it was possible to access a repository via HTTP, but the “old” protocol was very inefficient and slow.

As an example, the following describes the integration of the CGI script in an Apache2 infrastructure.

Apache2 is typically configured through a file called httpd.conf. The following describes what changes need to be made to the Apache2 configuration file. For details and background information, please read the Apache2 documentation.

Step 1: Enable Apache2 modules

CGI scripts can only be integrated with Apache2 if the mod_cgi module is enabled. In addition, for Git integration, you also need the mod_alias and the mod_env modules. You must enable these modules if they are not yet enabled.

Note that the exact paths in the following example depend on the Apache2 installation and the operating system.

LoadModule cgi_module libexec/apache2/mod_cgi.so
LoadModule alias_module libexec/apache2/mod_alias.so
LoadModule env_module libexec/apache2/mod_env.so

Step 2: Allow access to the CGI script

A typical Apache2 installation restricts access to the web server to certain directories in the file system. If you want to use the CGI script directly from the installation directory of Git, this directory needs to be enabled for access.

In this example, the CGI script is located in the /usr/local/git/libexec/git-core directory. The following snippet allows Apache2 to call the CGI script from there:

<Directory "/usr/local/git/libexec/git-core">
    AllowOverride None 
    Options None 
    Order allow,deny 
    Allow from all 
</Directory>

Attention! It is important to ensure that the user account under which the Apache2 server runs has read and execute permissions on the CGI script.

Step 3: Allow access to the repository via HTTP

In order for the CGI script to export a repository, a file named git-daemon-export-ok must be created in the root directory of the bare repository. The file can be empty and is only there to tell Git that it is okay to serve the project without authentication.

> cd /shared/gitrepos/projecta.git 

> touch git-daemon-export-ok

Attention! It is important to ensure that the Apache2 server has read and write access to the repository directory and all its files and subdirectories.

Now specify, in the httpd.conf file, the root directory that contains the repositories to be exported. In this example it is the /shared/gitrepos/ directory.

SetEnv GIT_PROJECT_ROOT /shared/gitrepos 

Finally, you have to set up an alias for the CGI script. In this case it is /git.

ScriptAlias /git/ /usr/local/git/libexec/git-core/git-http-backend/ 

After you restart Apache2, access to all repositories under /shared/gitrepos/ will be allowed.
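Before involving Apache2, the CGI script can be exercised directly from the command line by supplying the CGI variables that the web server would normally set. The following is a sketch under that assumption: it simulates a GET request for /projecta.git/info/refs?service=git-upload-pack against a throwaway repository root in a temporary directory. The identity values and file names are placeholders.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that the commit works in a fresh environment.
export GIT_AUTHOR_NAME="Example Developer" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)
root="$work/gitrepos"   # stands in for /shared/gitrepos
mkdir -p "$root"

# Create a bare repository with one commit and approve it for export.
git init -q "$work/projecta"
cd "$work/projecta"
echo "hello" > README.txt
git add .
git commit -q -m "init"
git clone -q --bare "$work/projecta" "$root/projecta.git"
touch "$root/projecta.git/git-daemon-export-ok"

# Call the CGI script the way a web server would for the first
# request of a clone or fetch.
response="$work/response.txt"
GIT_PROJECT_ROOT="$root" \
REQUEST_METHOD=GET \
PATH_INFO=/projecta.git/info/refs \
QUERY_STRING=service=git-upload-pack \
git http-backend > "$response"

# The response starts with HTTP headers, followed by binary protocol data.
grep -a -i '^Content-Type' "$response"
```

If the marker file is missing or the root directory is wrong, the script answers with an error status instead of the advertisement, which helps to separate Git problems from Apache2 configuration problems.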

Step 4: Clone the central repository

When you clone a repository, simply point the URL to the central repository. In this case, the URL consists of the machine name, the script alias for the CGI script, and the directory name of the repository.

> git clone http://server-42/git/projecta.git

In this example, the repository projecta.git is located on the computer server-42 under the script alias git.

Step 5: Manage read and write access rights

In this variant, the read and write permissions can be defined using the normal web server access rights.

For instance, to require a password when writing to the repositories (with the push command), add this entry in the Apache2 configuration file:

<LocationMatch "^/git/.*/git-receive-pack$"> 
    AuthType Basic 
    AuthName "Git Access" 
    AuthUserFile /shared/gitrepos/git-auth-file 
    Require valid-user 
</LocationMatch>

With this entry, all requests for git-receive-pack, which is required by every push command, are intercepted and only allowed if the user is authenticated. Read access, on the other hand, is still possible without a password.

To protect both read and write access to a repository with a password, use this entry in the Apache2 configuration file.

<LocationMatch /git/projecta.git> 
    AuthType Basic 
    AuthName "Git Access" 
    AuthUserFile /shared/gitrepos/git-auth-file 
    Require valid-user 
</LocationMatch>

More examples of the web server configuration can be found in the documentation for the http-backend command.

Advantage and disadvantage

Advantage: The HTTP variant allows easy access to repositories in a web environment. Typical problems with firewalls are not expected from the use of the HTTP protocol. Authentication can be done via the web server. If you need it to be more secure, you can use the HTTPS protocol.

Disadvantage: You need a web server, which must be operated and administered.

Sharing A Repository via SSH

In order to share a repository via Secure Shell (SSH), the necessary infrastructure must be in place. That is, you must have at least one computer with an SSH daemon, and all participants must have an SSH account on that server.

Step 1: Copy the bare repository

Simply copy the bare repository with the project files to an SSH host to which all developers have access. The scp command can be used to copy a file or files over SSH.

> scp -r projecta.git server-42:/shared/gitrepos/projecta.git 

In this example, we assume that the computer (server-42) allows SSH access and that the /shared/gitrepos directory is allocated on this computer for storing the repository.

Step 2: Clone the central repository

When you clone a repository which is shared via SSH, you need a normal SSH path to the central repository.

> git clone ssh://server-42/shared/gitrepos/projecta.git 

Alternatively, the ssh:// prefix can be omitted by using the scp-like syntax, in which a colon separates the host name from the path.

> git clone server-42:/shared/gitrepos/projecta.git 

Step 3: Manage read and write access rights

In this variant, read and write access to the repository is managed through SSH and file system rights. That is, each team member who is to have read access needs SSH access to the server and read permission on the repository directory. The same applies to write permission.

Advantages and disadvantages

Advantages: Access to a repository over SSH is very easy to set up with an existing SSH infrastructure. Network access is very efficient, since most of the Git commands take place on the SSH server and only the results are transmitted over the network. Furthermore, the access is encrypted.

Disadvantage: If there is no existing SSH infrastructure, setting up this infrastructure can be costly. Even with the existing infrastructure, the management of user accounts can be complex, because each user needs a separate account, even for read access.

Note that you can use Gitolite (https://github.com/sitaramc/gitolite) and Gitosis (https://github.com/tv42/gitosis) to simplify SSH infrastructure administration for Git. Gitolite can even manage read and write access rights at branch level. There is also Gerrit (http://code.google.com/p/gerrit/), which can act as an SSH server in addition to providing review functionality.

Why Not the Alternatives?

Why Not Give up push?

The workflow described assumes that each developer has write access to the central repository and thus can publish their commits with the push command.

Typically, in an open source project, a pure pull workflow is used. In this case, all developers work only on their local repositories, and only the integration managers (integrators) have permission to update the central software version.

Figure 14.2 shows this pure pull workflow.

The developers clone the central repository and generate new local commits. They then send the integrators a pull request, which is a request to import a branch or a commit and to merge it with the integration branch in the central repository.

The integrator is now responsible for merging all changes from all the developers to the central repository with the pull command. The integrator also takes the role of quality assurance. Once the integrator has all the changes in the central repository, the developers can import the official version from the central repository again with the pull command.

Figure 14.2: Working with pull only
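The pull-only sequence can be replayed locally with three repositories. The sketch below makes several assumptions for illustration: all repositories live in one temporary directory, the integrator reaches the developer repository through a plain file path, and the directory names seed, developer and integrator are invented.

```shell
#!/bin/sh
set -eu

# Placeholder identity so that the commits work in a fresh environment.
export GIT_AUTHOR_NAME="Example Developer" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

work=$(mktemp -d)

# Central repository with one initial commit.
git init -q "$work/seed"
cd "$work/seed"
echo "v1" > README.txt
git add .
git commit -q -m "init"
branch=$(git symbolic-ref --short HEAD)   # default branch name varies by setup
git clone -q --bare "$work/seed" "$work/central.git"

# Developer: clone and commit locally, without pushing to central.
git clone -q "$work/central.git" "$work/developer"
cd "$work/developer"
echo "feature" >> README.txt
git commit -q -am "Add feature"

# Integrator: pull from the developer repository, then update central.
git clone -q "$work/central.git" "$work/integrator"
cd "$work/integrator"
git pull -q "$work/developer" "$branch"
git push -q origin "$branch"

# Developer: import the official version from central again.
cd "$work/developer"
git pull -q origin "$branch"

central_head=$(git -C "$work/central.git" rev-parse "$branch")
developer_head=$(git -C "$work/developer" rev-parse HEAD)
echo "central:   $central_head"
echo "developer: $developer_head"
```

Only the integrator ever runs the push command here; the developer publishes work solely by making the local repository reachable for the integrator.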

In normal project work and product development, this process can quickly become an unnecessary brake. There are always team members who need to see the changes of others quickly, for example when a refactoring changes many files. Release cycles are also shorter in agile projects. In such a scenario, the integrator can become a bottleneck, and changes do not reach the central repository fast enough.

In most projects, the advantage of having more control over changes in the official version does not outweigh the higher cost.

Another problem is the backup of changes. Data is stored in the central repository only after the pull request has been processed. In enterprises, usually only the central repository is covered by a backup system. If the data on the developer’s computer is destroyed before the pull, the work is lost.

Note: It is of course also possible to back up the developer repository. In the open-source environment GitHub (https://github.com/) is often used for this purpose. This also ensures that the integrator can access the developer’s repository.
