Linux is a popular development platform. If you are administering a system for programmers, it is important to understand the various tools they employ. In addition, you may find yourself writing complex shell scripts that require similar tools. In this chapter, we take you through the concepts of version control and the popular Git utility, which implements it.
Two software developers, Natasha and Bruce, are working furiously on a new project called StoneTracker. The project is broken up into several program files that now have names such as UI-3.2.A-73.py due to all the modifications and revisions. Natasha and Bruce are constantly telling each other what file they are working on so the other one doesn’t accidentally overwrite it. In addition, they have created several directory trees to store the various project file amendments. The project has become bogged down with these complications and the extra communication it requires. These developers need version control. Version control (also known as source control or revision control) is a method or system that organizes various project files and protects modifications to them.
Version control methods or systems can control more than program files. They can typically handle plain-text files, executable files, graphics, word processing documents, compressed files, and more.
A version control system (VCS) provides a common central place to store and merge project files so that you can access the latest project version. It protects the files so that a file is not overwritten by another developer and eliminates any extra communications concerning who is currently modifying it.
Version control also helps when new developers join a project. For example, if Tony is a new team member, he can copy the latest StoneTracker project files via the version control system and begin work.
Distributed version control systems (VCSs) make projects even easier. The developers can perform their work offline, without any concerns as to whether or not they are connected to a network. The development work takes place locally on their own systems until they send a copy of their modified files and VCS metadata to the remote central system. Only at that time is a network connection required. A side benefit is that now the work is backed up to a central location.
A new version control system for Linux projects was created by Linus Torvalds in 2005. He desired a distributed VCS that could quickly merge files as well as provide other features that the Linux developers needed. The result was Git, which is a very popular high-performance distributed VCS.
Git is a distributed VCS, which is often employed in agile and continuous software development environments. To understand Git’s underlying principles, you need to know a few terms concerning its configuration. Figure 27.1 shows a conceptual depiction of the Git environment.
Working Directory The working directory is where all the program files are created, modified, and reviewed. It is typically a subdirectory within the developer’s home directory tree. The developer’s computer system can be a local server or laptop, depending upon workplace requirements.
Staging Area A staging area is also called the index. This area is located on the same system as the working directory. Program files in the working directory are registered into the staging area via a Git command (git add). The staging area employs a hidden subdirectory named .git, which is created via the git init command.
When files are cataloged into the staging area, Git creates or updates information in the index file, .git/index, concerning those files. The data includes checksums, timestamps, and associated file names.
Besides updating the index file, Git compresses the program file(s) and stores each compressed file as an object, also called a blob, in the .git/objects/ directory. If a program file has been modified, it is compressed and stored as a new object in the .git/objects/ directory. Git does not just store file modifications; it keeps a compressed copy of each modified file.
Local Repository The local repository contains each project file’s history. It also employs the .git subdirectory. Project tree and commit information is stored as objects in the .git/objects/ directory via a Git command (git commit). This data is called a snapshot. Every commit creates a new snapshot. Old snapshots can be viewed, and you can revert to previous ones, if desired.
Remote Repository The remote repository is typically a cloud-based location. However, it could also be another server on your local network, depending upon your project’s needs. Prominent remote repositories include GitHub, GitLab, Bitbucket, and Launchpad. GitHub is by far the most popular.
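The staging area and local repository described above can be inspected directly. The following is a minimal sketch, assuming Git is installed; the temporary directory, the Demo User identity, and the file contents are hypothetical and used only for illustration. It asks Git what type of object it stored after git add and after git commit.

```shell
# Scratch repository in a temporary directory (all names here are illustrative).
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"            # identity stored only in this scratch repository
git config user.email "demo@example.com"

printf '#!/bin/bash\necho "Hello World"\n' > MyScript.sh

# Staging: git add compresses the file into .git/objects/ and records it in .git/index.
git add MyScript.sh
blob_id=$(git hash-object MyScript.sh)       # ID Git derives from the file's content
blob_type=$(git cat-file -t "$blob_id")      # type of the object stored under that ID

# Committing: git commit stores tree and commit objects that reference the blob.
git commit -q -m "Initial Commit"
head_type=$(git cat-file -t HEAD)
tree_type=$(git cat-file -t 'HEAD^{tree}')

echo "staged file object: $blob_type"
echo "HEAD object:        $head_type"
echo "snapshot object:    $tree_type"
git cat-file -p 'HEAD^{tree}'                # lists the files captured in the snapshot
```

The git cat-file command is a low-level inspection tool; day-to-day work rarely needs it, but it makes the blob, tree, and commit objects visible.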
Using Git as your VCS includes the following benefits:
commit is completed to the local repository, Git creates and stores a reference to that snapshot in time.
Older VCSs required developers to be on the same network, which didn’t provide a great deal of flexibility. They were also slower in operation, which is one reason Linus Torvalds decided to create Git.
The Git utility typically is not installed by default. Thus, you’ll need to install the git package prior to setting up your Git environment. See Chapter 13 for details on package installation.
After you have the git package installed on your system, there are four basic steps to setting up your Git environment for a new project:
.git/ directory.
To begin the process for a new project, create a working directory. A subdirectory in your local home folder will suffice. An example is shown in Listing 27.1.
Listing 27.1: Creating a working directory using the mkdir command
$ mkdir MWGuard
$
$ cd MWGuard
$ pwd
/home/Christine/MWGuard
$
In Listing 27.1 a simple subdirectory MWGuard is created for the project. After the working directory is created, use the cd command to move your present working directory into it.
Within the working directory, initialize the .git/ directory. This task employs the git init command. An example is shown in Listing 27.2.
Listing 27.2: Initializing the .git/ directory via the git init command
$ git init
Initialized empty Git repository in /home/Christine/MWGuard/.git/
$
$ ls -ld .git
drwxrwxr-x. 7 Christine Christine 119 Feb 6 15:07 .git
$
The git init command creates the .git/ subdirectory. Because the directory name is preceded with a dot (.), it is hidden from regular ls commands. Use the ls -a command or add the directory name as an argument to the ls command, as was done in Listing 27.2, in order to view its metadata.
You can have multiple .git/ directories. Just create a separate working directory for each one.
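As a small sketch of keeping one .git/ directory per project, the following creates two separate working directories and initializes each. The temporary base directory is an assumption for illustration; the project names are borrowed from this chapter.

```shell
# Each project gets its own working directory and therefore its own .git/ directory.
base=$(mktemp -d)                 # illustrative location; a home subdirectory works too
for project in MWGuard StoneTracker; do
    mkdir "$base/$project"
    git init -q "$base/$project"  # initialize the repository inside that directory
done

# Show the hidden .git/ directory of every project.
ls -d "$base"/*/.git
```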
If this is the first time you have built a .git/ subdirectory on your system, modify the global Git configuration file to include your username and email address. This information assists in tracking file changes. The git config command lets you perform this task, as shown in Listing 27.3.
Listing 27.3: Modifying the global Git configuration file using the git config command
$ git config --global user.name "Christine Bresnahan"
$
$ git config --global user.email "[email protected]"
$
$ git config --get user.name
Christine Bresnahan
$
$ git config --get user.email
[email protected]
$
By including the --global option on the git config command within Listing 27.3, the user.name and user.email data is stored in the global Git configuration file. Notice that you can view this information using the --get option and passing it the data’s name as an argument.
Git configuration information is stored in the global ~/.gitconfig file and in the local repository’s configuration file, working-directory/.git/config. (Some systems also have a system-level configuration file, /etc/gitconfig.) To view all the various configurations, use the git config --list command as shown in Listing 27.4.
Listing 27.4: Viewing Git configuration settings using the git config --list command
$ git config --list
user.name=Christine Bresnahan
[email protected]
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
$
$ cat /home/Christine/.gitconfig
[user]
name = Christine Bresnahan
email = [email protected]
$
$ cat /home/Christine/MWGuard/.git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
$
The settings that are displayed via the --list option use a file-section.name format. Notice that when the two Git configuration files (global and project’s local repository) are displayed to STDOUT via the cat command in Listing 27.4, the section names are shown along with the data they hold.
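When the same setting appears in both the global and the local configuration file, the local repository’s value takes effect. The sketch below illustrates this under a few assumptions: it points HOME at a throwaway directory so your real ~/.gitconfig is untouched (run it in a scratch shell), the names are hypothetical, and the --show-origin option requires a reasonably recent Git release.

```shell
# Use a throwaway HOME so the real global configuration file is not modified.
demo=$(mktemp -d)
export HOME="$demo/home"
unset XDG_CONFIG_HOME GIT_CONFIG_GLOBAL    # avoid alternate global config locations
mkdir -p "$HOME" "$demo/project"
cd "$demo/project" || exit 1
git init -q

git config --global user.name "Global Name"    # written to ~/.gitconfig
git config --local  user.name "Project Name"   # written to .git/config

effective=$(git config --get user.name)             # value Git actually uses here
global_only=$(git config --global --get user.name)  # value in the global file
echo "effective: $effective"
echo "global:    $global_only"

# List every setting together with the file it came from.
git config --list --show-origin
```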
When your local Git environment is configured, it is time to establish your project’s remote repository. For demonstration purposes, we chose the cloud-based remote repository GitHub. If you desire to follow along, you can set up a free remote repository at github.com/join.
Though Git can work with any file type, its tools are aimed at plain-text files. Therefore, you will not be able to use all the git utilities on non-text files.
After you have your project’s remote repository established, you’ll need to record the URL it provides. This web address is used for sending your project files to the remote repository, which is covered later in this chapter.
When you have your Git environment established, you can begin employing version control. There are four steps, as follows:
Depending on your team’s workflow, you may repeat certain steps before progressing to the next one. For example, in a single day, a programmer adds files as they are completed to the staging area. At the end of the day, the developer commits the project to the local repository and then pushes the project work to the remote repository for non-local team members to access.
In Listing 27.5, a simple shell script was created called MyScript.sh to use as a Git VCS example.
Listing 27.5: Viewing a simple shell script named MyScript.sh
$ cat MyScript.sh
#!/bin/bash
#
echo "Hello World"
#
exit
$
After the program is created (or modified), it is added to the staging area (index). This is accomplished through the git add command, as shown in Listing 27.6. The file is in the working directory, and you perform the git add command while located in that directory.
Listing 27.6: Adding a program to the staging area via the git add command
$ pwd
/home/Christine/MWGuard
$
$ git add MyScript.sh
$
$ git status
# On branch master
#
# Initial commit
#
# Changes to be committed:
# (use "git rm --cached <file>..." to unstage)
#
# new file: MyScript.sh
#
$
The git add command does not provide any responses when it is executed. Thus, to see if it worked as desired, employ the git status command as shown in Listing 27.6. The git status command shows that a new file, MyScript.sh, was added to the index. Notice also the name branch master. Git branches are covered later in this chapter.
You can add all the files in the current working directory to the staging area’s index at the same time. To accomplish this, issue the git add . command. Note the period (.) at the end of the command. It is effectively a wildcard, telling Git to add all the working directory’s files to the index.
If you have files in your working directory that you do not want added to the staging area index, create a .gitignore file in the working directory. Add the names of files and directories you do not want included in the index. The git add . command will now ignore those files.
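As a brief sketch of a .gitignore file in action, the following stages everything in a scratch working directory except the files named in .gitignore. The file names and patterns are hypothetical examples.

```shell
# Scratch repository with three files, two of which should never be staged.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
touch MyScript.sh notes.tmp debug.log

# One pattern or file name per line; *.tmp matches any file ending in .tmp.
printf '%s\n' '*.tmp' 'debug.log' > .gitignore

git add .                 # stages everything that is not ignored
staged=$(git ls-files)    # names of the files now recorded in the index
echo "$staged"
```

Note that the .gitignore file itself is staged, which lets the whole team share the same ignore rules.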
The staging area’s index file name is .git/index, and when the file command is used on it, in Listing 27.7, the file type is shown as a Git index. This file is employed by Git to track changes to the project’s files.
Listing 27.7: Looking at the staging area index file with the file command
$ file .git/index
.git/index: Git index, version 2, 1 entries
$
The next step in the process is to commit the project to the local repository. This will create a project snapshot, which contains information such as the project’s current tree structure and commit data. Git stores this data in the .git/ directory. The commit is accomplished via the git commit command, as shown in Listing 27.8. The -m option adds a comment line to the COMMIT_EDITMSG file, which is used to help track changes. When you make commits later in the project’s life, it is useful to include additional information in the -m option arguments, such as -m "Presentation Layer Commit".
Listing 27.8: Committing a file with the git commit command
$ git commit -m "Initial Commit"
[master (root-commit) 6d2370d] Initial Commit
1 file changed, 5 insertions(+)
create mode 100644 MyScript.sh
$
$ cat .git/COMMIT_EDITMSG
Initial Commit
$
$ git status
# On branch master
nothing to commit, working directory clean
$
When you have committed the project to the local repository, the git status command will display the message shown in Listing 27.8 indicating all the files have been committed.
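The path a file takes from the working directory to the staging area to the local repository can be traced with git status. The sketch below is one way to watch it, assuming a scratch repository and a hypothetical Demo User identity; it uses the --porcelain option, which prints a short, script-friendly status.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"           # identity for this scratch repository only
git config user.email "demo@example.com"

printf '#!/bin/bash\necho "Hello World"\n' > MyScript.sh
untracked=$(git status --porcelain)         # file exists only in the working directory

git add MyScript.sh
staged=$(git status --porcelain)            # file is registered in the index

git commit -q -m "Initial Commit"
clean=$(git status --porcelain)             # nothing left to stage or commit

echo "before add:   $untracked"
echo "after add:    $staged"
echo "after commit: ${clean:-<clean>}"
```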
If you do not add the -m option and its argument onto the git commit command, you are placed into the vim editor to edit the .git/COMMIT_EDITMSG file by hand. The vim editor was covered in Chapter 4.
Now that the project is committed to the local repository, it can be shared with other development team members by pushing it to the remote repository. If the project is complete, you can also share with select others or the whole world.
If this is a new project, after you have set up your remote repository account, create a Markdown file called README.md. The file’s content displays on the remote repository’s web page and describes the repository. It is written in the Markdown language. An example of creating this file, adding it to the staging area index, and committing it to the local repository is shown in Listing 27.9.
Listing 27.9: Creating, adding, and committing a README.md file
$ pwd
/home/Christine/MWGuard
$
$ ls
MyScript.sh
$
$ echo "# Milky Way Guardian" > README.md
$ echo "## Programming Project" >> README.md
$
$ cat README.md
# Milky Way Guardian
## Programming Project
$
$ git add README.md
$
$ git status
# On branch master
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: README.md
#
$ git commit -m "README.md commit"
[master 4541578] README.md commit
1 file changed, 2 insertions(+)
create mode 100644 README.md
$
You can get really fancy with your README.md file by using various features of the Markdown language. Find out more about Markdown at guides.github.com/features/mastering-markdown/.
At any time you can review the Git log, but it’s always a good idea to do so prior to pushing the project to a remote repository. An example of how to view the log is shown in Listing 27.10. Each commit is given a hash number to identify it, which is shown in the log. Also, notice the different comment lines along with dates as well as author information.
Listing 27.10: Viewing the Git log via the git log command
$ git log
commit 45415785c17c213bac9c47ce815b91b6a9ac9f86
Author: Christine Bresnahan <[email protected]>
Date: Fri Feb 8 13:49:49 2019 -0500
README.md commit
commit 6d2370d2907345671123aeaaa71e147bd3f08f36
Author: Christine Bresnahan <[email protected]>
Date: Wed Feb 6 15:23:11 2019 -0500
Initial Commit
$
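For longer histories, a condensed view of the log is often easier to scan before pushing. The following sketch, built on a scratch repository with a hypothetical identity, shows the --oneline option and a custom --pretty format; the commit messages mirror those in Listing 27.10.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"           # identity for this scratch repository only
git config user.email "demo@example.com"

echo 'echo "Hello World"' > MyScript.sh
git add MyScript.sh
git commit -q -m "Initial Commit"

echo "# Milky Way Guardian" > README.md
git add README.md
git commit -q -m "README.md commit"

git log --oneline                           # abbreviated hash and comment, newest first
git log --pretty=format:'%h %an %s'         # hash, author name, and comment line
echo

subjects=$(git log --pretty=format:'%s')    # comment lines only
count=$(git rev-list --count HEAD)          # number of commits reachable from HEAD
echo "$count commits in the log"
```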
Before you can push your project to the remote repository, you need to configure its address on your system. This is done via the git remote add origin URL command, where URL is the remote repository’s address. An example is shown in Listing 27.11.
Listing 27.11: Configuring the remote repository via the git remote command
$ git remote add origin https://github.com/C-Bresnahan/MWGuard.git
$
$ git remote -v
origin https://github.com/C-Bresnahan/MWGuard.git (fetch)
origin https://github.com/C-Bresnahan/MWGuard.git (push)
$
Notice in Listing 27.11 that you can check the status of the remote address via the git remote -v command. It’s a good idea to check the address prior to pushing a project.
If you make a mistake, such as a typographical error, in the URL, you can remove the remote repository’s address via the git remote rm origin command. After it is removed, set up the remote address again using the correct URL.
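As an alternative to removing and re-adding the address, Git also offers the git remote set-url command, which changes the URL in place. A minimal sketch follows; the example.com addresses are hypothetical placeholders, and no network connection is made.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q

# Address entered with a typo (MWGaurd instead of MWGuard).
git remote add origin https://example.com/team/MWGaurd.git

# Correct the address in place instead of removing and re-adding it.
git remote set-url origin https://example.com/team/MWGuard.git

url=$(git config --get remote.origin.url)   # the address stored in .git/config
git remote -v
```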
After the remote repository URL is configured, push your project up to its location. An example is shown in Listing 27.12.
Listing 27.12: Pushing a project to the remote repository via the git push command
$ git push -u origin master
Username for 'https://github.com': C-Bresnahan
Password for 'https://[email protected]':
Counting objects: 6, done.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (6/6), 561 bytes | 0 bytes/s, done.
Total 6 (delta 0), reused 0 (delta 0)
To https://github.com/C-Bresnahan/MWGuard.git
* [new branch] master -> master
Branch master set up to track remote branch master from origin.
$
Typically the remote repository will demand a username and password, unless you have set it up to use SSH keys (OpenSSH was covered in Chapter 16). When the project is pushed to the remote repository, you should be able to view it. If it is a private repository, you’ll have to log into the remote repository in order to see your work. Figure 27.2 shows the remote repository for this project. Keep in mind that different providers will have different user interfaces for your projects.
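If you would like to practice pushing without a hosting account, a bare repository on the local disk can stand in for the remote. The sketch below assumes that setup; the directory names and identity are hypothetical, and the branch name is read from the repository rather than assumed, because the default branch name depends on your Git configuration.

```shell
demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"       # stand-in for the hosted remote repository

git init -q "$demo/MWGuard"
cd "$demo/MWGuard" || exit 1
git config user.name "Demo User"            # identity for this scratch repository only
git config user.email "demo@example.com"

echo "# Milky Way Guardian" > README.md
git add README.md
git commit -q -m "README.md commit"

branch=$(git symbolic-ref --short HEAD)     # current branch name (master in this chapter)
git remote add origin "$demo/remote.git"
git push -q -u origin "$branch"             # -u records origin as the branch's upstream

local_id=$(git rev-parse HEAD)
remote_id=$(git --git-dir="$demo/remote.git" rev-parse "refs/heads/$branch")
upstream=$(git rev-parse --abbrev-ref "$branch@{upstream}")
echo "local:  $local_id"
echo "remote: $remote_id"
echo "tracks: $upstream"
```

Because the remote is a local path, no username or password prompt appears; a hosted provider would ask for credentials or use SSH keys.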
What is really nice about the remote repository is that your project team can pull down the latest files for the project using the git pull command. You’ll need to either set up access for them to the remote repository or make it public. A snipped example of pulling files is shown in Listing 27.13.
Listing 27.13: Pulling latest project files from remote repository via the git pull command
$ whoami
Rich
$
$ git remote add origin https://github.com/C-Bresnahan/MWGuard.git
$
$ git pull -u origin master
[…]
Auto-merging MyScript.sh
[…]
$
If the individual pulling down the project files has local modifications to a particular file that conflict with the version in the remote repository, the git pull command may stop or report a merge conflict instead of completing. However, the message will explain how to rectify the problem.
A new development team member can copy the entire project, including the .git/ files, to their local system from the remote repository using the git clone command. A snipped example is shown in Listing 27.14.
Listing 27.14: Cloning a project from remote repository via the git clone command
$ whoami
Samantha
$
$ ls
$
$ git clone https://github.com/C-Bresnahan/MWGuard.git
Cloning into ’MWGuard’...
[…]
remote: Total 6 (delta 0), reused 6 (delta 0), pack-reused 0
Unpacking objects: 100% (6/6), done.
$
$ ls
MWGuard
$ cd MWGuard
$ ls -a
. .. .git MyScript.sh README.md
$
$ git log
commit 45415785c17c213bac9c47ce815b91b6a9ac9f86
Author: Christine Bresnahan <[email protected]>
Date: Fri Feb 8 13:49:49 2019 -0500
README.md commit
commit 6d2370d2907345671123aeaaa71e147bd3f08f36
Author: Christine Bresnahan <[email protected]>
Date: Wed Feb 6 15:23:11 2019 -0500
Initial Commit
$
When the project is cloned from the remote repository, the working directory is automatically created, along with the .git/ directory, the Git staging area (index), and the local repository. The git log command shows the project’s history. This is an easy way for a new team member to grab everything needed to begin working on the project.
A helpful concept in Git is branches. A branch is an area within a local repository for a particular project section. By default, Git stores your work in the master branch, as shown in Listing 27.15.
Listing 27.15: Viewing the branch in use via the git status command
$ git status
# On branch master
nothing to commit, working directory clean
$
You can have multiple branches within a project. A simple example is having a branch for production software (master), a branch for software in development (develop), and a branch for testing development changes (test). You can designate the branch you wish to work on to protect files in another branch from being changed. Using the example, you certainly would not want your development code files going into the production branch (master). Instead, you want them maintained in the develop branch until they are tested in the test branch and ready for production.
Let’s take a look at a project that needs to use branches. The StoneTracker project is in production, and its files are managed via the master branch, as shown in Listing 27.16.
Listing 27.16: Viewing the current branch in use via the git branch command
$ git branch
* master
$
Notice in Listing 27.16 the * master line. The asterisk (*) indicates that the current branch is master and that it is the only branch. If there were more branches, they would also be displayed. The current branch always has the asterisk next to it.
You can view the file names within a particular branch by employing the git ls-tree command. The StoneTracker project’s committed files are shown in Listing 27.17.
Listing 27.17: Viewing the file names in the master branch
$ git ls-tree --name-only -r master
README.md
ST-Data.py
ST-Main.py
$
In the master branch (production), the StoneTracker project currently uses a text-based user interface via its business tier, ST-Main.py. The development team needs to add a presentation layer, which will provide a GUI user interface. To create this new program without affecting production, they create a new branch to the project, using the git branch command shown in Listing 27.18.
Listing 27.18: Creating a new branch via the git branch branch-name command
$ git branch develop
$
$ git branch
develop
* master
$
Notice in Listing 27.18 that when the new branch, develop, is created, it is not set as the current branch. To change branches, the git checkout command is needed, as shown in Listing 27.19.
Listing 27.19: Switching to a branch via the git checkout branch-name command
$ git checkout develop
Switched to branch ’develop’
$
$ git branch
* develop
master
$
Now that the branch is switched, development on the new user interface (ST-UI.py) can occur without affecting the master branch. However, Git VCS is still employed, as shown in Listing 27.20.
Listing 27.20: Using Git VCS on the develop branch
$ ls
README.md ST-Data.py ST-Main.py ST-UI.py
$
$ git add .
$
$ git status
# On branch develop
# Changes to be committed:
# (use "git reset HEAD <file>..." to unstage)
#
# new file: ST-UI.py
#
$
$ git commit -m "New User Interface"
[develop 1a91bc3] New User Interface
1 file changed, 47 insertions(+)
create mode 100644 ST-UI.py
$
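To confirm that work committed on one branch leaves another branch untouched, you can compare the file lists of both. The sketch below assumes a scratch repository with placeholder file contents and a hypothetical identity. It also uses git checkout -b, which creates a branch and switches to it in a single step.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"            # identity for this scratch repository only
git config user.email "demo@example.com"

echo "main program" > ST-Main.py            # placeholder content
git add ST-Main.py
git commit -q -m "Initial Production Commit"
production=$(git symbolic-ref --short HEAD) # production branch name (master in this chapter)

git checkout -q -b develop                  # create develop and switch to it
echo "user interface" > ST-UI.py            # placeholder content
git add ST-UI.py
git commit -q -m "New User Interface"

develop_files=$(git ls-tree --name-only -r develop)
production_files=$(git ls-tree --name-only -r "$production")
echo "develop:    $(echo $develop_files)"
echo "production: $(echo $production_files)"
```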
When development (and testing) on the new user interface is completed, the develop branch is merged with the master branch (production). To merge branches, use the git merge branch-name-to-merge command. Merges must be performed from the target branch. Therefore, in this case you must go back to the master branch prior to issuing the command, as shown (snipped) in Listing 27.21.
Listing 27.21: Merging a branch via the git merge command
$ git checkout master
Switched to branch ’master’
$ git branch
develop
* master
$
$ git merge develop
Updating 0e08e81..1a91bc3
Fast-forward
ST-UI.py | 47 […]
1 file changed, 47 insertions(+)
create mode 100644 ST-UI.py
$
$ git ls-tree --name-only -r master
README.md
ST-Data.py
ST-Main.py
ST-UI.py
$
$ git log
commit 1a91bc30050ef1c0595894915295cc458b2539b7
Author: Christine Bresnahan <[email protected]>
Date: Fri Feb 8 18:03:18 2019 -0500
New User Interface
commit 0e08e810dd767acd64e09e45fff614288144da45
Author: Christine Bresnahan <[email protected]>
Date: Fri Feb 8 16:50:10 2019 -0500
Initial Production Commit
$
Notice in Listing 27.21 that now within the master branch, the new presentation tier program, ST-UI.py, is managed. Also notice that the Git logs of the two branches were merged, as shown by the git log command.
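The Fast-forward line in Listing 27.21 means that the production branch had no new commits of its own, so Git simply moved it forward to the develop branch’s latest commit. The following sketch reproduces that situation in a scratch repository; the file contents and identity are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"            # identity for this scratch repository only
git config user.email "demo@example.com"

echo "main program" > ST-Main.py            # placeholder content
git add ST-Main.py
git commit -q -m "Initial Production Commit"
production=$(git symbolic-ref --short HEAD) # production branch name (master in this chapter)

git checkout -q -b develop
echo "user interface" > ST-UI.py            # placeholder content
git add ST-UI.py
git commit -q -m "New User Interface"

# Merges are performed from the target branch, so switch back first.
git checkout -q "$production"
git merge -q develop

merged_id=$(git rev-parse "$production")
develop_id=$(git rev-parse develop)
echo "production now at: $merged_id"
echo "develop at:        $develop_id"
git branch --merged                         # branches fully contained in the current one
```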
There is another flavor of merging called rebasing. Instead of joining the two branches’ histories with a merge, rebasing reapplies a branch’s commits as new commits on top of another branch, which keeps the history log linear. To rebase a project, replace git merge with the git rebase command. Which one you employ depends upon your organization’s development workflow as well as team member preferences.
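A rebase is easiest to see when the two branches have diverged. In the sketch below, which assumes a scratch repository with placeholder files and a hypothetical identity, both branches gain a commit, and the develop branch is then rebased onto the production branch so that its commit sits on top of production’s latest work.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q
git config user.name "Demo User"            # identity for this scratch repository only
git config user.email "demo@example.com"

echo "main program" > ST-Main.py            # placeholder content
git add ST-Main.py
git commit -q -m "Initial Production Commit"
production=$(git symbolic-ref --short HEAD) # production branch name (master in this chapter)

# develop gains a commit...
git checkout -q -b develop
echo "user interface" > ST-UI.py
git add ST-UI.py
git commit -q -m "New User Interface"

# ...and so does production, so the two histories diverge.
git checkout -q "$production"
echo "data layer" > ST-Data.py
git add ST-Data.py
git commit -q -m "Data tier fix"

# Reapply develop's commit on top of production's latest commit.
git checkout -q develop
git rebase -q "$production"

total=$(git rev-list --count HEAD)            # commits reachable from develop
merges=$(git rev-list --count --merges HEAD)  # merge commits among them
git log --oneline
```

After the rebase, the develop branch’s history is a straight line with no merge commit, and a later git merge develop from the production branch would be a fast-forward.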
The distributed VCS utility Git is useful in many ways beyond the needs of developers. Understanding how to set up working directories, staging areas, and local and remote repositories is a wonderful skill set. You will not only be able to use the appropriate lingo with the programmers, but you can also implement Git for various other useful things to manage your Linux systems.
Describe version control. Version control is a method or system that organizes various project files and protects modifications to them. A distributed VCS allows developers to work offline and independently. The Git VCS provides a working directory, staging area (index), and local repository and uses a remote repository provided by a third party. It is popular due to high performance, maintained modification history, file protection, and decentralization.
Explain how to set up your Git environment. The git package provides the various Git tools used for VCSs. Create a working directory for each project using the mkdir command. The .git/ directory, used by both the staging area and the local repository, is initialized via the git init command. Finally, a third party, such as GitHub, can provide the remote repository to use with the various Git tools.
Detail committing with Git. As needed, files are moved from the working directory to the staging area (index) via the git add utility. The project’s workflow dictates when the programs are moved to the local repository via the git commit command and then on to the remote repository via the git push utility. If a remote developer needs the latest project files, the git pull command is employed. For new team members, who need all the project files, including modification history, the git clone command is used.
Summarize Git branches. A Git branch is a local repository area employed for a particular project section, such as development or project testing. By default, the main branch is called the master branch. New branches are created using the git branch branch-name command. You can view the various branches available using the git branch utility, which uses an asterisk to denote the current branch. To switch to another project branch, git checkout branch-name is employed. After work on the branch is completed, its VCS files and project files can be merged with another branch via the git merge branch-name-to-merge command.
Which of the following is true concerning version control? (Choose all that apply.)
Conceptually Git is broken up into distinct areas. Which of the following is one of those areas? (Choose all that apply.)
Which of the following are steps needed to set up a Git environment for the first time? (Choose all that apply.)
.git/ directory in the working directory.
Natasha has created her working directory for a new project. What should she do next to set up her Git project environment?
mkdir command.
git config --list command.
git init command.
When setting his Git configuration options, Bruce employs the --global option on his commands. What does this mean?
~/.gitconfig.
.git/config file.
.git/index file.
.git/objects directory.
Bruce has set up his Git environment and finished working on his new GreenMass.sh script. What should he do next?
git init command.
git log command.
There are 25 files in Natasha’s working directory and she only wants to add 22 of them to the index. She plans on using the git add . command to be efficient. What should she do?
git add command.
.gitignore file.
Natasha has completed her open-source project, which is set to be released to the public today. She has moved the files to the staging area, committed her work to the local repository, and configured the remote repository’s address. What is her next step?
remote add origin URL command.
Which of the following commands allows you to switch to a new Git branch called testing?
git branch testing
git ls-tree --name-only -r testing
git branch
git commit -m "testing"
git checkout testing
Tony created a new branch of the StoneTracker project called report. He has completed and tested his work. He now needs to merge it with the StoneTracker project’s master branch. After switching branches to the master branch, what command should he employ?
git merge master
git merge report
git rebase master
git rebase report
git checkout master