© Johan Abildskov 2020
J. Abildskov, Practical Git, https://doi.org/10.1007/978-1-4842-6270-2_6

6. Manipulating History

Johan Abildskov, Tilst, Denmark

It may seem counterintuitive that I devote a full chapter to manipulating history. Version control is at its core about traceability, reproducibility, and immutability. But Git lets you manipulate the history. For any public history, published to colleagues or available on the Internet, we must tread very carefully and use the powers this chapter bestows on us with care and responsibility. But for local history that we have not yet published, sculpting the version history to fit the logical units of our work can bring tremendous value.

In this chapter, we will first cover how to undo a change that is present in our history with revert. This allows us to safely undo previous work while maintaining full traceability and immutability.

Next, we are going to cover reset, which is the big red button for undoing large chunks of our history: not just removing changes from our workspace but also removing them from our history. It also has less drastic uses and is my favorite tool for juggling branches locally.

Last, we cover the interactive rebase which allows us to combine, split, delete, and reorder commits in our history. This is an extremely powerful tool, but can feel a bit scary, and again should be kept a long distance from public history. In terms of delivering the best possible history to colleagues or your future self, no tool is better.

Reverting Commits

There are many scenarios where we need to undo some change in our history. If we are lucky, it is the most recent change, but likely it is not. These changes that we’d like to remove from our applications can be bugs introduced, features no longer used, or simply some clutter that we would like to remove. In this scenario, where we have a specific commit that introduces a change that we would like to remove, we can use git revert. The logic of git revert is that it creates a commit that is the reverse changeset of the commit that we want to revert. This can be seen in Figure 6-1.
Figure 6-1. (a) Two commits adding a file each. (b) History after running the command git revert a

In this scenario, we are not actively manipulating history, we are rather using Git as a shortcut to revert a change. Without Git, we would be forced to manually try and figure out how to undo the given changes and then create that commit ourselves.

This also means that we are not doing anything that can compromise the traceability established through Git. As such, it is safe from an auditing perspective to use revert on public history. Whether you are breaking any functionality that you did not intend is beyond the scope of Git. Always run your tests!
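The additive nature of revert can be sketched in a throwaway repository. This is a minimal illustration, not part of the book's exercise material: the temporary directory, the example identity, and the variable names are assumptions made for the sketch. The --no-edit flag accepts the prefilled commit message instead of opening an editor.

```shell
set -eu
cd "$(mktemp -d)"                       # throwaway repository (assumption for this sketch)
git init -q
git config user.name "Example User"     # hypothetical identity
git config user.email "user@example.com"

echo "file a" > a.txt
git add a.txt
git commit -q -m "Add File A"
first=$(git rev-parse HEAD)             # the commit we will revert

echo "file b" > b.txt
git add b.txt
git commit -q -m "Add File B"

# Create a new commit holding the reverse changeset of $first.
git revert --no-edit "$first" > /dev/null

commits=$(git rev-list --count HEAD)    # both original commits plus the revert
subject=$(git log -1 --format=%s)       # subject line of the newest commit
git log --oneline
```

Both original commits are still in the log afterward; the revert only adds a third commit on top of them.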

REVERT EXERCISE
In this exercise, we will go through reverting a commit. The repository for this exercise can be found in the source code in Chapter 6 in the folder revert/.
$ ls
a.txt  b.txt
$ git log --oneline
5be4a3d (HEAD -> master) Add File B
c8482f6 Add File A

We see a simple history and we want to undo the changes introduced in commit c8482 with the message “Add File A”.

First, we use git show to see what changeset the commit represents.
$ git show c8482
commit c8482f67747fd8dcb6ced373d89ce3e8dc7d7754
Author: Johan Abildskov <[email protected]>
Date:   Sun Jun 14 16:05:10 2020 +0200
    Add File A
diff --git a/a.txt b/a.txt
new file mode 100644
index 0000000..4ef30bb
--- /dev/null
+++ b/a.txt
@@ -0,0 +1 @@
+file a
Besides the ordinary commit information, we also see the diff. Here, we can see that the file a.txt was created. This is the basis for what we will revert.
$ git revert c8482
Removing a.txt
hint: Waiting for your editor to close the file...
[master 26dc609] Revert "Add File A"
 1 file changed, 1 deletion(-)
 delete mode 100644 a.txt
When we target the commit to revert, we get the usual commit message prompt. It is prefilled with a sane message, so we can save the file and have Git create the commit.
$ git log --oneline
26dc609 (HEAD -> master) Revert "Add File A"
5be4a3d Add File B
c8482f6 Add File A
We observe that Git has created a new commit, so let us see what it contains.
$ git show 26dc
commit 26dc6094fbbd6293bb2a69f354d78008194ea6c3 (HEAD -> master)
Author: Johan Abildskov <[email protected]>
Date:   Sun Jun 14 16:05:53 2020 +0200
    Revert "Add File A"
    This reverts commit c8482f67747fd8dcb6ced373d89ce3e8dc7d7754.
diff --git a/a.txt b/a.txt
deleted file mode 100644
index 4ef30bb..0000000
--- a/a.txt
+++ /dev/null
@@ -1 +0,0 @@
-file a
Here, we get the exact opposite of the commit we reverted, namely, that the file is no longer present. We get a bit more elaboration in the body of the commit message as the trace to the original commit is maintained.
$ ls
b.txt

As expected, we now only have b.txt in our workspace. As has been shown in this exercise, reverting commits can be a safe way to undo a change introduced at an arbitrary point in history.

Reverting commits can be done easily and safely if you as a developer take care of the semantics of the changes you are juggling. It will likely be safer than trying to revert changes manually, without tool assistance. Tools like revert are another good reason to make your commits atomic and self-contained.

Reset

Reset is one of my favorite Git commands, not only because of its powerful functionality but also because it is one of the commands that allow us to uncover the most knowledge on how Git works and how our intuition might be in conflict with this.

Git is overall very conservative about taking actions that might cause you to lose your work unexpectedly. Git reset, in its hard mode, is one of the ways that Git will throw away unsaved work without warning. It does require an active choice by the user, so this is not too bad in itself. Unfortunately, reset is also one of the commands that have a horrible user experience. I hope to guide you through the command so that, combined with the exercise and the katas, you will feel confident introducing the reset command into your everyday coding life.

Git reset has three modes: soft, mixed, and hard. We will go through them in turn and end up with an exercise covering all three.

Soft Reset

In the soft mode, git reset --soft <ref>, we are only manipulating HEAD. That is, the reference currently checked out will be changed to the target given as an argument. In other words, the soft reset can be used to move a branch pointer.

This can be useful if, for instance, you forgot to create your feature branch before you started your work and thus have created your commits on master. Then, you could make it look like you did the right thing all along by first creating your feature branch at master and then resetting --soft master to origin/master.
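The forgotten-feature-branch scenario might look like the following sketch. There is no remote in this throwaway repository, so the variable published is a hypothetical stand-in for origin/master; the file names and the identity are likewise assumptions.

```shell
set -eu
cd "$(mktemp -d)"                          # throwaway repository (assumption)
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
git config user.name "Example User"        # hypothetical identity
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"
published=$(git rev-parse HEAD)            # stand-in for origin/master

echo work > feature.txt
git add feature.txt
git commit -q -m "Feature work"            # oops: committed directly on master
work=$(git rev-parse HEAD)

git branch feature                         # 1. keep the work on a new branch
git reset --soft "$published"              # 2. move master back; stage and workspace untouched
git checkout -q feature                    # 3. continue on the feature branch

git log --oneline --all
```

After these three steps, master points at the published commit again and feature holds the work, as if the branch had been created up front.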

As the soft reset leaves both the working directory and the stage alone, it is a completely safe operation. Figure 6-2 shows updating the branch pointer.
Figure 6-2. (b) is the result of starting in (a) and running git reset --soft B

The soft reset can be used to squash a series of commits together into a single commit. It is done by resetting to the point from which your work started and then creating a commit. The squash works because all your work, represented by the newest commit, will then be in the stage that you can commit into a single commit. This is not a typical scenario and is usually better solved by the interactive rebase that we will cover later in this chapter.
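A minimal sketch of squashing with a soft reset, assuming a throwaway repository with three small commits on top of a base commit (all file names and messages are made up for illustration):

```shell
set -eu
cd "$(mktemp -d)"                        # throwaway repository (assumption)
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"
base=$(git rev-parse HEAD)               # the point from which our work started

for n in 1 2 3; do
  echo "step $n" > "step$n.txt"
  git add "step$n.txt"
  git commit -q -m "Step $n"
done

# Move the branch back to the base; the content of "Step 3" stays in the stage.
git reset --soft "$base"
git commit -q -m "Steps 1-3 as one commit"

count=$(git rev-list --count HEAD)       # the base plus the single squashed commit
git log --oneline
```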

Mixed Reset

The mixed reset is the default behavior when you do not pass a mode to git reset. Mixed reset, besides updating HEAD as soft does, also updates the stage to the targeted place. When we do not pass any ref to reset, HEAD is the default target. This leads to the confusing situation that the most common use case for reset --mixed is unstaging files. That is, if you have at some point used git add to stage a path, and you no longer want that path to be staged, you can use the command git reset <path>. The logic is that you overwrite the stage with what is in the commit pointed to by the ref, which is HEAD by default. It took me some time to wrap my head around the fact that to remove something from the stage, you have to put something else there.

Figure 6-3 shows this scenario. In it, we also show the stage, which unless something has been added to it will be equivalent to the content in HEAD.
Figure 6-3. Showing that git reset d.txt changes the stage, but not the workspace
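Unstaging a path can be tried out with the sketch below. It assumes a throwaway repository in which d.txt is a newly added file; c.txt and the identity are hypothetical. The status codes come from git status --porcelain, where "A " in the first two columns means staged as new and "??" means untracked.

```shell
set -eu
cd "$(mktemp -d)"                        # throwaway repository (assumption)
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo committed > c.txt
git add c.txt
git commit -q -m "Add c.txt"

echo draft > d.txt
git add d.txt
before=$(git status --porcelain)         # d.txt is staged as a new file

# Overwrite the stage entry for d.txt with what HEAD has for that path (nothing).
git reset -q -- d.txt
after=$(git status --porcelain)          # d.txt is untracked again

echo "before: $before"
echo "after:  $after"
```

The file itself is never touched; only its entry in the stage changes.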

Based on the preceding text, a reasonable question would be, what would happen if we reset mixed to B, for instance? In this case, we would put B and only B into the stage and update HEAD.

Hard Reset

As mentioned before, the hard reset is one of the few dangerous commands in Git, at least from the perspective of how likely Git is to throw away your work without giving you a warning. The mixed reset updates HEAD and the stage with the content of the target ref. Hard reset updates HEAD, the stage, and the working directory. This means that any changes to tracked files that are not part of a commit, whether staged or not, will be lost. This is one of the few ways that Git can overwrite your work in an unrecoverable way. So, proceed with caution. The hard reset is part of my daily Git routine, and it could also be part of yours; just make sure that you do it deliberately. Figure 6-4 shows how the hard reset changes the stage, the workspace, and HEAD.
Figure 6-4. git reset --hard B updates HEAD, stage, and workspace to the content of B
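What a hard reset discards and what it leaves alone can be checked with a small sketch. The repository, the file names tracked.txt and untracked.txt, and the commit messages A and B are assumptions made for illustration.

```shell
set -eu
cd "$(mktemp -d)"                        # throwaway repository (assumption)
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo v1 > tracked.txt
git add tracked.txt
git commit -q -m "A"

echo v2 > tracked.txt
git commit -q -a -m "B"

echo "uncommitted edit" >> tracked.txt   # change to a tracked file, not in any commit
echo notes > untracked.txt               # file that Git does not track at all

# Move HEAD, the stage, and the workspace back to commit A.
git reset -q --hard HEAD~1

content=$(cat tracked.txt)               # content as of commit A
ls
```

The uncommitted edit to tracked.txt is gone, while untracked.txt is still in the workspace because Git never tracked it.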

While the hard reset is considered off limits by some, it is part of my day-to-day workflow. If we are disciplined around making commits often and take care in running git status before we do a hard reset, we have a powerful and simple tool at our disposal. I have many times seen developers accidentally messing up their local histories with pulls when they did not mean to, or by having contaminated their master branch. The way I avoid this personally is by staying away from pull in all but the simplest cases. Most often, I will use git fetch to update my local cache and then use git reset --hard origin/master to start from the most up-to-date scratch. When I have made certain to keep my work on isolated branches, this is a safe command to run.
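The fetch-then-reset routine can be sketched with two local repositories standing in for a real remote and a clone. The directory names upstream and local, the branch name my-work, and the identity set through environment variables are all assumptions of this sketch.

```shell
set -eu
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

root=$(mktemp -d)                          # throwaway playground (assumption)
git init -q "$root/upstream"               # plays the role of the shared repository
cd "$root/upstream"
git symbolic-ref HEAD refs/heads/master    # make sure the branch is named master
echo one > one.txt
git add one.txt
git commit -q -m "One"

git clone -q "$root/upstream" "$root/local"

cd "$root/local"
echo stray > stray.txt
git add stray.txt
git commit -q -m "Stray commit on master"  # master is now contaminated
git branch my-work                         # keep the work on an isolated branch

cd "$root/upstream"
echo two > two.txt
git add two.txt
git commit -q -m "Two"                     # a colleague publishes new work
upstream_head=$(git rev-parse HEAD)

cd "$root/local"
git fetch -q origin                        # update the local cache of the remote
git reset -q --hard origin/master          # start from the most up-to-date scratch
git log --oneline
```

Because the stray commit was parked on my-work first, nothing is lost when master is reset.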

RESET EXERCISE
In this exercise, I will be going through the reset kata from the git-katas repository. We use HEAD~1 to refer to the parent of HEAD.
$ ls
1.txt  10.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt  9.txt
$ git log --oneline
6742e05 (HEAD -> master) 10
76ac07a 9
c3e33b7 8
da46ca2 7
1d9b4de 6
21a5ff1 5
a7e2065 4
065ebe8 3
df9cfa3 2
89514e1 1
We note that we have a long history and a workspace containing a single file per commit. We do not investigate, but it is safe to assume that each file is added in the corresponding commit.
$ git reset --soft HEAD~1
$ git log --oneline
76ac07a (HEAD -> master) 9
c3e33b7 8
da46ca2 7
1d9b4de 6
21a5ff1 5
a7e2065 4
065ebe8 3
df9cfa3 2
89514e1 1

We note that the master branch is now pointing to the commit 9 rather than 10.

Investigating the workspace and git status shows us that indeed stage and workspace still have the content from 10.
$ ls
1.txt  10.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt  9.txt
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   10.txt
Now, we can reset --mixed and the log shows us that we have again moved on.
$ git reset --mixed HEAD~1
$ git log --oneline
c3e33b7 (HEAD -> master) 8
da46ca2 7
1d9b4de 6
21a5ff1 5
a7e2065 4
065ebe8 3
df9cfa3 2
89514e1 1
$ ls
1.txt  10.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  8.txt  9.txt
$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        10.txt
        9.txt
nothing added to commit but untracked files present (use "git add" to track)
Looking in the workspace and checking the status shows us that we still have not changed our workspace, but now 9.txt and 10.txt are untracked, as the stage matches the content in 8.
$ git reset --hard HEAD~1
HEAD is now at da46ca2 7
$ git log --oneline
da46ca2 (HEAD -> master) 7
1d9b4de 6
21a5ff1 5
a7e2065 4
065ebe8 3
df9cfa3 2
89514e1 1
Resetting hard continues the trend of updating HEAD. But now, we are resetting hard, so we expect our workspace to change. Before moving on, I suggest you spend a few moments pondering how you expect the workspace to look.
$ ls
1.txt  10.txt  2.txt  3.txt  4.txt  5.txt  6.txt  7.txt  9.txt
A peculiar thing is happening here. 8.txt is missing, but 9.txt and 10.txt are still present in the workspace. This happens because 9 and 10 are untracked because of our previous actions. As such, Git does not care about them at this time, and they will be left in the workspace.
$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)
        10.txt
        9.txt
nothing added to commit but untracked files present (use "git add" to track)

So now we have seen the three different modes of the git reset command. It can be daunting and this kata is my favorite one because it encapsulates a lot of learning. This is why I really recommend you go through this kata a few times, until you have built your reset intuition and can wield git reset --hard like a ninja.

In this section, we have been using reset in all its modes for different purposes. One important thing to remember is that no matter what, if you put your data in a commit, you can restore it, even after a hard reset. I hope this section has shown you the power that this safety can give you.
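As a small illustration of that safety, the sketch below throws away a commit with a hard reset and then brings it back through the reflog. It assumes a throwaway repository with reflogs enabled, which is the default for repositories with a working directory; the file names are hypothetical.

```shell
set -eu
cd "$(mktemp -d)"                        # throwaway repository (assumption)
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo one > one.txt
git add one.txt
git commit -q -m "One"

echo two > two.txt
git add two.txt
git commit -q -m "Two"
lost=$(git rev-parse HEAD)               # remember the commit we are about to drop

git reset -q --hard HEAD~1               # "Two" disappears from the branch and the workspace
git reflog -n 2                          # the previous position of HEAD is still recorded

# HEAD@{1} is where HEAD pointed before the last move, that is, commit "Two".
git reset -q --hard 'HEAD@{1}'
git log --oneline
```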

Interactive Rebase

Some of the tricks we have been going through earlier can be used to manipulate history. But the truly powerful and granular way to approach tweaking your local history is with the interactive rebase. Remember, if our history is local, we are free to tinker with it as we want. This capability gives us the opportunity and responsibility to consider the history we publish as a part of the delivery. The Git history that we deliver is also a form of communication, and it should be chopped up in the right commits in the right order, with good, clear commit messages. An interactive rebase is invoked by adding the flag --interactive to the git rebase command, for example, git rebase --interactive master.

The interactive rebase is the best way to go about preparing your Git history. Conceptually, you give Git a rebase target, which is what you want to rebase on top of. Then, Git provides you with a rebase plan that it intends to execute. You can change this plan before Git executes it. This allows you to skip commits entirely, edit them, reorder them, or squash them together. Each line of the plan consists of an action, the SHA of a commit, and the subject of that commit. Deleting a line will simply make the rebase skip that commit. If you do not edit the plan, the result is the same as leaving out the --interactive flag on the rebase command.

The most common actions are as follows:
  • Pick adds the commit at this point.

  • Reword adds the commit, but stops to let you edit its commit message.

  • Squash melds this commit into the previous commit.

  • Edit stops at this commit so that you can amend it.

  • Drop does not pick this commit.

The preceding actions and reordering are how interactive rebases are most commonly used.

The following is an example rebase --interactive execution plan:
pick 8c1e4de file9
reword 921d2d0 file8
squash 3374035 file3
pick 5b3a4fc file4
pick f0d1634 file5
drop a7df72d file2
drop 3d7e5ea file6
pick 18bfdfe file7
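A plan like this is normally edited by hand, but for experimenting it can be handy to script the edit. The sketch below is one way to do that and not how the interactive rebase is usually driven: it assumes a throwaway repository with three commits on top of a base, and it uses the GIT_SEQUENCE_EDITOR environment variable to let sed rewrite the plan, while GIT_EDITOR=true accepts the suggested message for the squashed commit.

```shell
set -eu
cd "$(mktemp -d)"                        # throwaway repository (assumption)
git init -q
git config user.name "Example User"      # hypothetical identity
git config user.email "user@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
base=$(git rev-parse HEAD)               # the rebase target

for n in 1 2 3; do
  echo "content $n" > "file$n.txt"
  git add "file$n.txt"
  git commit -q -m "file$n"
done

# The generated plan lists the commits oldest first:
#   pick <sha> file1
#   pick <sha> file2
#   pick <sha> file3
# sed turns line 2 into a squash and line 3 into a drop before Git executes the plan.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/' -e '3s/^pick/drop/'" \
GIT_EDITOR=true \
git rebase -q --interactive "$base"

git log --oneline
```

The resulting history holds the base commit and one commit containing both file1.txt and file2.txt; the commit adding file3.txt is left out.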

The interactive rebase is perhaps the most powerful Git command, and almost any Git task can be solved using this command. I hope that becoming aware of this command will help you on your journey to always delivering a well-groomed history to your collaboration partners, and your future self.

Git Katas

To support the learning goals of this chapter, I suggest you do the following Git katas:
  • Revert.

  • Reset.

  • Reorder the history.

  • Then, I suggest you do the reset kata again; it is always a healthy exercise to revisit 🙂.

Summary

Manipulating the history is often proclaimed to be a big no-no in version control because of traceability. But as long as we only rewrite history that is local or only has been published to temporary branches, we have the obligation to make the history the most usable it can be. Whether that is to squash multiple commits together or even split commits into different bundles, it is all about considering the history you deliver as part of your deliverable.

Remember, work that you have put in a commit stays recoverable with all the commands we have covered here, and in the chapter on Git internals, we will cover how to recover from accidents.
