Source code version control

Source code version control systems store the different versions of the source code. These days, we cannot imagine professional software development without them. This was not always the case, but the availability of free online repositories encouraged hobby developers to use version control, and when these developers later worked for enterprises, it became evident that using such a system is a must.

There are many different revision control systems. The most widely used one is Git. Before Git, the most widely used system was SVN and, before that, CVS. Both are used less and less these days. We can see SVN as the successor of CVS and Git as the successor of SVN. In addition to these, there are other version control systems such as Mercurial, Bazaar, or Visual Studio Team Services. For a comprehensive list of the available tools, visit the Wikipedia page at https://en.wikipedia.org/wiki/List_of_version_control_software.

My bet is that you will meet Git first, and there is a high probability that you will come across SVN when programming for an enterprise. Mercurial may appear in your practice, but the others that currently exist are very rare, are used in a specific area, or are simply extinct.

Version control systems allow the development team to store the different versions of the software in an organized manner on storage that is maintained (backed up regularly and reliably). This is important for several reasons.

The first reason is that different versions of the software may be deployed to different instances. If we develop software for clients, and we have many clients with whom we hope to do terrific business, then different clients may have different versions. This is not only because some clients are reluctant to pay for the upgrade and we do not want to give the new version away for free. Many times, the costs that arise on the customer's side prevent the upgrade for a long time. Software products do not work on their own in an isolated environment. Different clients have different integrated environments, in which the software communicates with different applications. When a new version is to be introduced in an enterprise environment, it has to be tested to see whether it works with all the systems it has to cooperate with. This testing takes a lot of effort and money. If the new features or other value that the new version delivers over the old one do not justify the cost, then it would be a waste of money to deploy the new version. The fact that there is a new version of our software does not mean that the old versions are not usable.

If there is a bug at the customer's end, then it is vital that we fix the bug in that version. To do so, the bug has to be reproduced in the development environment, which in turn means that the source code for that version has to be available to the developers.

This requires the customer database to contain references to the different versions of our software products that are installed at the customer site. To make it more complicated, a customer may have more than one version at a time in different systems and may also have different licenses, so the issue is more complex than it first seems. If we do not know which version the client has, then we are in trouble.

Since the database registering the versions for the customers may get out of sync with real life, software products log their version at startup. We have a separate section about versioning in this chapter.
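
Logging the version at startup can be sketched in a few lines. This is only an illustration: the application name `billing-service`, the constant `APP_VERSION`, and the helper `startup_banner` are hypothetical, and a real project would more likely have its build tool inject the version than hard-code it.

```python
import logging

# Hypothetical version constant; a real build would usually inject this value.
APP_VERSION = "2.4.1"


def startup_banner(name: str, version: str) -> str:
    """Return the log line that identifies the running build."""
    return f"{name} starting, version {version}"


def main() -> None:
    logging.basicConfig(level=logging.INFO)
    # The first thing written to the log is the version, so a support
    # engineer can tell which source version to check out.
    logging.getLogger(__name__).info(startup_banner("billing-service", APP_VERSION))


if __name__ == "__main__":
    main()
```

With such a line in the log, the version reported by the customer's system can be trusted even when the customer database is out of date.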

If the bug is fixed in the version that the client has, the incident at the customer's end may be solved after deployment. The problem, though, may still remain in other versions of the software. A bug fixed in one version may still be lurking in later or, for that matter, earlier versions. The development team has to identify which versions are relevant to clients. For example, an old version that is no longer installed at any client's site does not deserve investigation. After that, the relevant versions have to be investigated to check whether they exhibit the bug. This can only be done if we have the source code of each version. Some old versions may not have the bug because the code causing it was introduced in a later version. Some new versions may also be immune because the bug was already fixed in an earlier version, or simply because the piece of code that caused the bug was refactored even before the bug manifested. Some bugs may even affect a single version rather than a range of versions. Bug fixes may have to be applied to different versions, and these may need slightly different fixes. All this needs a maintained source code repository.
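
Once the investigation has established where the bug was introduced and where, if anywhere, it was already fixed, selecting the deployed versions that still need a patch is mechanical. The sketch below assumes three-part numeric version strings and uses made-up version numbers. The function names are hypothetical and this is not any particular tool's API.

```python
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse(text: str) -> Version:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def versions_to_patch(
    deployed: List[str], introduced: str, fixed: Optional[str] = None
) -> List[str]:
    """Return the deployed versions that contain the bug.

    A version is affected if it is at or after `introduced` and,
    when `fixed` is given, strictly before `fixed`.
    """
    first = parse(introduced)
    last = parse(fixed) if fixed is not None else None
    affected = [
        v
        for v in deployed
        if parse(v) >= first and (last is None or parse(v) < last)
    ]
    return sorted(affected, key=parse)


# Versions installed at client sites (illustrative data only).
deployed = ["1.2.0", "1.10.0", "2.0.0", "2.1.3"]
print(versions_to_patch(deployed, introduced="1.5.0", fixed="2.1.0"))
```

Comparing parsed tuples rather than raw strings matters here: as a string, "1.10.0" sorts before "1.5.0", which would wrongly mark that version as unaffected.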

Even when we do not have different customers with different versions, it is more than likely that we have more than one version of our software in development. When the development of a major release is coming to an end, the part of the team responsible for testing and bug fixing focuses on those activities. At the same time, the development of features for the next version still goes on. The code implementing the functionality for the next version should not get into the version that is about to be released. The new code may be very fresh and untested, and may introduce new bugs. It is very common to introduce freeze times during the release process. For example, it may be forbidden to implement any new feature for the upcoming release. This is called a feature freeze.

Revision control systems deal with these freeze periods by maintaining different branches of the code. The release is maintained in one branch and the version for later releases in a different one. When the release goes out, the bug fixes that were applied to it should also be propagated to the newer version; otherwise, the next version might contain bugs that were already fixed in the previous one. To do so, the release branch is merged into the ongoing one. Thus, version control systems maintain a graph of the versions, where each version of the code is a node in the graph and the changes are the edges.
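
The version graph can be illustrated with a toy model. In this sketch, every commit id and the `PARENTS` table are hypothetical. Each node lists its parents, a merge is simply a node with two parents, and a fix is part of a branch if it is reachable from the tip of that branch.

```python
from typing import Dict, List, Set

# Each hypothetical commit id maps to the ids of its parent commits.
PARENTS: Dict[str, List[str]] = {
    "c1": [],
    "c2": ["c1"],        # last commit shared by both branches
    "r1": ["c2"],        # bug fix made on the release branch
    "d1": ["c2"],        # new feature on the development branch
    "m1": ["d1", "r1"],  # merge of the release branch into development
}


def ancestors(commit: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Return the commit itself plus everything reachable via parent edges."""
    seen: Set[str] = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return seen


def contains(tip: str, commit: str, parents: Dict[str, List[str]]) -> bool:
    """Tell whether the history ending at `tip` includes `commit`."""
    return commit in ancestors(tip, parents)


print(contains("d1", "r1", PARENTS))  # before the merge
print(contains("m1", "r1", PARENTS))  # after the merge
```

Before the merge, the development tip `d1` does not include the release fix `r1`. After the merge commit `m1`, it does, which is exactly the propagation of bug fixes described above.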

Git goes very far in this direction. It supports branch creation and merging so well that developers create a separate branch for each change they make and then merge it back into the master branch when the development is done. This also makes for a good opportunity for code review. The developer working on the feature or bug fix creates a pull request in the GitHub application, and another developer is requested to review the change and merge it. This is a kind of four-eyes principle applied to code development.

Some revision control systems keep the repository on a server, and every change goes to the server. The advantage of this is that any committed change gets to a server disk that is regularly backed up and is thus safe. Since server-side access is controlled, code sent to the server cannot be rolled back without a trace. All versions, even the wrong ones, are stored on the server. This may be required by some legal regulation. On the other hand, if a commit requires network access and server interaction, it may be slow, and in the long run this will discourage developers from committing their changes frequently. The longer a change remains on the local machine, the higher the risk of losing some of the code, and merging becomes more and more difficult with time. To remedy this situation, Git distributes the repository: the commit goes to the local repository, which is a complete repository just like the remote one on some server. The repositories are synchronized when one repository pushes its changes to another one. This encourages developers to make frequent commits with short commit messages, which helps in tracking the changes made to the code.
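
The difference between committing and synchronizing can be shown with a deliberately simplified model. The `Repository` class below is hypothetical and only mimics the idea. Real Git identifies commits by content hashes and transfers whole object graphs, which this sketch ignores.

```python
from typing import List


class Repository:
    """A toy repository that stores commit messages in order."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.commits: List[str] = []

    def commit(self, message: str) -> None:
        # A commit touches only this repository: no network, no server.
        self.commits.append(message)

    def push(self, other: "Repository") -> int:
        """Copy the commits `other` lacks; return how many were sent."""
        missing = [c for c in self.commits if c not in other.commits]
        other.commits.extend(missing)
        return len(missing)


local = Repository("laptop")
server = Repository("server")

local.commit("Add login form")
local.commit("Fix typo in login form")
print(server.commits)      # still empty: nothing has been pushed yet
print(local.push(server))  # number of commits transferred
print(server.commits)
```

Committing is cheap because it is purely local, and the server sees the work only when it is pushed. That is why frequent small commits cost the developer almost nothing in this model.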

Some older version control systems support file locking. This way, when a developer checks out a code file, others cannot work on the same piece of code. This essentially avoids collisions during code merging. Over the years, this approach has proved not to fit the development methodologies. Merge issues are less of a problem than files that are checked out and forgotten. SVN supports file locking, but not very strictly: unless the server is configured to forbid it, another developer can break or steal the lock and then commit changes to the file that somebody else locked. It is more of a strong suggestion than real locking.

Source code repositories are very important but should not be confused with release repositories, which store the compiled, released versions of the code in binary form. Source and release repositories work together.
