Code Churn

Software systems evolve over time due to changes in requirements, optimization of code, fixes for security and reliability bugs, etc. Code churn measures the changes made to a component over a period of time and quantifies the extent of this change. It is easily extracted from a system’s change history, as recorded automatically by a version control system. Most version control systems use a file comparison utility (such as diff) to automatically estimate how many lines were added, deleted, and changed by a programmer to create a new version of a file from an old version. These differences are the basis of churn measures.
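To make this concrete, the tallying step can be sketched as follows. This is a minimal illustration, assuming input in the format of git's `--numstat` output (added lines, deleted lines, and file path separated by tabs); the sample text is hypothetical, and a real run would feed in the output of something like `git log --numstat --pretty=format:`.

```python
# Minimal sketch: tally per-file added/deleted line counts from
# numstat-formatted version-control output. Sample input is hypothetical.
from collections import defaultdict

def parse_numstat(text):
    """Return {path: (added, deleted)} totals from numstat-formatted lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # skip blank lines and commit headers
        added, deleted, path = parts
        if added == "-":
            continue  # binary files are reported as "-\t-\tpath"
        totals[path][0] += int(added)
        totals[path][1] += int(deleted)
    return {p: tuple(v) for p, v in totals.items()}

sample = "10\t2\tsrc/io.c\n3\t1\tsrc/io.c\n-\t-\tlogo.png\n5\t0\tsrc/net.c\n"
print(parse_numstat(sample))
# {'src/io.c': (13, 3), 'src/net.c': (5, 0)}
```

Per-file totals like these are the raw measures from which the churn metrics below are built.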

Relative churn measures are normalized values of the various churn measures. Some of the normalization parameters are total lines of code, file churn, file count, etc. In an evolving system it is highly beneficial to use a relative approach to quantify the change in a system. As we show, these relative measures can be devised to cross-check each other so that the metrics do not provide conflicting information. Our basic hypothesis is that code that changes many times pre-release will likely have more post-release defects than code that changes less over the same period of time.

In our analysis, we used the code churn between the release of Windows Server 2003 (W2k3) and the release of the W2k3 Service Pack 1 (W2k3-SP1) to predict the defect density in W2k3-SP1. In addition to the directly collected churn measures, such as added, modified, and deleted lines of code (LOC), we also collected time and file count measures to compute a set of relative churn measures [Nagappan and Ball 2005]:

METRIC1: Churned LOC / Total LOC

We expect the larger the proportion of churned (added + changed) code to the LOC of the new binary, the larger the magnitude of the defect density for that binary.

METRIC2: Deleted LOC / Total LOC

We expect the larger the proportion of deleted code to the LOC of the new binary, the larger the magnitude of the defect density for that binary.

METRIC3: Files churned / File count

We expect the greater the proportion of files in a binary that get churned, the greater the probability of these files introducing defects. For example, suppose binaries A and B contain 20 files each. If binary A has five churned files and binary B has two churned files, we expect binary A to have a higher defect density.

METRIC4: Churn count / Files churned

Suppose binaries A and B have twenty files each and also have five churned files each. If the five files in binary A are churned 20 times and the five files in binary B are churned 10 times, then we expect binary A to have a higher defect density. METRIC4 acts as a cross check on METRIC3.

METRIC5: Weeks of churn / File count

METRIC5 is used to account for the temporal extent of churn. A higher value of METRIC5 indicates that it took a longer time to fix a smaller number of files. This may indicate that the binary contains complex files that may be hard to modify correctly. Thus, we expect that an increase in METRIC5 would be accompanied by an increase in the defect density of the related binary.

METRIC6: Lines worked on / Weeks of churn

The measure “Lines worked on” is the sum of the churned LOC and the deleted LOC. METRIC6 measures the extent of code churn over time in order to cross check on METRIC5. Weeks of churn does not necessarily indicate the amount of churn. METRIC6 reflects our expectation that the more lines are worked on, the longer the weeks of churn should be. A high value of METRIC6 cross checks on METRIC5 and should predict a higher defect density.

METRIC7: Churned LOC / Deleted LOC

METRIC7 is used in order to quantify new development. All churn is not due to bug fixes. In feature development the lines churned is much greater than the lines deleted, so a high value of METRIC7 indicates new feature development. METRIC7 acts as a cross check on METRIC1 and METRIC2, neither of which accurately predicts new feature development.

METRIC8: Lines worked on / Churn count

We expect that the larger a change (lines worked on) relative to the number of changes (churn count), the greater the defect density will be. METRIC8 acts as a cross check on METRIC3 and METRIC4, as well as METRIC5 and METRIC6. With respect to METRIC3 and METRIC4, METRIC8 measures the amount of actual change that took place. METRIC8 cross checks to account for the fact that files are not getting churned repeatedly for small fixes. METRIC8 also cross checks on METRIC5 and METRIC6 to account for the fact that the higher the value of METRIC8 (more lines per churn), the higher the time (METRIC5) and lines worked on per week (METRIC6). If this is not so, then a large amount of churn might have been performed in a small amount of time, which can cause an increased defect density.
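The eight relative measures above can be summarized in one small sketch. The function and field names here are our own, not from the original study, and real data would need guards against zero denominators (e.g., a binary with no deleted lines would make METRIC7 undefined). "Lines worked on" is churned LOC plus deleted LOC, as defined under METRIC6.

```python
# Illustrative sketch of the eight relative churn measures described in the
# text; parameter names are our own shorthand for the raw measures.
def relative_churn(churned_loc, deleted_loc, total_loc,
                   files_churned, file_count, churn_count, weeks_of_churn):
    lines_worked_on = churned_loc + deleted_loc
    return {
        "M1": churned_loc / total_loc,           # churn relative to size
        "M2": deleted_loc / total_loc,
        "M3": files_churned / file_count,
        "M4": churn_count / files_churned,       # cross-checks M3
        "M5": weeks_of_churn / file_count,       # temporal extent of churn
        "M6": lines_worked_on / weeks_of_churn,  # cross-checks M5
        "M7": churned_loc / deleted_loc,         # high => new feature work
        "M8": lines_worked_on / churn_count,     # size of the average change
    }

m = relative_churn(churned_loc=400, deleted_loc=100, total_loc=2000,
                   files_churned=5, file_count=20, churn_count=25,
                   weeks_of_churn=10)
print(m["M1"], m["M8"])  # 0.2 20.0
```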

Using the relative code churn metrics identified above, a prediction model with random splitting identifies the failure-prone and non-failure-prone binaries in W2k3-SP1 with 89% accuracy. Further, the defect density of W2k3-SP1 is also predicted with a high degree of accuracy, as shown in Figure 23-1 (the axes are normalized to protect confidential data). Spikes above the solid line indicate overestimations; spikes below it indicate underestimations. The correlation between actual and estimated defect density is strong, positive, and statistically significant, indicating that as actual defect density increases, so does estimated defect density.
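The random-splitting evaluation can be illustrated with a deliberately simplified sketch. The synthetic data and the one-threshold classifier below are hypothetical stand-ins, not the study's actual statistical models, which were fit over all the relative churn measures; the point is only the evaluation procedure of repeatedly splitting into training and test sets and averaging test accuracy.

```python
# Hypothetical sketch of random-split evaluation: repeatedly partition
# binaries into train/test sets, fit a trivial single-threshold classifier
# on one churn metric, and average test accuracy. Data is synthetic.
import random

random.seed(0)  # deterministic synthetic data

# (METRIC1 value, is_failure_prone) pairs -- purely illustrative
data = ([(random.gauss(0.10, 0.03), 0) for _ in range(50)]
        + [(random.gauss(0.30, 0.03), 1) for _ in range(50)])

def evaluate(pairs, trials=100, train_frac=2 / 3):
    """Average test accuracy over repeated random train/test splits."""
    accs = []
    for _ in range(trials):
        random.shuffle(pairs)
        cut = int(len(pairs) * train_frac)
        train, test = pairs[:cut], pairs[cut:]
        # one-threshold "model": midpoint of the per-class training means
        lo = [x for x, y in train if y == 0]
        hi = [x for x, y in train if y == 1]
        thr = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2
        accs.append(sum((x > thr) == bool(y) for x, y in test) / len(test))
    return sum(accs) / len(accs)

print(f"mean test accuracy: {evaluate(data):.2f}")
```

Because the two synthetic classes are well separated, this toy setup reports high accuracy; with real binaries the accuracy reflects how well the churn metrics actually discriminate failure-prone components.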

These results, along with results from prior published research studies (see [Ostrand et al. 2004] and [Nagappan and Ball 2005] for a more detailed list), indicate that code churn and code quality are strongly and positively correlated: the higher the code churn, the more failures in a system. Because code churn is an essential part of new software development (say, new product features), we introduced the relative code churn measures, which alleviate this problem by cross-checking the metrics against one another. Even so, our results make clear that code churn measures need to be used in conjunction with other internal metrics.

Figure 23-1. Actual versus estimated system defect density [Nagappan and Ball 2005]
