So Do We Need More Complexity Metrics?

The results shown in this chapter suggest that for non-header files written in C, all the complexity metrics are highly correlated with lines of code, and therefore the more complex metrics provide no further information that could not be obtained simply by counting lines of code.

However, these results must be accepted with some caution. Header files show poor correlation between cyclomatic complexity and the rest of the metrics. We argue that this is due to the nature of this kind of file: header files contain no implementations, only specifications. We are trying to measure the complexity of source code in terms of program comprehension. Programmers must of course read and comprehend header files, which means that header files can contribute to complexity to a certain extent. However, even though cyclomatic complexity is poorly correlated with lines of code in this case, that does not mean it is a good complexity metric for header files. On the contrary, the poor correlation is due only to the lack of control structures in header files. These files contain no loops, branches, and so on, so their cyclomatic complexity will always be minimal, regardless of their size.

For non-header files, all the metrics show a high degree of correlation with lines of code. We accounted for the confounding effect of size, showing that the high correlation coefficients hold across different size ranges.

In our opinion, there is a clear lesson from this study: syntactic complexity metrics cannot capture the whole picture of software complexity. Complexity metrics that are based exclusively on the structure of the program or on the properties of its text (for example, redundancy, as Halstead’s metrics are) do not provide information about the amount of effort needed to comprehend a piece of code—or, at least, no more information than lines of code do. This has implications for how these metrics are used. In particular, defect prediction, development and maintenance effort models, and statistical models in general cannot benefit from these metrics, and lines of code should always be considered the first and only size-related metric for these models.

The problem of code complexity versus comprehension complexity has been addressed in the research community before. In particular, a semantic entropy metric has been proposed, based on how obscure the identifiers used in a program are (for instance, the names of variables). Interestingly, this kind of measurement is a good defect predictor [Etzkorn et al. 2002].

This does not mean there are no useful lessons to take from traditional complexity metrics. First, cyclomatic complexity is a good indicator of the number of paths that need to be tested in a program. Halstead’s Software Science metrics also provide an interesting lesson: there are always several ways of doing the same thing in a program. If you choose one way and use it consistently throughout the program, you make your code more redundant, which in turn makes it more readable and less complex—in spite of what other statistics might say.
