==Applications==

===Limiting complexity during development===
One of McCabe's original applications was to limit the complexity of routines during program development. He recommended that programmers should count the complexity of the modules they are developing, and split them into smaller modules whenever the cyclomatic complexity of the module exceeded 10.<ref name="mccabe76" /> This practice was adopted by the [[NIST]] Structured Testing methodology, which observed that since McCabe's original publication, the figure of 10 had received substantial corroborating evidence. However, it also noted that in some circumstances it may be appropriate to relax the restriction and permit modules with a complexity as high as 15. As the methodology acknowledged that there were occasional reasons for going beyond the agreed-upon limit, it phrased its recommendation as "For each module, either limit cyclomatic complexity to [the agreed-upon limit] or provide a written explanation of why the limit was exceeded."<ref name="nist">{{cite web| url=http://www.mccabe.com/pdf/mccabe-nist235r.pdf| title=Structured Testing: A Testing Methodology Using the Cyclomatic Complexity Metric|author1=Arthur H. Watson |author2=Thomas J. McCabe | year=1996|publisher=NIST Special Publication 500-235}}</ref>

===Measuring the "structuredness" of a program===
{{Main|Essential complexity (numerical measure of "structuredness")}}
<!-- please update the link when that article is split, as it should be -->
Section VI of McCabe's 1976 paper is concerned with determining what the control-flow graphs (CFGs) of non-[[structured programming|structured program]]s look like in terms of their subgraphs, which McCabe identified. (For details, see [[structured program theorem]].) McCabe concluded that section by proposing a numerical measure of how close to the structured programming ideal a given program is, i.e. its "structuredness".
McCabe called the measure he devised for this purpose [[Essential complexity (numerical measure of "structuredness")|essential complexity]].<ref name="mccabe76" /> To calculate this measure, the original CFG is iteratively reduced by identifying subgraphs that have a single entry point and a single exit point, which are then replaced by a single node. This reduction corresponds to what a human would do if they extracted a subroutine from the larger piece of code. (Nowadays such a process would fall under the umbrella term of [[refactoring]].) McCabe's reduction method was later called ''condensation'' in some textbooks, because it was seen as a generalization of the [[condensation (graph theory)|condensation to components used in graph theory]].<ref>{{cite book|author=Paul C. Jorgensen|title=Software Testing: A Craftsman's Approach|url=https://books.google.com/books?id=Yph_AwAAQBAJ&pg=PA150|year=2002|publisher=CRC Press|isbn=978-0-8493-0809-3|pages=150–153|edition=2nd}}</ref> If a program is structured, then McCabe's reduction/condensation process reduces it to a single CFG node. In contrast, if the program is not structured, the iterative process will identify the irreducible part. The essential complexity measure defined by McCabe is simply the cyclomatic complexity of this irreducible graph, so it will be precisely 1 for all structured programs, but greater than 1 for non-structured programs.<ref name="nist"/>{{rp|80}}<!--note that the original paper has an error here in that it subtracts the number of nodes from that [and thus can give negative essential complexity measures for some non-structured programs], but the later technical report fixed that]-->

===Implications for software testing===
Another application of cyclomatic complexity is in determining the number of test cases that are necessary to achieve thorough test coverage of a particular module.
It is useful because of two properties of the cyclomatic complexity, {{mvar|M}}, for a specific module:
* {{mvar|M}} is an upper bound for the number of test cases that are necessary to achieve complete [[branch coverage]].
* {{mvar|M}} is a lower bound for the number of paths through the control-flow graph (CFG).

Assuming each test case takes one path, the number of cases needed to achieve [[path coverage]] is equal to the number of paths that can actually be taken. But some paths may be impossible, so although the number of paths through the CFG is clearly an upper bound on the number of test cases needed for path coverage, this latter number (of ''possible'' paths) is sometimes less than {{mvar|M}}. All three of the above numbers may be equal: branch coverage <math>\leq</math> cyclomatic complexity <math>\leq</math> number of paths.

For example, consider a program that consists of two sequential if-then-else statements.
<syntaxhighlight lang="c">
if (c1())
    f1();
else
    f2();
if (c2())
    f3();
else
    f4();
</syntaxhighlight>
[[File:Control flow graph of function with two if else statements.svg|thumb|250px|right|The control-flow graph of the source code above; the red circle is the entry point of the function, and the blue circle is the exit point. The exit has been connected to the entry to make the graph strongly connected.]]
In this example, two test cases are sufficient to achieve complete branch coverage, while four are necessary for complete path coverage. The cyclomatic complexity of the program is 3, as its control-flow graph contains 8 edges, 7 nodes, and 1 connected component ({{math|8 − 7 + 2}}).

In general, in order to fully test a module, all execution paths through the module should be exercised. This implies a module with a high complexity number requires more testing effort than a module with a lower value since the higher complexity number indicates more pathways through the code.
This also implies that a module with higher complexity is more difficult to understand, since the programmer must understand the different pathways and the results of those pathways.

Unfortunately, it is not always practical to test all possible paths through a program. Considering the example above, each time an additional if-then-else statement is added, the number of possible paths grows by a factor of 2. As the program grows in this fashion, it quickly reaches the point where testing all of the paths becomes impractical.

One common testing strategy, espoused for example by the NIST Structured Testing methodology, is to use the cyclomatic complexity of a module to determine the number of [[white-box testing|white-box tests]] that are required to obtain sufficient coverage of the module. In almost all cases, according to such a methodology, a module should have at least as many tests as its cyclomatic complexity. In most cases, this number of tests is adequate to exercise all the relevant paths of the function.<ref name="nist"/>

As an example of a function that requires more than mere branch coverage to test accurately, reconsider the above function. However, assume that to avoid a bug occurring, any code that calls either <code>f1()</code> or <code>f3()</code> must also call the other.{{efn|This is a fairly common type of condition; consider the possibility that <code>f1</code> allocates some resource which <code>f3</code> releases.}} Assuming that the results of <code>c1()</code> and <code>c2()</code> are independent, the function as presented above contains a bug. Branch coverage allows the method to be tested with just two tests, such as the following test cases:
* <code>c1()</code> returns true and <code>c2()</code> returns true
* <code>c1()</code> returns false and <code>c2()</code> returns false

Neither of these cases exposes the bug. If, however, we use cyclomatic complexity to indicate the number of tests we require, the number increases to 3.
We must therefore test one of the following paths:
* <code>c1()</code> returns true and <code>c2()</code> returns false
* <code>c1()</code> returns false and <code>c2()</code> returns true

Either of these tests will expose the bug.

===Correlation to number of defects===
Multiple studies have investigated the correlation between McCabe's cyclomatic complexity number and the frequency of defects occurring in a function or method.<ref name="fenton">{{cite journal |journal=IEEE Transactions on Software Engineering|author1=Norman E Fenton |author2=Martin Neil | url=http://www.eecs.qmul.ac.uk/~norman/papers/defects_prediction_preprint105579.pdf| title=A Critique of Software Defect Prediction Models| year=1999|volume=25|issue=3|pages=675–689|doi=10.1109/32.815326|citeseerx=10.1.1.548.2998 }}</ref> Some studies<ref name="schroeder99">{{cite journal| title=A Practical guide to object-oriented metrics|author=Schroeder, Mark|s2cid=14945518|year=1999|volume=1|issue=6|pages=30–36|journal=IT Professional |doi=10.1109/6294.806902}}</ref> find a positive correlation between cyclomatic complexity and defects: functions and methods that have the highest complexity tend to also contain the most defects. However, the correlation between cyclomatic complexity and program size (typically measured in [[lines of code]]) has been demonstrated many times. [[Les Hatton]] has claimed<ref name="taic">{{cite web |url=http://www.leshatton.org/TAIC2008-29-08-2008.html |title=The role of empiricism in improving the reliability of future software |author=Les Hatton |year=2008 |at=version 1.1}}</ref> that complexity has the same predictive ability as lines of code. Studies that controlled for program size (i.e., comparing modules that have different complexities but similar size) are generally less conclusive, with many finding no significant correlation, while others do find correlation.
Some researchers question the validity of the methods used by the studies finding no correlation.<ref name="kan">{{cite book |title=Metrics and Models in Software Quality Engineering |author=Kan |pages=316–317 |publisher=Addison-Wesley |year=2003 |isbn=978-0-201-72915-3}}</ref> Although this relation likely exists, it is not easily used in practice.<ref name=cherf>{{cite journal| journal=Journal of Software Quality| author=G.S. Cherf| title=An Investigation of the Maintenance and Support Characteristics of Commercial Software| year=1992|volume=1|issue=3|pages=147–158| issn=1573-1367|doi=10.1007/bf01720922| s2cid=37274091}}</ref> Since program size is not a controllable feature of commercial software, the usefulness of McCabe's number has been questioned.<ref name="fenton" /> The essence of this observation is that larger programs tend to be more complex and to have more defects. Reducing the cyclomatic complexity of code is [[correlation does not imply causation|not proven]] to reduce the number of errors or bugs in that code. International safety standards like [[ISO 26262]], however, mandate coding guidelines that enforce low code complexity.<ref name="ISO26262Part3">{{cite book | title=ISO 26262-3:2011(en) Road vehicles — Functional safety — Part 3: Concept phase| publisher=International Organization for Standardization | url=https://www.iso.org/obp/ui/#iso:std:iso:26262:-3:ed-1:v1:en}}</ref>