Uncharted Waters

Jun 20 2018   11:33PM GMT

Considering Cyclomatic Complexity

Matt Heusser Matt Heusser Profile: Matt Heusser

Tags:
Code Complexity
Complexity

The cyclomatic complexity of a train stationThe longer the function, the more complex it is, the more likely we are to make mistakes. Knowing that makes measuring code complexity, and making the complexity of functions smaller, seems like a very good idea.

So how do we measure it? The most common measure of code complexity is probably “cyclomatic complexity.”

So what exactly is “cyclomatic complexity?”

The definition both confuses me and puts me to sleep at the same time – and I’m a formally trained mathematician with coursework in graph theory.

Seriously, try to make sense of this. You can just skim; I’m going to give an explanation in English later:

The formula of the cyclomatic complexity of a function is based on a graph representation of its code. Once this is produced, it is simply:

M = E – N + 2

E is the number of edges of the graph, N is the number of nodes and M is McCabe’s complexity.

To compute a graph representation of code, we can simply disassemble its assembly code and create a graph following the rules:

 

  • Create one node per instruction.
  • Connect a node to each node which is potentially the next instruction.
  • One exception: for a recursive call or function call, it is necessary to consider it as a normal sequential instruction. If we don’t do that, we would analyze the complexity of a function and all its called subroutines.

That’s crazy.

Here’s the Matt Heusser GoodEnough™ explanation of complexity, why it matters, and how sometimes, it doesn’t.

Measuring Cyclomatic Complexity

Measuring Cyclomatic ComplexityEvery method starts with a score of one. For each if statement, while, for loop, or other looping control structure, add one. Else-Statements do not add another one. The total number is the complexity of the function, as long as the function has only one entry and exit point, and doesn’t do any functional programming.

That’s it.

If your function has multiple exit points, say, inside of an if statement, well, don’t do that, refactor it into a couple of different methods.

You’re done.

Complexity in Action

Here’s a trivial little program. It’s hello world in C.

#include <stdio.h>
int main()
{
    printf("\n\nHello, World!\n\n");
    return 0;
}

The complexity here is 1 – there is just one path through the program. Now let’s add an “if” statement:

#include <stdio.h>
#include <string.h>
int main()
{
    char strName[100];
    printf("\n\nWhat is your name? ");
    fgets(strName, 100, stdin);
    if (strcmp(strName,"Matt\n")!=0 && strcmp(strName,"Justin\n")!=0)
      printf("\n\nHello, %s\n\n",strName);
    else
      printf("\n\nHello ITKE Writer!\n\n\n");
    return 0;
}

With a single if statement, we have two paths through the program. If there is no “else”, there are still only two paths through the software.

That does not mean we only run two test ideas. If the programmer used scanf() instead of fgets(), for example, any sentence would cut off to the first word. So a name of “Robert Smith” would become just “Hello, Robert.”

Looking at the branches is a good place to start for test ideas, and more complex methods will generally be harder to track, leaving more unexplored nooks and crannies for bugs to sneak in … but not always.

When Complexity Gets Tricky

What’s the cyclomatic complexity of this method?

/* Method get_month inspired by:

   https://www.cqse.eu/en/blog/mccabe-cyclomatic-complexity/ */

char* get_month(int month) {
    char *strMonth = malloc(37);
    switch (month) {
                case 1: strMonth = strcpy(strMonth, "January"); break;
                case 2: strMonth = strcpy(strMonth, "February"); break;
                case 3: strMonth = strcpy(strMonth, "March"); break;
                case 4: strMonth = strcpy(strMonth, "April"); break;
                case 5: strMonth = strcpy(strMonth, "May"); break;
                case 6: strMonth = strcpy(strMonth, "June"); break;
                case 7: strMonth = strcpy(strMonth, "July"); break;
                case 8: strMonth = strcpy(strMonth, "August"); break;
                case 9: strMonth = strcpy(strMonth, "September"); break;
                case 10: strMonth = strcpy(strMonth, "October"); break;
                case 11: strMonth = strcpy(strMonth, "November"); break;
                case 12: strMonth = strcpy(strMonth, "December"); break;
                default: strMonth = strcpy(strMonth, "Please enter a month from 1-12");
    }
    return strMonth;
}

The complexity of the method is 13. 

The function grant, in the badge_granter module of the open source collaboration tool discourse, has a CC number of thirteen as well. You can open the file and check it out; search for “def grant.”

Thirteen is actually reasonable for a legacy application. The grant method really isn’t that bad. Still, it’s a lot more challenging than get_month. Obviously, equal cycolmatic complexity does not mean equally easy to understand and track.

Where To Go For More

I’ve uploaded my sample code to github. The samples include a little shell script called “compile” that will compile and run the sample code using gcc on mac or linux. To compile and run a one of the source files on mac or linux, just type “./compile <fiename>.” You can run the source through the free lizard code analyzer, also on github, as easyily as “python lizard.py hello.c”

So check it out. Get out there and make some commits.

Next we’ll talk about how to use the idea of complexity to make better code.

 Comment on this Post

 
There was an error processing your information. Please try again later.
Thanks. We'll let you know when a new response is added.
Send me notifications when other members comment.

Forgot Password

No problem! Submit your e-mail address below. We'll send you an e-mail containing your password.

Your password has been sent to:

Share this item with your network: