Cover rage, when code quality matters
In this blog post, we’re going to talk about a very sensitive subject. Some of you deal with it all day, every day. Others have no idea what it is. And then there are even some of you who pretend not to know what it is, so you can live in YOLO mode while coding.
Coverage or not coverage? That is THE question!
So here is what we’re not going to do in this blog post:
- Explain to you that code coverage is the miracle KPI to measure your code quality.
- Tell you that you should work 25 hours per day and 6 weeks per month to increase it by 15%.
- Try to convince you that it changed our coding life (even if it did) and that it will change yours.
Here is what we are going to do:
- Give you a non-boring and simple definition of coverage.
- Show you the code coverage of 8 repositories and look in depth at our star performer!
- Propose a new and efficient definition of code coverage.
What is code coverage?
Code coverage measures how much of your code is executed when you run your test suite. A test suite is a series of tests written to verify that your code is doing what it is supposed to do.
Let’s illustrate what this means with a random example: biking. I love to bike. I love to tinker with my bike. I change the wheels, handlebars, the pedals, the colour, and so on. Scientists say we become new humans every 7-10 years because in that time every cell of our body has been replaced with a new cell. I believe my bike becomes a new bike every 1-2 months.
Every time I want to ride my transformed bike I check 4 things before riding:
- The bike can actually move (importance = 55%): I try riding it 10 meters.
- The brakes are working (importance = 25%): I check that the bike cannot go forward or backward when I pull the brakes.
- The bell is working (importance = 10%): I check that a sound comes from my bell when I press the button.
- The light is working (importance = 10%): I check that the light turns on when I click on the light switch.
If every item in the above checklist is checked, it results in a 100% score of importance and I’m good to go!
In this example, we can say that the checklist is the ‘bike test suite’ and the percentage score of importance is my ‘bike coverage’. Let us now say I forget to put ‘The light is working’ in my ‘bike test suite’: it means that we’ll have checked only 3 out of the 4 elements in the checklist, leading to only 90% bike coverage. Got it?
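To make the arithmetic concrete, here is a minimal sketch of the ‘bike coverage’ calculation. The weights come from the checklist above; the names `bikeChecks` and `bikeCoverage` are made up for this illustration.

```javascript
// Each check carries the importance weight from the checklist above.
const bikeChecks = [
  { name: "bike can move", weight: 55 },
  { name: "brakes work", weight: 25 },
  { name: "bell works", weight: 10 },
  { name: "light works", weight: 10 },
];

// Coverage = weight of the checks actually performed / weight of all checks.
function bikeCoverage(checks, performed) {
  const total = checks.reduce((sum, c) => sum + c.weight, 0);
  const covered = checks
    .filter((c) => performed.has(c.name))
    .reduce((sum, c) => sum + c.weight, 0);
  return (covered * 100) / total;
}

// I forgot the light check: only 3 of the 4 checks are performed.
const withoutLight = new Set(["bike can move", "brakes work", "bell works"]);
console.log(bikeCoverage(bikeChecks, withoutLight)); // 90
```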
In that case, everyone should agree on coverage’s usefulness, right?
In the bike example here, bike coverage is based on the score of importance of each functionality of my bike. We can probably all agree that the bike actually being able to move is more important than the light working. But for some of the other elements, you might not agree with the above percentages, or they might vary over time. If you ride at night, a working light might matter more than your bell. If you are a YOLO person, maybe you won’t care about your brakes (disclaimer: Ponicode is not responsible for any bike-related mishaps resulting from this article!).
So how are these “scores of importance” measured when it comes to code?
There are several standards, only two of which will be addressed here (since I promised you this wouldn’t be boring!):
Lines coverage and Branch coverage
I will illustrate both definitions with the function CoverRage.
There are two scenarios for this function:
- If I enter a number divisible by two, I enter the First Branch
- If I enter a number not divisible by two, I enter the Second Branch
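A two-branch function like this could be sketched as follows. This is a hypothetical stand-in: the name `coverRage` and the returned strings are my assumptions for illustration, not the exact listing.

```javascript
// Hypothetical stand-in for CoverRage; the return values are assumed.
function coverRage(x) {
  if (x % 2 === 0) {
    // First Branch: x is divisible by two
    return "first branch";
  } else {
    // Second Branch: x is not divisible by two
    return "second branch";
  }
}

console.log(coverRage(2)); // "first branch"
console.log(coverRage(3)); // "second branch"
```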
Branch coverage is the percentage of scenarios (branches) in the source code that are executed when the test suite (the checklist) is run. Each scenario has the same importance score.
If we get back to our bike example, in the case of branch coverage, the percentages would look like this:
- The bike can ride (importance = 25%)
- The brakes are working (importance = 25%)
- The bell is working (importance = 25%)
- The light is working (importance = 25%)
There are 4 possible scenarios, so each scenario that you check counts as 1 / 4 = 25% towards branch coverage.
The limitation of this coverage is that it does not take into consideration the relative importance of one scenario compared to another. Treating the ability to ride and the light working as equally important clearly makes no sense.
Let’s get back to our example: the CoverRage function. Let’s say there is only one item in the checklist: a number divisible by two (let’s take x=2 for example)
When I run the jest code coverage I get the following result:
There are two possible scenarios and I only check one. So I get 1/2 = 50% branch coverage.
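If you want to see where the 50% comes from without any tooling, here is a hand-rolled sketch that records which branches get hit. The function body and the `branchesHit` bookkeeping are assumptions for illustration; a real tool such as Jest instruments the code for you.

```javascript
// Records which branches were executed at least once.
const branchesHit = new Set();

// Hypothetical two-branch function, manually instrumented.
function coverRage(x) {
  if (x % 2 === 0) {
    branchesHit.add("first");
    return "even";
  }
  branchesHit.add("second");
  return "odd";
}

const TOTAL_BRANCHES = 2;

// The whole "test suite": a single check with x = 2.
coverRage(2);

const branchCoverage = (branchesHit.size * 100) / TOTAL_BRANCHES;
console.log(branchCoverage); // 50
```

Adding a second check with an odd number, for example `coverRage(3)`, would hit the remaining branch and bring the figure to 100%.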
Lines coverage is the percentage of lines in the source code that are executed when the test suite (the checklist) is run. Lines coverage is widely used in the coding industry.
Let’s get back to our example: the CoverRage function. Let’s say I only have one item on my checklist: a number divisible by two (let’s take x=2 for example)
When I run the jest code coverage I get the following result:
7 out of 8 lines are executed in the file so we get a Line coverage of 7/8 = 87.5%
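The same calculation as a minimal sketch. The hit counts below are made up: only the 7-out-of-8 ratio comes from the result above, and which line was missed is an assumption.

```javascript
// Hypothetical per-line hit counts for an 8-line file after one test (x = 2).
// A count of 0 means the line was never executed.
const lineHits = [1, 1, 1, 1, 1, 1, 0, 1];

// Lines coverage = executed lines / total lines, as a percentage.
function lineCoverage(hits) {
  const executed = hits.filter((count) => count > 0).length;
  return (executed * 100) / hits.length;
}

console.log(lineCoverage(lineHits)); // 87.5
```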
The limitation of lines coverage is that it considers every line with the same importance but in real life, some lines of your code are much more important than the others.
And for those of you with sharp eyes, I get this code coverage testing with Ponicode 😉
Code coverage of our favorite repositories
At Ponicode we evaluate the performance of our AI using repositories that are highly representative of all repositories.
After running jest --coverage on these repositories we get the following results (I can only show you the results for 8 of them). As a team, we decided to display only the name of our star performer’s repo (Strapi) and put the others in the witness protection program by giving them the names of our favourite Dragon Ball characters.
If you are the lucky owner of one of our Dragon Ball repos and recognize yourself through the character or coverage, we would love to have a chat with you because we work very often with your code! Feel free to contact me at firstname.lastname@example.org
As you can see, the average branch and lines coverage is pretty low (14% and 20%) with a high standard deviation.
Imagine what that would mean for our bike example: if my bike had 20% coverage, it would mean that we’ve only checked that the bell and the light are working, without checking the brakes or the riding ability of my bike! Of course, that interpretation assumes that branch and line coverage represent a relevant score of importance, which is not always the case.
Big thumbs up to our friends at Strapi, who are by far at the top of the rankings in terms of branch coverage and lines coverage. Strapi is a new super open source CMS that is great for blogs and e-commerce projects.
If you want to see the results for yourself, you can run the following commands in your terminal:
git clone https://github.com/strapi/strapi
cd strapi
npm install jest
npx jest --coverage
The table of coverage will look like this:
At Ponicode, we believe in unit tests as much as we believe in the importance of measuring their effectiveness on your code.
Since lines coverage and branch coverage are not ideal, we are constantly evaluating other metrics to score the importance of each part of your code.
We see two smarter alternatives today:
- Dynamic weighted line coverage: While your code is running in production, Ponicode could calculate how frequently each line of your code is called. Line coverage would then be calculated by weighting each line by the number of times it is called.
- Static weighted function coverage: We compute the relationship between the functions of your code. Each function of your code is related to other functions of your code. We calculate the number of relations per function. We then calculate the line coverage by weighting each line by the number of functions related to that line of code.
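Both ideas boil down to the same formula with a different source of weights. Here is a rough sketch under stated assumptions: the `weightedLineCoverage` helper and the sample numbers are invented for illustration and are not Ponicode’s implementation. For the dynamic variant, `weight` would be the production call count of a line; for the static variant, it would be the number of functions related to that line.

```javascript
// Weighted line coverage = weight of tested lines / weight of all lines.
function weightedLineCoverage(lines) {
  const total = lines.reduce((sum, l) => sum + l.weight, 0);
  if (total === 0) return 0; // no weights at all: report 0 instead of dividing by zero
  const covered = lines
    .filter((l) => l.tested)
    .reduce((sum, l) => sum + l.weight, 0);
  return (covered * 100) / total;
}

// Made-up data: weight = how often each line was called in production.
const sampleLines = [
  { line: 1, weight: 900, tested: true },
  { line: 2, weight: 90, tested: true },
  { line: 3, weight: 10, tested: false },
];

console.log(weightedLineCoverage(sampleLines)); // 99
```

With plain lines coverage the same file would score 2 out of 3 lines, about 67%, even though the untested line is almost never called.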
What do you think? Do you have another even more brilliant way of defining coverage? Do you use low-code platforms to increase your code coverage? Contact me and let’s chat! email@example.com
Special thanks to Alexandre Bodin from Strapi who inspired me to write this post!
Edmond Aouad, data scientist and cofounder at Ponicode
P.S. stay tuned if you want to know who our Dragon Ball repos are…contact us if this message makes you sweat 😉