Balancing test coverage without chasing 100 percent
Welcome to No Compromises. A peek into the mind of two old web devs who have seen some things. This is Joel.
And this is Aaron. One of the many things I like about Joel is his passion and support for testing, and that's a thing we talk about a lot. And I think if you listen to some of our podcasts, read our tips and stuff, you might even get the idea that it's 100% of the things that we think about, or that we go super in-depth into it. Which, while true, we still want to balance that with moving our project forward. So today I wanted to talk a little bit about, when we're doing testing, what are the areas we should focus on in broad strokes? And when do we personally start drawing the line on what's too much? So we can talk a little bit about code coverage, but we can also talk about the types of things that we test.
I just want to say whenever you start out, "The thing I like about Joel is," I immediately feel it's some sort of trap. But you turned it around and then it felt sincere so thank you for that.
You're welcome. I think the first thing I wanted to talk about is the thing that gets thrown around a lot when we talk about testing, which is code coverage. Just to give a little high level on what that means: code coverage is a metric that is supposed to indicate the amount of your code that has been run through the testing framework. It doesn't necessarily mean that every scenario in that code has been tested. A lot of times it just means that the code has been run in the course of one test, and that test could simply be, "Run this code and then say you're good."
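To make that concrete, here is a minimal sketch (not from the episode; the function and values are made up) of how a single happy-path test can mark lines as "covered" while leaving a whole branch unexercised:

```php
<?php
// A function with two branches.
function discount(float $total): float
{
    if ($total >= 100) {
        return $total * 0.9; // 10% off large orders
    }

    return $total; // small orders pay full price
}

// This single "test" only runs the large-order path. A coverage report
// marks those lines as executed, but the small-order branch is never
// exercised at all.
assert(discount(150.0) === 135.0);
```

A coverage tool would happily report the `if` branch as hit here, which is exactly the "it ran once" signal being described, not proof that every scenario was tested.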
True equals true.
Yeah, that's literally a covered test. While we don't want to focus on that too much, it is still a good, usable metric. What's the sort of number you would aim for with code coverage, Joel?
I view code coverage not necessarily as a metric to be reached, but more as a sign of where I for sure have not written tests yet. Like, if I see that something is covered I don't assume it actually is, but if I see it's not covered I can trust that. When we're adding test coverage to a project we joined, and maybe that's what they want us to help with, it's a really useful guide for that. Like, "Oh, this whole folder of controllers doesn't have a single feature test touching it, so I know I have some work to do there." But that's how I look at it. It's more like, what haven't I done? And I trust it less for what has been done.
I would say I agree with that. I use it as an indicator even on individual files. Not every single file, I don't go that in-depth. But let's say you're looking at middleware, for example. You know you have some middleware in the project and it does its thing, but you don't really think about it because it's in the background doing something else. Maybe you go look at the coverage and see the middleware folder is at 25%, and you're like, "Oh, that's interesting." So you look in there and see a custom middleware you wrote: "I'd even forgotten I wrote that. I needed X, Y, Z to happen, I did it once, and now I never think about it." You click into that file and see it's only 10% covered, maybe because you've forgotten about that code you wrote so long ago, and since it lives in a far-off place it's not front of mind. You see that you've only written a happy path and that you're not really testing anything in that code. And I use it, like you said, as a pointer to the things I'm missing.
And middleware is a great example, because a lot of times it will actually show tons of coverage just because it's attached to a route and you tested that route. But maybe none of your test logic has anything to do with the logic in the middleware. That's one where I really see a false sense of, "Oh yeah, we got this covered." It's like, "No, you didn't test anything about the middleware. You just happened to call a feature that used the middleware." But when you said happy path, the thing I visualized in my head was literal conditional blocks of code. You know, an if-else somewhere in there, or a try-catch, where it went down one path but there are other paths that, if you look at the line-level coverage, nothing hit. Again, that's another great signal. Like, I've got to jump in and write something to at least hit that path.
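A sketch of what "testing the middleware on purpose" might look like in a Laravel feature test. This is an assumption-laden illustration, not code from the hosts' projects: the middleware behavior, the `active` column, and the routes are all hypothetical.

```php
<?php

namespace Tests\Feature;

use Tests\TestCase;

class SuspendedAccountMiddlewareTest extends TestCase
{
    public function test_inactive_account_is_redirected(): void
    {
        // Exercise the unhappy path the middleware guards. A normal
        // feature test for /dashboard would only hit the happy path,
        // "covering" the middleware without testing its logic.
        $user = \App\Models\User::factory()->create(['active' => false]);

        $this->actingAs($user)
            ->get('/dashboard')
            ->assertRedirect('/account/suspended');
    }
}
```

The point is simply that the assertion targets what the middleware itself decides, rather than letting coverage accrue as a side effect of some other route test.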
Well, it feels that you did that thing that people do when they don't want to answer the question and then they talk about something else.
I don't know what you're talking about.
What's the code coverage percentage that you aim for?
Oh, I don't have one. I will say, for the project as a whole or folder by folder?
I don't know. I'm just leaving it open for you.
Okay. Project as a whole, I really don't... I can't think of any of our projects where we even have that published as a badge that we look at. It really is a foreign question for me. If you were forcing me to answer it, I'd want to see higher than 75%. That feels like a very attainable goal without getting into the area where you're trying to test things just because you couldn't hit this one helpers file, or something. You know what I mean? That's a reasonable minimum; I don't know what a reasonable high-end goal would be. The controllers folder I would want to see much higher than that. But is that close enough to an answer for you?
Well, I wanted to push on that because you've got to remember that different audiences have different requirements. Even though we know that number isn't necessarily important, I'm sure there are programmers listening who are in a corporate setting where they have to have a metric. And sometimes you just have the metric, so I'm going to answer that question from my experience working as a big-box programmer.
Please.
The amount of code coverage you want for your entire project is as low as you can possibly get anyone to agree to, but then never go lower. The reason I bring that up is because it's a good metric to use slightly differently. People look at that metric and think, "Oh, it's meant to be code coverage of the entire project," or whatever. I look at it as a baseline: I have reasonable certainty that 50% of my project at least runs without fatal errors, and I can feel comfortable about that 50%. Hopefully that stuff is tested, but at least I know it runs. So I want to know that at least 50% of my project code, even brand new code, is covered. I maybe don't need to know it's exactly the same code all the time, but I want a baseline that says, "Oh, I have something." So I want to push that as low as possible and then slowly start walking it up. And the reason I start that low is because when you first start out, it's an insurmountable amount of work to get any sort of coverage, right?
Yeah.
So we want to get to a point where we can generate that first 25, that first 50%. Because of the way the code runs, you can actually get there pretty fast without having a lot of value in the tests, we'll say. That's not where you want to stay forever, but sometimes you have to play with those metrics too. That's why I wanted to dig in there: personally, when given a choice, I don't necessarily care what that number is. But if forced, I want to set that number low and then move it up along with the quality of the code, the quality of the coverage, and the reduction in bugs that we're able to show. You can use that as a negotiation tool as well: "Okay, we have 50% coverage and we have, on average, a 2% bug rate. As I move up this coverage, is my bug rate going lower?" If so: "Corporate, you need to support us writing more tests." That's how those metrics are used.
I like that. Because a lot of times the thing I think about with metrics is that they're easy to game, and you kind of alluded to that. But what you ended on was actually using them to make strategic decisions about how the project is run, and I like that. If everybody's being honest about it, the more data the better to guide your decisions. And I like the idea of "as low as possible, and then never lower." Because that addresses the issue of, "Oh, somebody just went crazy with ChatGPT and added 30 files and none of them are tested," you know what I mean? That would drop the number, and now I want to pay attention to it and be like, "No, you can't just dump that much code with no tests on it." I like that approach.
We've talked a lot about this coverage being... the code is just being run, it's not necessarily testing something specifically. The second question I wanted to dig into is, what are we actually going to test? And how do we know we've gone far enough when we write this code? I want to break that up into a couple of different disciplines. There are different disciplines in the way we look at programming, and generally we refer to them as backend or frontend, or logical and visual, things like that. And obviously as we program, we jump between them. But in general, that's the point where I tend to start drawing a line: when I jump from one discipline to another, because usually each of those has its own set of tools as well. So when we're talking about testing, where do we draw the line? I'm not saying not to test the other stuff; I'm saying that might be the point where we stop with one set of tooling and move to the next. To put that in a more concrete example: when we're loading up a page and we have some model data on there, maybe two or three different models that we're sending to our view, I might draw the line at checking that the proper models made it to my view, that they've been passed to Blade. But then I don't necessarily test that it's showing the person's name on the screen, or, if it's a user model, that it's showing the address field. I might check those if they're important, but I might also just leave that to a different testing tool. Because logically, I want to draw the line between what is logical testing I can do in PHP and what is more visual testing.
And I agree with that demarcation. I would say, for us on our projects, where we might assert something was shown is if there's logic around it. If this Boolean flag is set to true, then show the person's billing address, or whatever; we might test that with the tools Laravel's test helpers give us. But in general practice, if we're just echoing out five fields on a model, we're not going to try to test that from Laravel.
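A hedged sketch of that demarcation in a Laravel feature test: assert the logic around what's rendered, not every echoed field. The route, the `show_billing` flag, and the field names here are hypothetical, not from the hosts' codebase.

```php
<?php

namespace Tests\Feature;

use Tests\TestCase;

class BillingAddressVisibilityTest extends TestCase
{
    public function test_billing_address_shows_when_flag_is_set(): void
    {
        $user = \App\Models\User::factory()->create(['show_billing' => true]);

        $this->actingAs($user)
            ->get('/profile')
            // Logical check: the model actually reached the view.
            ->assertViewHas('user')
            // Conditional check: the branch guarded by the flag rendered.
            ->assertSee($user->billing_address);
    }
}
```

Everything past this, like how the address is laid out or styled on screen, would fall to a browser-level tool such as Dusk or Cypress rather than PHPUnit.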
And of course, there's Dusk and tools like that too. But even though those tools live in our Laravel project, I still consider them a separate discipline in a way. I mean, I'm using PHP to write them, but the way I'm testing things there is different from the way I logically think about testing stuff with PHPUnit or Pest tests.
For sure, I agree. It's a nice convenience, but it's kind of covering over the fact that this is a totally different way of testing. It's just trying to make it accessible to the dev that only wants to be in PHP land.
So we have to make sure we draw the proper line too. Saying, "I know this has code coverage, I know it ran, so I don't have to do anything else"? No. "Well, then I need to address every logical thing that happened here: check all my session data, look at the view, inspect the HTML"? Well, no, that's too far. There's a middle area. Have we presented the proper information? Did the logical steps happen on the backend? Are there other logic-based things happening that we control from Blade or from PHP? Sure, test those. But if it's something like JavaScript functionality, reach for something like Cypress or Dusk and do those tests separately. That's a different discipline, like I said, a different type of scenario. It doesn't necessarily count toward my coverage, in my book.
I agree. The takeaway I'm getting from this discussion, Aaron: the first thing was know what the metrics mean when we're talking about coverage. Use them wisely, but don't get overconfident just because the number is good. And the second piece is balance, and I think this is something developers struggle with in general. Balance in how far you go with testing: how far should I even go with this tool, and when does it make sense to reach for a different tool to test the thing I'm actually trying to test?
You got it. I know we've talked about other things that make us awkward and anxious in the real world. Like when you don't buy anything at the store, and then you leave and go through the scanners and you're like, "I know I haven't stolen anything, but have I?" That sort of thing. But here's another thing I don't understand. I feel like I'm not the only person who has this problem, but I don't understand why it happens. It's when you pay for something with cash. First of all, I don't know if everyone knows what cash is. That's physical money. Do people still use that?
I saw that in a museum, I think.
If you pay with cash and they give you change, bills and coins, they hand it back to you with the paper at the bottom and the coins loose on top, and then it's like, "Thank you. Goodbye. Please leave." I'm standing there with my wallet in one hand and a stack of money balancing precariously in the other, and there's pressure to leave. Like, hold on a second. Why don't you give me the coins first so they go in my hand, and at least I can grab the bills angrily as I try to run away and get my groceries? But no. When the coins are on top, you have to be careful. If you close your hand on them, they could slide out sideways. So you hold your hand up and maybe tilt the coins into the other hand. But wait, that's where my wallet is right now, and coins aren't where my wallet goes. So I get to put the wallet on the... ugh.
You're stressing me out, man.
Sorry. Do you know what I'm talking about or no?
I do. I don't use a lot of cash, and I would just say, first of all, you brought this problem on yourself by using cash. But I can relate, and the more you explain it, I'm like, "Oh yeah, I hate that."
That's been the best part about using electronic payments: not dealing with that. Anyway, moving from 0% test coverage to 1% is the hardest step.
So that's something we can help you with. We've helped other companies do this, give us a call. Head to our website nocompromises.io and then book a call. Link is right there.