Why we have a separate test suite for external services

Welcome to No Compromises. A peek into the mind of two old web devs who have seen some things. This is Joel.

And this is Aaron.

One of the things that I like about the way we write tests, and I don't think I've ever joined a project that does it quite like this, is we segregate out any test that's going to talk to the real world into its own test suite. We call it external and they're used very sparingly. And then anything in the normal feature test or integration test that would normally talk to AWS or whatever, we'll have a mock in place for that. Did I explain that good enough, Aaron? What do you think?

I'm going to pretend like I don't know what you're talking about. Why would you ever hit external services? What are you talking about, Joel?

Oh, okay. I thought you were going to go the other direction. "Well, why wouldn't your normal tests hit external services?" Because that is a point of view.

Well, I'm going to ask that too so you can answer either of those.

Okay. Ooh, I'm on the hot seat already. Why would we ever hit external services? Well, if you don't then you don't actually know if what you wrote works. Let's just take Stripe as a classic example. Stripe has an awesome sandbox and you're probably going to drive the app and make some test transactions. So, this is just an automated way of doing that. It's not something we lean on heavily, but generally, any API we touch in an external service like that, we'd have at least one or two tests to cover maybe the happy path and an expected failure mode. Like in the case of Stripe, a payment was declined or something like that.

Okay. Let me make sure that I have this clear. For most of the test setup, anything that would hit a third party, you mock out as close to the third party as you can. Like, a service or even HTTP calls, you'll mock out. And then you have a separate set of tests, or is it a separate test suite, or a separate XML file? I mean, how do you-

Oh, you're leading me down this path, Aaron?

Yes. Well, there's a lot here that you haven't necessarily explained yet that one might not understand.

I appreciate that, thank you for being dumb for me. I think that's great. Yes, when I say test suite, like if you open up phpunit.xml you'll see there are test suites in it. Out of the box with Laravel, you have feature and unit, and those basically map to two top-level folders in your tests folder. And when you run PHPUnit, it will run all of the test suites. And you could pass command line arguments to say, "I only want to run this one test suite." But what we've done is we actually have a second PHPUnit XML file, and we call it, just by convention, phpunit-external.xml. And in there, we define a single test suite called external, which maps to a tests/external folder. And the reason we really keep it separate is because it also has to do with environment settings. I don't know how far we want to go down this side rabbit hole. But the other thing we do that maybe is not typical is we explicitly block any env settings that would allow communication with the outside world, like an API key. When I say communication, I mean successful communication. The AWS access key we'll set explicitly to "do not use" so if we ever let a call slip through in our normal feature tests, it would fail because that's not a valid access key. But that's the mechanics of it. We keep it separate and then obviously the external ones have to have real env keys in there, and we just let that fall through to our local .env file.
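[Editor's sketch] A second configuration file along the lines Joel describes might look roughly like this. It is a hypothetical illustration, not the hosts' actual file: the schema path, the bootstrap path, and the casing of the folder name are assumptions. The one deliberate choice is that no API keys are overridden, so real values can come from the local .env file.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- phpunit-external.xml: hypothetical sketch of a separate config for external tests -->
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="vendor/phpunit/phpunit/phpunit.xsd"
         bootstrap="vendor/autoload.php"
         colors="true">
    <testsuites>
        <!-- The only suite in this file; it maps to the tests/external folder -->
        <testsuite name="external">
            <directory>tests/external</directory>
        </testsuite>
    </testsuites>
    <php>
        <env name="APP_ENV" value="testing"/>
        <!-- No API keys are set here, so real credentials fall through from the local .env file -->
    </php>
</phpunit>
```

With a file like this in place, the external suite would only run when someone asks for it explicitly, for example by pointing PHPUnit at it with its --configuration option.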

Well, yeah, that makes sense. Then for CI or locally or whatever, is the project set up that it runs the first PHPUnit file or the second one? Like, is the second one somehow chained to the first one?

No.

What's the point of all this? I mean, I did it for the-

We haven't actually got to the question I wanted to talk about, but this is good context.

Okay, I get the local env. The env, I get that. Okay, we want to keep them separate, but why not just have a folder called external and have it, I don't know, tighter? You know, just do your coding better.

The main reason is that we use the PHPUnit XML to override our env settings. Like, in Laravel, maybe the other way that I've seen it done is you have like .env.testing and it kind of swaps in that entire file. We like using the PHPUnit configuration, number one, because it's a convention that exists in PHPUnit, and number two, it's selective. It doesn't swap out the whole file, we just override like, "I just want APP_ENV to be testing. I just want BCRYPT_ROUNDS to be 4," or whatever. You can very clearly see the only things that have changed. But because we're using that mechanism to squash any potential outside access keys in that file, that's the main reason we do it. There could be some trickery in PHPUnit to do it a different way, but this just seems simplest to me. The other thing, getting back to the topic I had in mind today, is that keeping them completely separate also makes it a lot easier to control when we run them. When I'm in a development mode and I'm running locally, or, I would even argue, in our normal CI pipeline on a pull request, we don't want to run those external tests. They are slower, they're hitting a third-party system.
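[Editor's sketch] The selective overrides described here could look something like the following in the main phpunit.xml. This is a minimal sketch under assumptions: the suite folders follow Laravel's default layout, and the AWS variable names and the exact "do-not-use" string are illustrative guesses at what gets blocked, not the hosts' real configuration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- phpunit.xml: hypothetical sketch of the normal (non-external) test configuration -->
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:noNamespaceSchemaLocation="vendor/phpunit/phpunit/phpunit.xsd"
         bootstrap="vendor/autoload.php"
         colors="true">
    <testsuites>
        <testsuite name="Unit">
            <directory>tests/Unit</directory>
        </testsuite>
        <testsuite name="Feature">
            <directory>tests/Feature</directory>
        </testsuite>
    </testsuites>
    <php>
        <!-- Only the listed settings are overridden; everything else still comes from .env -->
        <env name="APP_ENV" value="testing"/>
        <env name="BCRYPT_ROUNDS" value="4"/>
        <!-- Assumed variable names: deliberately invalid credentials, so a call that
             slips past a mock fails instead of reaching the real service -->
        <env name="AWS_ACCESS_KEY_ID" value="do-not-use"/>
        <env name="AWS_SECRET_ACCESS_KEY" value="do-not-use"/>
    </php>
</phpunit>
```

Because every override sits in one short list, a reader can see at a glance which settings differ from the local environment, which is the selectivity Joel is pointing to.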

You could have limits.

Right, exactly.

You could have limits on those third parties. Okay, so it's separate.

Yeah. Do you want to talk? What was the other question you said you were going to ask me about? Oh, why don't we just run them all the time in our normal test suite? I think we just answered that.

So, when do you run them then, I guess? It sounds like you just told me when we don't.

Right. Well, no, that's what I wanted to talk through with you today. Because this was actually a discussion we started having and the discussion was prompted by a real-world bug in production that would have been caught had I run the external test suite and I didn't, and it failed. And it was not a huge deal, but it just got me thinking like, "Man, should we be running these more often?" Because clearly, we're not running them enough. So, we were going to hash out on the podcast for everybody today what our take is on this.

Yes. The first thought that sprang to my mind right away, and I decided that I should bring it up and see if you can talk me out of it, is that it should be attached to pushes on the main branch. Whenever anything goes to the main branch, it also runs those external tests, because main is what we then use to deploy to our production, or you could be tagging off it. Point is, there's one branch that's going to production and maybe that's where we should run it. The only thing that makes me a little reluctant to agree with my own thought is the times when maybe you're trying to do a quick bug fix that has nothing to do with a third party and you have to do it over and over and over. But I guess we could just probably write better GitHub Actions to have some sort of option to skip the run on certain commit messages. But at the same time then, do we really want to set up a scenario where we allow ourselves or other developers to skip a process in our CI pipeline?

I'm actually going to go back before I get into that. I want to address one thing you said, which is that you have the sufficient knowledge to know if a particular PR is going to break an external dependency or not. Because in this case, I was not working on the external dependency. In fact, I had merely done a composer update, a very targeted composer update, to solve one particular unrelated dependency issue. But there was a shared dependency between what I updated and the SDK, I think it was a geolocation service or something, and it actually broke the contract for HTTP messages under the hood. It was a very subtle thing. It wasn't, "Oh, I'm adding a new Stripe endpoint," or, "I'm changing how it works." I agree that you would definitely want to run external tests and I would like to think if that's what I had been doing, I would've run them locally and caught this. But I wasn't, this was the smallest of maintenance bumps in the world to solve kind of a flaky other issue that I won't get into. But it caught me by surprise for that reason.

Yeah, I guess that's a good idea: whenever you run a composer update or whenever the composer.lock file changes, maybe that's another time that they should be run.

Okay, that heuristic would have caught this particular one so you're right. If you're directly working in a feature, talking to a third party, or you're doing something potentially unexpected, like a bump in the composer.lock file, and you're just not sure what it affected. I could get behind that.

Or I should stop thinking like, "I'm running a team of 50 developers and we just say it's always on main."

It's always, do it.

And if there's a real big problem, you can comment it out for a temporary time while you're trying to fix stuff.

Yeah. Just a question to push back a little bit is, like in our case, for everyone's knowledge we typically have two long-running branches, we have main and we have develop. And the develop branch usually tracks a QA, staging server, whatever you want to call it, and then main tracks production. Why wouldn't we do it in develop? Wouldn't we want to know before we were trying to push our production to deploy?

Good question.

Or is it just the frequency?

Yeah, I think maybe it's the frequency. But, I mean, that's a good question. I don't know.

You'd be open to it, it sounds like.

Yeah. Well, this is actually a good example of if you ever wondered how Joel and I make decisions, this is what happens.

I also want to point out, I think when I first brought this up to you, you had thrown out an idea which I rejected I believe. Which was, "Why don't we run it on a schedule?" Because you can do that in GitHub. You can say, "I want to run this every night," or every Friday, or whatever. And the reason I pushed back immediately is because, well, that wouldn't have helped me in this case. It's not like this thing would've been sitting there between when I coded it and when it got deployed and overnight it would've run and caught it. But there might even be some value to that too. I just think as we're talking through this, the immediacy of-

There's probably better tools out there to check the functionality on a schedule of your application versus... I mean, this checks your coupling to that external service, it doesn't necessarily check the full functionality of your product. So really it would have to be both tests and they should both be running at the same time. But there's other tests out there, for example, based off the schedule where you could test your API or the three processes or three steps, user experience steps, to see if your application's still working on schedule.

Like a more sophisticated health check. Like, it's running through the actual most important part of your app, whatever that is, periodically.

Yeah. I mean, that's probably not a great idea. I mean, to schedule the function of it.

Yeah, we don't do that on any projects right now. Okay, the schedule thing, it seems like you've soured on that and we're kind of landing on... For sure, I think I'm on board with doing it in main, and then maybe we just try doing it in develop, and as long as it's not slowing us down needlessly, as long as we're not hitting API limits. Again, we're not a huge team, 50 devs cranking out 20 PRs a day. I think that's probably a pretty good spot to start.

So, I had a pretty humbling experience a couple of days ago.

Go on.

I decided to reprogram a garage door opener to my vehicle. And I think it was on a Sunday or Saturday or one of these days. So, it's a weekend, I'm just doing my weekend stuff, I'm not really planning on leaving the house, all that kind of stuff. I go down to my vehicle and I start programming the remote. And the garage door is open, context is good to know as the rest of the place is locked up. I think you know where this is going.

Oh boy, I do now.

I have the garage door opener and I'm programming it to the vehicle. Well, apparently you have to hold the button down to send a signal and et cetera. Well, what I learned later was that the garage door opener had enough juice to shut the garage door, but not open it again.

What? It was right on the edge.

Yes. I'm sitting there so I'm holding the button to program the thing. Sitting in my vehicle, I'm holding the button to close the garage door. It starts closing, and the vehicle's supposed to triple beep or something like, "Hey, you got it programmed," and it's just not. And I'm like, "Uh-oh." So, then the garage door is shut, I'm sitting there and I realize, "Okay, I'm sitting, I have keys for the vehicle but that's it." I have shorts on, I don't have a wallet, my phone is inside too. This is-

Oh, Aaron.

Yeah, I had just run out to do... I'm like there's no way that this is going to be a problem. And I even have an emergency key in my wallet, but I don't have my wallet with me. The clicker thing isn't working anymore, I'm like, "Uh-oh." I'm just mashing the button to open it up, I'm just like, "Maybe this is going to work." I'm like, "What do I do?" So, I drove over real quick to our apartment complex offices, but it was five or six so they're all gone. I'm like, "Well, if I do need to call the emergency place to let me in, I don't even have a phone." So, I have to get their phone number and then find a phone somewhere and be like, "Hey," like we used to do back in the day.

It's like the 80s again.

Yeah. You know how weird it would be these days that someone stops and says... it's weird when someone says, "Hey, can I borrow your phone?"

Back then there would've been a payphone with a phone book in it though so you'd have been just one step ahead.

So anyway, I'm panicking. I'm like, "What do I do now?" And to top it all off, this is a little too TMI, I really had to use the bathroom.

Okay.

So, I was starting to think, "What do I do here? What do I do?" At last, I had a thought. I was like, "Hey, maybe I'll go over to Ace Hardware," I don't know. I have my clicker thing with me, I can drive the vehicle. So, I went over there and I just happened to see the manager of Ace Hardware grabbing carts. And I drove up and I told him everything that happened. I'm just like, "Hey man, here's what happened. Here's what I think," because, I mean, at this point I'm just assuming it's the battery that died. I don't know. I'm like, "And I don't have any money to pay for a new battery." I'm like, "I realize what this sounds like. I pull up with this vehicle and I'm like, 'Hey, I need a new battery but I don't have my wallet or my phone. I don't have anything.' I'm going to owe you huge, man. What can I do to get a battery?"

So?

Dude helped me.

Oh, customer for life.

Yep, it's that Midwest nice, I guess. But he went and got a battery and took apart the thing, put it in, drove back and I told him like, "Hey, I'll be back after this and make it right with you." He's like, "No, no, no." And I mentioned again, I was like, "Oh no, I'm going to come back and pay for this and grab you a soda or something, man." He's like, "Oh, no, no, no." He's like, "Just pay it forward." I was like, "All right." Now I have myself a little challenge too, like how do I pay it forward? Maybe people will tweet to you, Joel, and tell me how I should pay it forward.

Well, I'm right here. You can buy me something.

I don't really care what you think. Joel's been bragging about his consistency writing the Daily Tips newsletter. There's 300, 400 entries, I don't know how many. I'm not really impressed.

It's not bragging, it's pride. But if you want to check out what we've been writing, head over to masteringlaravel.io and sign up for the daily email. Or you can browse the archive of all of those hundreds of amazing tips that I've written.

No Compromises, LLC