Don't get overwhelmed by errors in your application
Welcome to No Compromises. A peek into the mind of two old web devs who have seen some things. This is Joel.
And this is Aaron.
If there's one topic that Aaron cares deeply about, it is error tracking. Or it's not. But I had an idea for a topic today and maybe it's boring. I might be weird but I enjoy looking at a nice dashboard to see the health of my app. And I know there's a lot of different tools for this, I'm just going to talk about one that we use, which is Sentry. And specifically two different features it has: one for unhandled errors and another one for performance. These topics might not be the most exciting but as we've joined projects... first of all, we've seen a number of projects that don't have something like this set up. So I think the base layer is, "Why would you set something up like this?" But then the other thing is, how do you use this tool? Because especially if you introduce it to a long-running Laravel project, when you first turn it on you might actually find lots of problems. And knowing how to triage that and how to prioritize that amongst your other new features or things that you wanted to do before you knew you had all these other problems, I think there's a little bit of an art to that. And we have some experience that maybe would be useful to you out there.
Yeah. There are multiple levels and safeguards for performance and errors. We're kind of looking at it as if you're starting from scratch, which doesn't mean your product or your project is brand new. But you don't necessarily look at all your logs, or maybe you're not even keeping them in a place where you can see your errors. Maybe you don't have the best test coverage, and you haven't turned off lazy loading in tests like we do sometimes. Maybe you use Telescope or something like that only when there's a problem or when you want to look at something, so you don't necessarily have a workflow in place to know if you actually have bugs. You're basically relying on users to tell you. Or your application might be slow, but everything works fine locally and no one has told you any differently, so you don't know if it's slow or users have just gotten used to it. That is the absolute best place in a project to make use of something like Sentry.
I agree. I think it probably has some use early in a project's lifecycle but it really shines once it's in production and real people are using it. Yeah, a project that's been out there for, let's say 5, 6, 7 years, when you add Sentry to it or a similar tool, you're going to see things. You're going to learn things about your app that you didn't know before. Even if you are looking at logs, there are some interesting things that will surface for you. So maybe the first one to talk about is exceptions, unhandled errors. Sentry, and I don't know how unique this is amongst other products, will also surface issues that aren't strictly errors but are probably things you don't want to have happening, and they might be performance related. But let's just stick with the exceptions first. If an exception is thrown and you don't handle it, you clearly want to know about that, right? So if you don't have a tool like this... You mentioned logs. Like, "Look, I can look at the logs," those should be logged somewhere. Maybe you even have your logs writing different severities, warnings versus errors, whatever, to Slack or some other place that has visibility where you don't have to just look at a raw log file. But an exception handler like this will give you a ton of context. It might tell you which user encountered that, what browser they were on, what time of day it was, what other things were happening on the system around that time. There's just all of this really nice context. What are the things they clicked on before they got to it? You know, maybe an error was thrown on the third query on the page. "Well, what were the first two queries?" That's context you're not going to get in a log but you will get in one of these nice exception handling tools.
It's basically a decorated trace, so it's going to show you what happened. Along those lines, those tools also have the ability to track context. I know the newest version of Laravel has context for logging and all that kind of stuff. But tools like Sentry usually also let you add what they refer to as breadcrumbs. So you can put in additional information along the way. Maybe, "I'm not sure if this condition is what causes that error message so I'm going to log the condition before it," and that won't get reported if everything works fine. But if there is an exception or an error, you'll see the breadcrumb you specified and that'll help you put everything together.
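The breadcrumb idea described above can be sketched in a few lines of Python. This is only an illustration of the concept, not Sentry's SDK or API: the `BreadcrumbTrail` class, the `apply_discount` function, and the cap of 100 crumbs are all hypothetical names and values made up for the example.

```python
from collections import deque


class BreadcrumbTrail:
    """Hypothetical helper: remembers recent breadcrumbs, reports them only on error."""

    def __init__(self, max_crumbs=100):  # cap is an assumption for illustration
        self._crumbs = deque(maxlen=max_crumbs)  # oldest crumbs fall off the front
        self.reports = []  # stands in for what would be sent to an error tracker

    def add(self, message, **data):
        # Recording a crumb is cheap and produces no report on its own.
        self._crumbs.append({"message": message, "data": data})

    def capture(self, exc):
        # Only when an exception is captured do the crumbs travel with it.
        self.reports.append({"error": repr(exc), "breadcrumbs": list(self._crumbs)})


def apply_discount(trail, order_total, discount):
    # "I'm not sure if this condition causes the error, so record it first."
    trail.add("checking discount", order_total=order_total, discount=discount)
    if discount > order_total:
        raise ValueError("discount exceeds order total")
    return order_total - discount


trail = BreadcrumbTrail()
ok_result = apply_discount(trail, 100, 10)  # works fine, nothing is reported

try:
    apply_discount(trail, 20, 50)
except ValueError as exc:
    trail.capture(exc)  # the report now carries both breadcrumbs as context
```

The successful call leaves a crumb behind but creates no report; the failing call produces one report that includes the trail leading up to it.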
Yeah. I mean, sometimes you don't even know what questions to ask in advance to manually log things. That's why getting some of this context collected for free is so invaluable, because it might be hard to reproduce otherwise. So hopefully I've made a case for why you want to have some sort of exception tracking and alerting with context. But now let's talk about that project you were kind of alluding to. It's been out there in the wild for a while and you turn Sentry on... I'm just going to be honest, on the first day you're probably going to get way more errors logged than you expected. So how do you triage that? Like, do you stop everything and fix all of those? Or how do you decide what the priority is and factor that into what your team or you were already planning on working on the week before you launched Sentry into production?
Well, I think that those questions have to be answered before you even have the tool. If you discovered an error, how would you triage it regardless of the tool? There are triage steps. Does it affect customers or employees? Is it internal or external? Does it affect one customer, many customers or all customers? Those are the two axes of a little matrix that you use to determine how much of a priority this bug is. Then you have those rules in your team: "If it's this high or in this part of the matrix, we drop everything. If it's not, we put it into our path and prioritize it like everything else." The fact that the tool exists doesn't change that; it's something you have to develop in your team as well. The tool just makes it easier to identify these things that you should have been handling already.
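As a rough sketch, the two-axis matrix described above could be written down like this in Python. The numeric weights, the multiplication, and the `drop_everything_at` threshold are assumptions picked for illustration; a real team would set its own rules.

```python
from enum import IntEnum


class Audience(IntEnum):
    INTERNAL = 1  # employees
    EXTERNAL = 2  # customers


class Scope(IntEnum):
    ONE = 1   # a single customer or user
    MANY = 2  # several
    ALL = 3   # everyone


def triage(audience, scope, drop_everything_at=4):
    """Place an error in the matrix and return the team's agreed response.

    The score is simply audience weight times scope weight (an assumed rule).
    """
    score = audience * scope
    if score >= drop_everything_at:
        return "drop everything"
    return "prioritize like everything else"


one_customer = triage(Audience.EXTERNAL, Scope.ONE)    # score 2
all_employees = triage(Audience.INTERNAL, Scope.ALL)   # score 3
all_customers = triage(Audience.EXTERNAL, Scope.ALL)   # score 6
```

The point of writing it down is less the arithmetic and more that the rule exists before the errors arrive, so nobody has to debate priority while looking at a full dashboard.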
Yeah, that's a good point. This is a valid process regardless of the tool. I think the tool introduces a wrinkle because all of a sudden you will see all of them at once. Whereas before maybe they were trickling in, support people would get something and two weeks later it would make its way up the chain to you. When you're talking about those different metrics or ways of deciding the urgency, another one I thought of is, is there data loss involved? Is somebody submitting something and it's disappearing? That's another one for me. Is revenue being lost? That's a huge one. Somebody's trying to pay us for something and they're being prevented from doing it. But, yeah, it's going to be different for your tool, your team, your product, your organization. But don't feel overwhelmed when you turn this tool on, I guess that's my number one warning. The other extreme is ignoring everything because there's so much coming in. Like, "Well, I'm never looking at this." Then you might as well not have the tool, so I think that happy medium of triaging is important. Maybe just a little note on performance, because performance could be just a pure, "How fast does this page load?" right? That sort of performance. But there are also systemic performance issues like N+1 database queries. And this is where I think something like Sentry is actually really powerful. Even if you have some of the Laravel tools flipped on to prevent lazy loading, you can still have an N+1 that is not a lazy loaded thing. There are ways of creating multiple database queries that Sentry will catch and report to you that some of those Laravel strictness checks or performance checks would not catch, because it technically isn't the lazy-loading N+1 that Laravel catches for you. So that's important.
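To make the "N+1 that is not a lazy load" case concrete, here is a minimal Python sketch of how repeated, same-shaped queries could be spotted in a query log. It is not how Sentry actually detects them; the `normalize` and `repeated_queries` helpers, the sample SQL, and the threshold of 5 are all assumptions for illustration.

```python
import re
from collections import Counter


def normalize(sql):
    """Replace literal values with '?' so same-shaped queries compare equal."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip().lower()


def repeated_queries(query_log, threshold=5):
    """Return query shapes that ran at least `threshold` times (assumed cutoff)."""
    counts = Counter(normalize(q) for q in query_log)
    return {shape: n for shape, n in counts.items() if n >= threshold}


# One query for the posts, then an explicit query per post inside a loop.
# Nothing here is lazy loaded, so a lazy-loading guard would stay silent.
log = ["SELECT * FROM posts WHERE published = 1"]
log += [f"SELECT * FROM comments WHERE post_id = {i}" for i in range(1, 11)]

flagged = repeated_queries(log)
```

In this example only the per-post comments query is flagged, since it ran ten times with nothing changing but the ID.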
I mean, that's just the same thing though. We want to make sure you put a sort of measurement around that as like, "Okay. So we found a performance issue in N+1. Is it actually a performance issue that's impacting us?" You know, "Does that matter?" There's also times, really rare... I don't feel like I should even bring this up now that I think about it, but there's also times when you have to work within memory constraints. Where multiple database queries may make sense because you don't want to keep everything in memory.
Sure.
That's pretty rare and if you're reaching for that, really think about it. But the same idea applies: how important is this performance thing? Because we don't want to go down that same route of chasing a tiny little bug or a little bit of performance. You know, if it's still within a couple of milliseconds, who cares? It's fine.
Yeah. The rule about triaging and prioritization still applies. But I like that particular example because we're pretty strict on our projects about preventing lazy loading and having good test coverage that would catch it. The first time Sentry reported that, I'm like, "Wait a second," but it actually was not a classic N+1, it was a different issue. And it wasn't a major performance issue, but we did fix it. It was still nice to fix. The other one I've seen, thinking of a particular project we're working on, is that it will flag an issue if you have consecutive HTTP calls. So, if you're making a POST inside of a loop, it will catch that. Now, that's not to say it's always wrong. Sometimes you want to do that, like maybe you're loading a paginated result. But if you don't intend to do that, it's an opportunity to reuse a result. There's just these little things and you can tune and tweak it, you can prioritize and deprioritize things in Sentry for what your team cares about. But it's a lot of valuable information and I think it is useful to have that as a tool in the tool belt.
I have a really interesting question for you.
Oh boy.
What does cola taste like?
Cola?
Yeah.
Why aren't you naming a specific cola? Why are you saying cola generic?
I don't know, it's Coca-Cola.
Okay.
What is the flavor profile of those? Like, if you were to describe to someone who's never had a cola before, what would you say it tastes like?
I'd have to describe it in terms of other sodas. Like, if you mix this and that.
If they've never had a soda before they just had water and milk and whatever, and maybe sparkling water so they know that there's going to be sparkling in it. But what is the actual taste of Cola?
Brown. I don't know, I don't have a-
I know, isn't that weird? Like, I can't feel... I don't know how I would describe that.
Yeah, taste in general.
I think you should tweet Joel.
What?
You should tweet Joel and tell him what you think cola tastes like.
Please, yes, now I need to know. If this comes up I'm ill prepared, clearly.
Oh, hey, do you need help triaging your pile of s-sentry errors?
That is something we're good at. We help with legacy apps and getting them up to speed and nice and shiny. Head over to nocompromises.io and book a call to see how we can help.