Should we use a JSON field for this?

Welcome to No Compromises. A peek into the mind of two old web devs who have seen some things. This is Joel.

And this is Aaron.

Today's topic is brought to you by the fact that Aaron made fun of me recently because I was suggesting use of a JSON field in a particular feature. And he's like, "Joel, why is everything JSON fields with you?" He had a point that there were a of couple circumstances in a row. It's not a feature we reach for a lot, but we... You know, a few different things, I was pitching it. Anyways, it sparked this conversation idea of like, what is a good use for a JSON field and what are some things where maybe you're starting to stretch or squeeze, what is that, a round peg in a square hole or vice versa? Just to kind of talk through it and mainly so Aaron and I can hash this out and not have this argument in the future.

Well, we can prove that Aaron is right about this argument.

That too. Yep, exactly.

So, I think there is a bunch of context about this particular discussion and we'll get to that a little bit. But the main question we're really talking about is, how do we store more than one piece of specific information in a column of a database, and should we in Eloquent? And that's when we're talking about these JSON fields. Is like, "Hey, we have a column or we have a property, we want to have a list or the shape of data or something here. How does that fit in here in that column?" Or is that some sort of relational data that we should have normalized out like we have with other things? I think that's really the question.

That's a good starting point because, well, what's the alternative? And like you were saying, probably the most common alternative is just create a table with columns in it and those rows and use foreign keys and relate data that way. You know, if we go backwards in time, I think maybe the older version of the JSON field would be serializing PHP objects or something. And that's not even on the table right now for myriad reasons. We had to do that recently using saloon, it was one of the recommended ways of storing an OAuth authenticator. And we both just... We didn't like, "Oh, this feels wrong. Like, we shouldn't be doing this anymore." But, yeah, the JSON field is a nicer version of that and databases actually support it. So, if you need to query it or filter it or whatever, or update a piece of it in place, there are mechanisms to do that.

Yeah. I think one of the issues is, what is this data that we're trying to store and how are we using it? Is it Queryable? Like you said, it can be Queryable but there's different types of... So are tools that we're developing queries on support the JSON query language, for example. You know, there's all these different things that we have to consider with that data. Let's give an example of a piece of data and where that might go.

And I'll kind of come up with a sanitized version of a real-life thing here. But let's say you have a set of configuration options that are like, in the database you have different companies, right? And each company might want to customize, "Oh, this feature is turned on or off," or... And just to make it more concrete, let's say there's at least 10 configuration options and, I think this is valid too, the vast majority of companies are probably not going to want to customize it, at least not in the short term. Does that help set it up? Do you want me to describe the shape of the option or is that good enough?

No, that's good enough. Basically, we have a number of different options. That number first of all is... The first calculation I would do-

Yeah, I agree.

... is, how many things are we supposed to track here and how do they get updated and things like that. If it's maybe a one-off, maybe one setting that's a little bit different or two, then I'm starting to think, "Okay. And if a one-off for this one client maybe we can store it in that data structure." But when we started talking about 10, that's when it starts getting to a different level with me. It's like, well, there's 10 of something, there's a greater chance now that there's 10 that those things are going to be modified. Right? So, a single thing has X percent of modification, once you add many onto there there's a chance it's going to get modified more frequently. And that's kind of what I'll do. I'll use my next measurement on saying is, "Hey, how many are the thing in here? And then what is the frequency of that change?" and stuff like that.

So, which way is that pointing you though? Like, the more you have the more prone to change, does that make you more double down on it should be a table? Or more double down, well, it should be adjacent object because that'll flex as the changes come?

That's a good question. I know what you said like that flexes the data as the changes come, but that's not the point. I'm basing this off of we have reasonable certainty of the shape of this data for time to come, not forever, but we know. There's not a lot of change in regard to the shape so I don't have to worry about modifying tables and doing really complex key value things. But I'm looking at the fact that we have one property that now has the logical decisions of 10 things on it. And that's where I'm starting to get to an issue is when I'm starting to store data and I'm starting to store more than... just a ton of different decisions on that one property, that's when I start to think like, maybe that's not the right way to have this data.

Yeah. I think there's maybe a relevant bit of context for this particular feature that I didn't include. And it was, there was some inherent logic, like, oh, if they don't customize it. Like, if this field is null, this configuration field, then it should fall back to the current system default which those are going to be captured in code, in an enum somewhere. So that was part of the decision is, like, well, we need to already have an object that kind of encapsulates this logic. So if we have an object in the database, if it's there, we pull it out, we kind of merge in with the defaults. And that was the other thing too, is maybe this one company would only customize one or two things and the other eight would fall through to system default. So this is a much more nuanced thing now that I'm saying it but in general terms, I agree with you there.

Well, this example too. Because I was thinking about these are both key and values that mean something in this example. If they change the key, I mean, that's an... do you edit everything that has been using that key or whatever? Again, that's a different question. Like, is the shape of the data or of the key and value going to change? If so, then we definitely not going to use this JSON column like this.

Right. Yeah, that's a good point. It would make migrations a lot more complicated to have to reach in and start reshaping JSON objects throughout the database. This was helpful. I want to maybe throw out another example where you clearly talked me out of using the JSON object. And it was for a different reason and it was just for simplicity. So in this particular case, I'm trying to remember the feature now, but it was... we had a situation where they could change the labels of something. It was actually kind of similar to this, where there's a system set of default labels for a field and somebody might... One company actually said, "Well, hey, could we change this to that?" And I didn't want to create a whole CRUD interface and a whole extra table and you're like, "Joel, we have this. This is such a simple feature, just make it a CRUD feature." What was your thinking there? Was it purely about the size of the data or was it also... I think there was another layer on there of just making it the same as other features in the app.

Well, again, we're kind of going into the context of this particular thing. So I think we kind of went, we started generic, and now we're going really onto a specific problem.

That's right. Let's go.

But it was because we had a very similar setup and only a few of the requirements were slightly different. An example like this, we already had a system where there was relationship and a set of data is created when it was first instantiated and then those are CRUD. The ability to kind of copy that over, and it solved a lot of the problems that we thought. Basically what happens if you edit it or things like that, you know?

Right.

So copying this over, we already had their functionality. And I think it also allows us to have a discussion with a client and say, "Hey, I understand that you've said this one thing. Here's a different way that we could do it, and we think that this is better for a couple reasons. A, the real reason, B, we're also doing it in your app with this other setting. Do you think we should do that?" And a lot of times you'll find that even if you're looking at a requirement or you're looking at something you're trying to build, and it's very technical and you have to even ask yourself. Is this actually a requirement or did the client not understand the complexity of this and that we need to step that back too? I know I've gone kind of on a tangent here, but there's a lot of different things involved with, are we going to use adjacent structure or not for this?

Right. Well, the one you were just a little bit of a tangent, I think that could be a whole separate episode because it's like, how much do we trust these requirements? Like are they actually the real requirements or this is what they promised right now and what they need right this minute. But we know in a month or two it's going to expand and then we should have just done the full feature like the other one that already exists. You're trying to predict the future a little bit, but you're also playing the odds based on our experience of how this has gone on other projects.

Right. I think overall in general, when we consider these, I know... again we went on a tangent. But when we consider this JSON column, I think it really has a lot to do with how core is this to the rest of your application? How many are there? Like, what is the impact of this property on how the application runs? And maybe that's the kind of thing. Is like, if the impact is anything, you know... It's not even a measurement, but anything reasonably large amount of functionality changes with these properties. Maybe it's something that should be more inside of a tighter structure than just adjacent object. And by the way, so we don't get anyone just super angry. There are tools for JSON objects for storing that, there's SQL databases and stuff. We're not talking about that specifically for this. We're talking about my MySQL or Postgres or something like that.

No, that's good. A JSON field column type.

Yeah, there's a whole other database type.

That's right. No, it's good summary. It is a little bit of a gut feel. And there's probably other metrics, obviously, there are performance impacts, right? If this is something going to be heavily queried and you need an index on it. Again, depending on your database engine and what version it is, and what features are enabled, those are all things that are outside the scope of this discussion. But I kind of like your high-level algorithm-based approach for, does it make sense as a JSON field or not? Last weekend, I took a little bit of a road trip and as I do driving, I was having some deep thoughts, Aaron. Not really but more just random thoughts, trying to keep my myself occupied. The first one I want to share with you is, I don't know how noisy your car is but you're driving on the interstate and you're going fast enough, there's a little bit of road noise in your car. I mean, maybe your car's fancier than mine but there's a little bit of road noise. And we were trying to listen, for some reason we had queued up the soundtrack to Interstellar. Are you familiar with this movie or soundtrack?

Yeah.

It's symphonic. But the thing is long sections super quiet, then long sections super loud. So when it would get to the quiet parts, you literally... I couldn't hear anything at all. So you'd crank it up, up, up, and then boom, the pipe would come in. So I guess that was my first thought is when you're doing a road trip, do you have a bias to certain types of music to listen to? I know you're a music guy, would you ever listen to the Interstellar soundtrack? Was that just a fool's errand?

That was silly, man. That's something that you listen to at home when you're not going to crash the car.

No, it makes driving seem way more intense. Like, "Oh, I'm landing the spaceship," you know?

No. I mean, I listen to usually rock or rap, basically anything that will exercise the subwoofers in the car because I'm a toddler and have those. I like it when drive past and it sounds like there's rattling in the trunk and paper back there because it's going (sound).

All right. I got one more driving question for you. I'm just going to throw this at you. Also relates to being on the interstate. Are you a cruise control guy? You know, do you like to kind of pick your-

Yeah, absolutely.

Okay. I think your car may be adaptive cruise control so if you cruise and then you come up on somebody-

Oh, I don't trust that. Dude, I'm a programmer.

No.

I've seen how we program our stuff, man. I'm not going to trust my life to this car.

I'm not going to sleep but if there's something like five car lengths ahead and it starts to slow down, that's a useful feature.

Sometimes you have a question, like we started out this podcast, and it goes on a totally different tangent and you learn a bunch of new things.

Sounds a lot like what goes on in our Mastering Laravel Community. If you want to check that out, head to masteringlaravel.io and click the community menu.

No Compromises, LLC