How we use seeders in our applications

Joel Clermont (00:00):
Welcome to No Compromises, a peek into the mind of two old web devs who have seen some things. This is Joel.

Aaron Saray (00:08):
And this is Aaron.

Joel Clermont (00:15):
Today we wanted to talk about something we've sort of hinted at or discussed in a few different contexts, but never explicitly, and that is how we like to use seeders in our Laravel applications. Like, what purpose do they serve? Because people use them for different things, but we sort of have a way of approaching it and thought it'd be nice to kind of lay out here what our approach is and why.

Aaron Saray (00:40):
You mean database seeders? Not those things that stop moths in your closet, cedar strips or whatever.

Joel Clermont (00:46):
Oh boy. We're already starting with the tree puns. Okay, yes, database seeders. The thing that lets you generate data by running a command.

Aaron Saray (00:56):
I think there's a couple of different things we want to clear up first before we get in there. Seeders are, like Joel said, running a command to generate data. That doesn't mean that there aren't other commands that can do that sort of stuff too. This is just sort of a set-aside area: if you're looking in the database/seeders folder, you're going to find seeder classes that are run by the seeder command structure of Laravel and that will generate data. They may use just direct SQL, they may use Eloquent models, they may use Eloquent factories, they can use any number of things. That's not necessarily the aim of what we want to talk about today. We want to talk more about what they're for and why you might use them.

Joel Clermont (01:40):
Yeah. Let me just kind of start with a concrete example, because I think this is pretty universal for Laravel applications. You have a users table and you log in as a user. So, you need a user to use the application. Would we put those in a seeder, Aaron? What would you do?

Aaron Saray (02:02):
Well, do you need a user to use the application, first of all? That is the question. In this case, let's just say we're talking about one super admin user that gets created when the application first goes out or goes to a new environment or something like that. This might be controversial, but I would still put that in my migration. The reason we do that is because we kind of talk about migrations as setting up the structure of your database as well as, in my opinion, anything that's absolutely a thousand percent required for your application to succeed. We might've mentioned this before, for example, zip codes. Having all the zip codes in your database makes the application work, but it doesn't necessarily... it's not required for the system to run. We're just talking about the very basics of running, right?

Joel Clermont (02:55):
Right.

Aaron Saray (02:55):
So, if you don't have a zip code table, well, your code that tries to query the zip codes is going to fail, okay? That's why the table creation goes into a migration. But if you don't have any zip codes whatsoever, the workflow of your product might fail. Like, the user can't get any further along because you can't find their zip code. But it doesn't mean that the application is throwing an error 500. It should be coded in such a way that says, "I can't find a zip code. This isn't valid." "No, that's not really accurate because there are no zip codes." But there's a slight difference there between the application can work but just not as expected versus the application fails with an error 500.

Joel Clermont (03:32):
Right. I think that is an important distinction because that distinction made it click in my head. Same thing with a user. If there's no users in the table, nobody can log in, but it's not going to throw an error, it's not going to throw that 500 server exception error. I'll admit, our thinking on this has matured. I was looking at a project that's not super old, maybe four years old, because I was analyzing how we have used seeders in our applications. I was doing some research on something. In this one we actually did seed, I think, a couple of users, maybe one for us and one for the client, and none of our newer projects do that.
But we've done this approach in the past. It served a purpose, but we just have a different approach now and have kind of settled on this as something we prefer to do. I think it's also important to maybe shift gears a little bit or take a step back and define what the actual purpose of a seeder is. Because you mentioned one that I think most people wouldn't necessarily think about, which is running it in production to set up a new production environment. That is one way that you would use it. Another, which I think is maybe more common, is, "Well, I'm a developer using the app locally, I just want to have a thing I can click around and have some realistic-looking data without having to pull down a production copy of data or something like that." So that's another use case.

Aaron Saray (05:05):
No one does that anyway, right? They don't pull down production databases and put them in their local stuff.

Joel Clermont (05:10):
Of course not, never. That would never happen. Then the third one I think is we're testing, right? Some people like to have certain things ready to go to make writing tests easier or less verbose, or whatever. So maybe we should kind of clarify when we're talking about how we use them. Like, how do we use them in those different formats?

Aaron Saray (05:29):
I think that's a good point. Really what you're kind of saying is there's a tool set called the seeder, the seeder ecosystem almost, and then we can use those for individual things?

Joel Clermont (05:41):
Yeah.

Aaron Saray (05:41):
Now, it's important to understand one of the core little features that's helpful in Laravel: if you are doing a migrate:fresh, for example, you can add --seed, or you can run db:seed. If you don't specify which seeder you're aiming for specifically, it'll just run the main seeder, and in there you could call different seeders. One of the things that we'll do is, first of all, we'll hardly ever put any code or any data inside of the main seeder. Instead, inside of there we'll check for the triggers that indicate what we're using our seeders for. For example, we might have a check to see if it's a local environment. If that's the case, we might then have it call a dev data seeder.
If it's in test, you know the testing environment, if we're going to do that... I normally don't ever use seeders in testing. But if you have a really, really, really tough situation... and we're not talking about making two or three models, use a factory then. Just do it. But if we're talking like your application for some reason needs 500 whatever to function and even test, then you might use that. But that way the first thing you remember about that then is if you run seed, we're making sure that we only run the proper seeders for the proper situation. So even in production if we ran it, none of the conditions are true so no seeding will happen. So that we don't overwrite any of the data that's already there if we were to happen to accidentally run it in production again.
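
A rough sketch of the kind of main seeder described here, assuming a Laravel 8 or later application. The class names DevDataSeeder and TestingBaselineSeeder are illustrative guesses, not names given in the episode, and both would need to exist in database/seeders for this to run.

```php
<?php

namespace Database\Seeders;

use Illuminate\Database\Seeder;

class DatabaseSeeder extends Seeder
{
    /**
     * The main seeder holds no data of its own.
     * It only decides which other seeders fit the current environment.
     */
    public function run(): void
    {
        // Local development: realistic-looking data to click around in.
        // DevDataSeeder is an assumed name for the "dev data seeder".
        if (app()->environment('local')) {
            $this->call(DevDataSeeder::class);
        }

        // Testing: rarely needed, since tests usually build their own data
        // with factories. TestingBaselineSeeder is hypothetical and stands in
        // for the "application needs 500 of something" case.
        if (app()->environment('testing')) {
            $this->call(TestingBaselineSeeder::class);
        }

        // Production matches neither check, so running the seed command
        // there by accident creates nothing.
    }
}
```

The point of the design is that the environment checks live in one place, so the individual seeders do not each have to guard against being run in the wrong environment.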

Joel Clermont (07:12):
Right, that would not be good. But, yeah, relying on those flags or even what environment this application is running in allows that top-level seeder, I think it's just called DatabaseSeeder, to decide which seeders are appropriate to run. Because you generally have one seeder per table, it sort of follows the same flow as the factories. It's kind of a one-to-one relationship. Doesn't have to be, but that seems to be a convention we follow and that we see used in most Laravel projects.

Aaron Saray (07:41):
I think when I take on a project, then my first sort of goal is I want to make sure I can use the project locally for development. And in order to do that, it's kind of what you mentioned, which is the data that you want so you can click around. I'll make that dev data seeder first for a number of different reasons. One, sometimes the database structure you have is non-existent, or you have to run the migrations a couple of times over and over, and it's nice to be able to put that data in afterwards. I mean, one of the things that I'll do, we kind of talked about in a different podcast: migration up and down.
But in other cases, if you have your local stuff there and you're seeding in your data and all that kind of stuff anyway, and you want to start over from fresh... Let's just say you've clicked around and done a bunch of stuff. You could do migrate:fresh --seed, blow away all your data, rebuild the database, and seed in all your data structure to begin with again. So, you're back at the position that you recognize and you can start again. Sometimes you want to keep your development data, sometimes you want to blow it away, so having that data structured in that development seeder is usually the first place I go because I really don't want to be downloading that production data.

Joel Clermont (08:52):
Right. Yeah, for sure, there's all sorts of concerns, data privacy, security, things like that, that we're not going to get into today. But I'm glad you mentioned that because that is important to consider. So, if I had to kind of summarize our approach, and you can tell me if you disagree: the seeders meant for production are extremely minimal, and in fact I will say almost always nothing, no seeders intended to run in production. Seeders used for testing, same thing, we almost never use them. We prefer to have the tests explicitly set up the data they need. But then that dev data seeder, the one just to facilitate me clicking around the app, seeing a UI with a reasonable amount of data, that's where we focus the bulk of our seeding logic and where most of the records get generated.
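
To make "tests explicitly set up the data they need" concrete, here is a minimal sketch of a feature test that starts from an empty database and creates its own rows with a factory. It assumes a default Laravel 8 or later skeleton, where Tests\TestCase, the RefreshDatabase trait, and an App\Models\User model with a factory are available. The test class and method names are made up for illustration.

```php
<?php

namespace Tests\Feature;

use App\Models\User;
use Illuminate\Foundation\Testing\RefreshDatabase;
use Tests\TestCase;

class UserSetupTest extends TestCase
{
    // Rebuilds the schema for the test run; no seeders are involved.
    use RefreshDatabase;

    public function test_it_creates_only_the_users_it_needs(): void
    {
        // The test builds exactly the data it depends on.
        $users = User::factory()->count(3)->create();

        $this->assertCount(3, $users);

        // Nothing else is in the table, because nothing was seeded.
        $this->assertDatabaseCount('users', 3);
    }
}
```

Because nothing is seeded ahead of time, the count assertion does not depend on whatever a shared seeder happened to insert.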

Aaron Saray (09:42):
Yeah, I would agree with that. I think I want to expand on one more real quick: the testing.

Joel Clermont (09:46):
Okay.

Aaron Saray (09:47):
There's two different forms of testing that seeders probably are important for. The first is unit or feature testing. We talk about creating database information to run our tests against, and we're really a fan of just creating that in each test, keeping your database empty and just laser-focusing on the things that you need. But the other one is load testing. A lot of times what people don't understand is, "How is my application going to function with a hundred thousand or a million records versus the 10 or 20, maybe even a thousand, that are in there in production now?" You can release software and then as it starts to grow be like, "Why is everything slowing down?" Well, there's a way to find out if that's going to happen, and that's load testing.
And that's a whole different topic. But one of the things in there is you need to be able to set up database structures maybe in the future or database amounts that are in the future. Let's just say businesses were going to grow by 10%, you know that's not always accurate, but let's just bump it up to 30%. So, in one year from now, we increase our load by, let's just say, even 50%. So, we have a thousand users in there now, let's even bump it up to 10,000, let's get some good growth here. And you can use seeders then to generate all the data and structures and everything you need for an application that is really in use and then you can run your load testing against that.

Joel Clermont (11:16):
Right. And those load testing seeders, let's say you're generating a million rows, you clearly don't want that in your dev data seeder because it's going to take time to run. Yeah, you can have a separate one entirely that's just special case for this one particular use but it's a lot nicer than... I mean, how else would you do it? You might go into MySQL and generate a bunch of stuff, but why not leverage everything that Laravel gives you in factories and relationships and all that and just build out a load testing seeder. That's a good use case, I'm glad you brought that up.
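
A minimal sketch of what a separate load testing seeder might look like, assuming a Laravel 8 or later application with the default App\Models\User model and its factory. The class name LoadTestSeeder, the batch size, and the total of 10,000 users (borrowed from the figure mentioned above) are illustrative assumptions.

```php
<?php

namespace Database\Seeders;

use App\Models\User;
use Illuminate\Database\Seeder;

class LoadTestSeeder extends Seeder
{
    /**
     * Generate a projected future volume of data for load testing.
     * Deliberately not called from DatabaseSeeder, so it never runs by default.
     */
    public function run(): void
    {
        $batches = 10;      // assumed: 10 batches
        $perBatch = 1000;   // assumed: 1,000 users each, 10,000 in total

        for ($batch = 0; $batch < $batches; $batch++) {
            // Creating in batches means each call builds a smaller
            // collection of models instead of one very large one.
            User::factory()->count($perBatch)->create();
        }
    }
}
```

Since the main seeder never calls it, it would have to be run on purpose, for example with php artisan db:seed --class=LoadTestSeeder.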

Aaron Saray (11:55):
Now, I am not sure if this is like this at all locations for places like McDonald's. In smaller cities, probably not or whatever. In larger ones where I live, definitely everyone has this setup.

Joel Clermont (12:11):
Okay.

Aaron Saray (12:12):
And that setup is the two lanes into one drive-through system.

Joel Clermont (12:20):
It's just inviting chaos. I don't like it.

Aaron Saray (12:23):
Oh man. There's nothing worse than... I mean there are, there's a lot of things worse. There's nothing worse than the terror of being in one of those two lanes placing your order, seeing the other person place their order, then you both are supposed to go merge into one. And there's a car in front so you both kind of go forward and then you're like, "Which one of us is closer?" or "Who's going to win?"

Joel Clermont (12:46):
Oh man.

Aaron Saray (12:46):
And then that other person won't stop creeping forward. And here's the thing, if someone is in front of you, let them go. Don't keep creeping, because in my head, all I'm thinking is, every time I move forward, you creep forward, and you're just going to hit me pretty soon. What are you doing?

Joel Clermont (13:01):
Well, let me throw an alternate perspective because sometimes I'm the creeper. I don't know if that came across wrong.

Aaron Saray (13:07):
Yeah, I believe that.

Joel Clermont (13:09):
No, if it's busy, there's also somebody behind you that if they could just pull up one more foot, they could start placing their order. So sometimes the creeping isn't to edge in and take the next spot, but to let the person behind me make their order. Have you thought about that?

Aaron Saray (13:28):
I have and you're wrong. That's not going to make any difference.

Joel Clermont (13:34):
No, it's not. It might even make it worse because you're just bunching up further now.

Aaron Saray (13:38):
Yeah. I mean, so for those who haven't ever experienced this, we should probably explain this real quick. There are two different kiosks, so when you go up to the drive-through lane, you can split left or right and you can order at both of them. Usually it's the same person, just shifting back and forth between which one. And then they take your order and then you're supposed to merge, and then I think they track what color of car. Like, black SUV or white car, or something like that.

Joel Clermont (14:08):
Or they ask your name sometimes, yeah.

Aaron Saray (14:09):
Yeah. I mean, at the fancy places where Joel goes but I'm talking about McDonald's.

Joel Clermont (14:13):
The maître d' is out there.

Aaron Saray (14:15):
Yeah. So, when you go through... but there's just this, "I don't want to let them get in ahead of me," even though it doesn't really matter. I'm not that competitive, but when we get in that line, "Ooh, don't you dare try to go in front of me if I ordered first." Or when you're even pulling up and there's two on one side and one on the other side and you're like, "I should probably go to the one on the one side." But you never know, because maybe that one just got through and the other one is just finishing and aargh.

Joel Clermont (14:43):
There's all these calculations that happen, yes.

Aaron Saray (14:52):
Load testing is awesome. But before you get to that, maybe you should just take a quick look through your application, make sure it's ready.

Joel Clermont (14:58):
That's something we can help with. If you'd like to learn more, head over to nocompromises.io and see how we can work together.

No Compromises, LLC