Over the last couple of years, I’ve built and dealt with applications at every scale and technical stacks — from simple CRUD metadata management Applications to apps storing and searching through data at 200k Records / Second. It’s amazing to see how much the world of databases has to offer. Some of these projects were either something that started from nothing or an existing app that needed some thought into by selecting the right database for it. For large-scale projects, we had a small requirement, but the load was huge.
Database decisions and issues are faced by every single application developer at some point in their careers. Databases form a critical component of practically every back-end system. And to add to scale and challenges, there are like hundreds of databases available these days to solve each and every one of the problems. While some databases can scale infinitely, some can provide real time synchronization. And others may guarantee consistency, availability, durability — or some of these — there are a lot of factors worth considering before a decision is made.
The motivation of this article is to share my experience over time working with various databases — focussing mostly on strategy that will help anyone quickly ideating from a simple idea to a product in a short amount of time. Believe me, the choice of database plays a very crucial role. Folks who are getting started with their projects can use this as a reference for selecting the right database. It can be a startup idea, or something that you always wanted to work upon as an automation, or it may even be something that is required at your current work.
This is not a comparison article. I’m merely expressing my opinion working with both SQL and NOSQL over time. One thing is not good or bad. It’s solely dependent on the requirements.
When I look at the growing choices, and the ease of adopting database technology, the first question that comes to my mind is whether I should go for a no SQL database or a relational NoSQL database? Mostly because it feels so natural and easy to code using a NoSQL database when we look at the examples.
A word of advice I would like to share right at the start is to never select a no SQL database just because it looks simple use and write code. Ensure that requirements and systems are carefully considered before opting for a NoSQL database.
We may think we can fit in your requirements into a NOSQL database and deliver an application extremely fast, but always Keep in mind that the requirements govern the database choices and not the other way around. We must carefully analyse every single requirement that you have and see if the database that you go with solves your current problems. And also accounts for some flexibility in your future requirements. In all these years of experience, I haven’t seen a single case where the requirements and data access patterns didn’t change over time.
Having the flexibility for some common use cases helps us better over time. (There is no fixed rule for this but unless it’s an extremely specific requirement for scale, data restrictions, or distribution — prefer flexibility over specific database tech for most simple applications. ) By that I simply mean go with a SQL database like MySQL or PostgreSQL. NoSQL and cloud databases solutions sound very tempting and cheap, but as your application grows in terms of product features, it may seem Extremely hard to manage them and hard to work with in some cases.
I must admit I did select a NoSQL database at work a few years back because the requirements were simple, and it fit all the use cases. At that point, all requirements pointed to a NoSQL database, because it was application and access patterns were extremely simple. But over several years, business just kept wanting more and these ad hoc requirements made it hard to work with NoSQL database. We decided to take a mixed approach and did have a solution, but my suggestion for a SQL database for most simple apps comes from seeing and experiencing a lot of these use cases. The effort to define a schema and work with SQL drivers is worth it. I’ve been a huge fan of PostgreSQL ever since I started working with it and migrating to it for a couple of projects.
As a single responsibility goes, databases are meant to provide us with a nice abstraction of storage and a good query processing layer. As we develop applications, what happens over time is that our database becomes more than just a database. Eventually a lot of logic, triggers, a lot of complexity that your application needs to handle will go to the database queries. At some point the database is under huge responsibility (and sometimes stress ), to deliver features which the applications were supposed to. There’s nothing wrong with calling it another service at this point. The reason this happens is because of the simplicity that one achieves to solve problems with SQL Database.
Personally, I think it’s a grey area. For instance, I would never use a database trigger unless it’s my last resort. But I could offload some logic to the data queries if the queries don’t put a lot of stress depending on my database. Imagine a case when you want to initially show an incremental news feed to a user for the ones which they have not seen, and your database handles the logic via SQL Query while you produce a proper architecture to scale out / fan out. These kind of use cases keep popping up, and it simplifies some stuff until we figure out the right way to design it. I’m not saying ditch everything and use a SQL solution, but for product development of business facing applications, having the flexibility and guarantees helps a lot.
Analyzing Query plans as application grows is an especially important aspect of working with Relational databases.
When we design microservices, it is considered a good practice for the service to access only the portion of the data that it’s managing and responsible for. Everything else happens via service calls. If you’re not considering a microservice architecture, are going with let’s say with a monolith, it is especially important that your code is as clean as possible. As developers, it becomes your responsibility to ensure that when the time comes, the effort to move a part of the application to another service or it’s own application is minimal.
My goal with this post was to share a couple of my experiences and advice while working on various projects in my past. Currently, I’m of the opinion that one should use PostgreSQL/MySQL if they’re getting started on an idea and want to quickly deliver features to production on a decent scale. Our database usually tends to act as another service layer when we get started, and we cannot avoid it completely. That’s the reality. It’s a boon and at the same time, can quickly create problems too. We must be incredibly careful with this.
All cloud offerings and NoSQL databases seem very affordable and easy to onboard, but it does get complicated over time. It’s worth making the right choices at the start than spending time migrating later. Of course, I cannot emphasize that the rule of thumb is that the choice of a database is governed by the requirements, and not the other way around.
Hope you enjoyed reading this, and please follow me for more!