Storing large amounts of complex data that can be retrieved rapidly - SQL or NoSQL?


Once upon a time in SQL Heaven, developers only knew SQL. This was great. Oracle, Sybase, and Microsoft all had their own leading products and developers had a clear way to design & build a system. This was back in the day when software mostly solved problems for business operations because your average person didn't even know how to use a computer and thus the SQL database loosely represented the business data structure.

Funny enough, back then your user interface was driven by the data structures - people had to suck it up and hate their SAP inventory ordering system in-house, or whatever it was. Then the internet came along and computers became mainstream, people want to book holidays, view porn, and look at cat videos; and there was a LOT of data to store now. But as this happened, the software environment became a more complex beast because people used web browsers, and the front end got quickly divorced from the back end of the system and evolved rapidly - browsers competed with each other, and UI complexity exploded as the expectations of UI went up and up and up.

To facilitate complex data structures on the UI, JSON was created around 2001 - a way to describe data in a tree structure. JSON became the defacto standard for communicating between client and server with REST APIs - another loose standard that is adopted almost everywhere. It's pretty simple to use JSON in a JS web front end to drive what the UI does. So as the software environment evolved, most software applications started to become web-based, SaaS / general public-facing systems as opposed to just business support systems.

The general software ecosystem changed dramatically - most development is centered around the user experience in these systems as a key differentiator, a great user interface that has to appeal to a very very wide/broad general public that requires a sort of dumbed-down interface (hence why you have all the beautiful React front ends, and so on) because the in-house build is dead, the SaaS build is king - a product to be used across the globe, to solve a specific use-case - and thus ease of use almost front and center. Now you enter the problem of rapid software development with a highly evolved UI - mostly based on JSON tree structures, you then have a back end system that is stuck in SQL - and now you have to convert all that JSON tree stuff to SQL and back again (by the way, go read about GraphQL). Oh my, what a nightmare!

Boss: "Hey can you add 5 new bits to the product view screen when it's an X-type product, showing images that the user can click on?"

Dev: "Yeah ok, I have to change the back end, give me like 100 days. Oh, I also have to migrate the data as well, and shut down the production server for a few hours..."

But all you're doing is adding stuff to the product UI view screen, it's not really useful to anything else but that product, and you may store it sometimes, other times maybe not. You can do that in SQL but in a Graph database it's almost a non-event because you don't describe all the attributes - they're in a document. The amount of change required to create this enhancement would be minimal if at all on the back end. And hey, if you're using a PaaS like Firebase - there's NO back-end change. Graph databases handle this well. So SQL, was handy for keeping things structured, when we needed to interrelate the data heavily, and when we had time up our sleeves to change the database. But NoSQL lets you evolve the data you change without the need for a new release or migration of the data, or anything. And I'd say it's one big draw-card for its use.

So as this all happened, the amount of data to the store was a big problem for SQL. There's a lot of cat videos and porn out there. These SaaS products (or even mainstream things like Twitter, Facebook, Google, etc etc) started to store data in volumes never ever seen before. up to billions of rows say. So a SQL database can store this. but, if you have it set out as a graph, the retrieval based on a hierarchy is much, much simpler and faster because the tree structure shows you how to traverse the data, you don't rely on indexed rows in the same fashion -- what's more, there are no locking and transactional issues that SQL databases face with such a hugely complex data set and an update - imagine updating indexes on a billion rows database.

Enter the NoSQL paradigm - storing big amounts of data that can rapidly retrieve and store without complex index problems and integrity maintenance hampering performance. Twitter, LinkedIn, Facebook all adopted NoSQL for these performance reasons. That's where NoSQL really took hold - and why you have MongoDB.So back to SaaS. Nowadays, you have a lot of SaaS products or web apps that are not that relationally complex (when compared to the business system counterparts) - that simply store and retrieve data to show a user and hey why would we structure it as relational if it's going to change lots - we can simply store the JSON and retrieve it for the front end, simple as.

The front end is front and center of the development. The database can be structured with a NoSQL/Graph database, the tradeoff is more rapid development time for a risk of data integrity not being upheld. Big deal though we're just storing blog posts. if we need relational stuff modeled, we can but we have to write more code. That's the tradeoff.

So in summary - people use Graph databases to rapidly build their system and/or to store vast amounts of unstructured data - that's where the paradigm came from, to solve those problems. By the way, Microsoft now has Graph database support even in their Microsoft SQL server - so, truly large systems that have complex UI's and complex data structures don't need a Graph DB & A SQL DB - both are supported. MS SQL Server also supports JSON object retrieval.

I'm pretty sure Oracle has the same. I would assume Postgres will come up with a similar offering.

Written by Kris Baum

Looking to get your project off the ground? We can help you with design, development and digital marketing services!

123/10 Pannipitiya Road, Battaramulla, Sri Lanka

+94 77 38 77 388