November 19, 2019
Let’s start with the basics. A database is an organized and systematic collection of data that can be stored and accessed electronically. A database management system (DBMS) is an integrated software package designed to allow users to access, manipulate, analyze, manage, and retrieve data in a database. Since the first DBMS appeared, the capabilities and performance of databases and the systems that manage them have grown dramatically. This technological evolution has led to the development of several kinds of database, such as the relational database (managed by an RDBMS), the object-oriented database (OODBMS), the cloud database, and the NoSQL database.
Based on the relational model that Edgar F. Codd proposed at IBM in 1970, the RDBMS is a tabular database that stores and provides access to data points that are related to one another. In an RDBMS, data is organized into logical, independent tables with pre-defined data types, and relationships between tables are expressed through references (keys) that link rows together. Most RDBMS products use the standard Structured Query Language (SQL) for querying and maintaining the database. The RDBMS is the most widely accepted model for databases, as users can safely and easily categorize, store, query, and extract data.
As software programmers and developers began to treat data in databases as objects, the OODBMS emerged. The OODBMS organizes and models data as definable data objects as opposed to alphanumeric values. Programmers using an OODBMS benefit from a consistent programming environment, because the database uses the same representation model as the programming language.
A cloud database is a database that runs on a cloud computing platform and can hold both structured and unstructured data. Organizations run cloud databases on virtual machines, which leads to higher infrastructure utilization and can bring cost savings. Database-as-a-Service (DBaaS) offerings built on cloud databases provide high scalability and efficiency, along with failover support and managed maintenance.
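To make the relational idea above concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables, their columns, and the sample rows are all hypothetical; the point is only to show two independent tables linked by a key and queried with SQL.

```python
import sqlite3

# In-memory database; table names, columns, and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " amount REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(1, 20.0), (1, 5.5), (2, 12.0)],
)
conn.commit()


def totals_by_customer(connection):
    """Join the two tables through the customer_id reference and sum each customer's orders."""
    return connection.execute(
        "SELECT c.name, SUM(o.amount) "
        "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id "
        "GROUP BY c.id, c.name ORDER BY c.name"
    ).fetchall()


print(totals_by_customer(conn))
```

The customer_id column is the reference that links an order to its customer, and the JOIN is how SQL follows that relationship at query time.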
As databases continue to advance, it is important to note that each type has its own justification for use. A cloud database typically scales better than an on-premises RDBMS, but it is often still built on a traditional relational architecture, with the scaling challenges that implies, and its flexibility is limited by being anchored to the cloud service provider. An RDBMS is known for its accuracy (well-designed tables reduce duplicated data), easy accessibility, flexibility (complex queries can be carried out), and strong data integrity, because transactions with atomicity, consistency, isolation, and durability (ACID) guarantees protect against partial or conflicting updates. However, the RDBMS is constrained by its scale-up architecture, which requires over-provisioning, sharding, and replication to cope with peaks in data volume. The OODBMS, for its part, can represent complex structures directly, which allows more realistic models, better performance, and flexibility. Nonetheless, it lacks standardization, as there is no single agreed theoretical basis underlying OODBMS products.
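The atomicity part of ACID mentioned above can be illustrated with a small sketch, again using Python's built-in sqlite3 module. The accounts table, the names, and the balances are assumptions made up for the example; the transfer function is hypothetical and not taken from any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the rule a transaction must not violate (hypothetical schema).
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(connection, src, dst, amount):
    """Move amount from src to dst as one transaction: both updates apply, or neither does."""
    try:
        with connection:  # commits on success, rolls back if an exception is raised
            connection.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            connection.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would make the balance negative, so the earlier credit is rolled back too.
        return False


def balances(connection):
    """Return the current balances as a dict keyed by account name."""
    return dict(connection.execute("SELECT name, balance FROM accounts").fetchall())


failed = transfer(conn, "alice", "bob", 500)  # rejected: alice only has 100
ok = transfer(conn, "alice", "bob", 30)       # accepted
print(failed, ok, balances(conn))
```

In the rejected transfer, the credit to bob has already been executed when the debit fails, yet the rollback leaves no trace of it. That all-or-nothing behavior is what protects the data from half-applied updates.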
With the advantages and disadvantages of these databases in mind, we come to the NoSQL database. NoSQL databases provide a mechanism for storing, accessing, and retrieving data that is not modeled in tabular relations as in an RDBMS. Whereas an RDBMS structures data in fixed relational columns, a NoSQL database can use various data structures, such as the key-value store, where data is stored and represented as a collection of key-value pairs, and the document database, where each record is encapsulated and encoded in a standard format such as XML or JSON. A NoSQL database has a cluster-friendly, non-relational structure that can deal with heterogeneous and enormous amounts of data. NoSQL databases allow data to be stored in schemas that are not as ‘fixed’ as those of an RDBMS, removing much of its rigidity.
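The two data structures described above can be sketched in a few lines of plain Python. This is a toy model, not the API of any real NoSQL product: the kv_store dict, the collection list, the find helper, and all sample records are hypothetical.

```python
import json

# Key-value store: the value is opaque to the store and is looked up only by its key.
kv_store = {}
kv_store["session:42"] = "user=ada;expires=3600"

# Document store: each record is a self-describing JSON document, and documents
# in the same collection do not have to share the same fields.
collection = [
    json.dumps({"_id": 1, "name": "Ada", "email": "ada@example.com"}),
    json.dumps({"_id": 2, "name": "Grace", "devices": [{"type": "sensor", "fw": "1.2"}]}),
]


def find(docs, field):
    """Return the decoded documents that contain the given top-level field."""
    decoded = (json.loads(doc) for doc in docs)
    return [doc for doc in decoded if field in doc]


print(kv_store["session:42"])
print([doc["_id"] for doc in find(collection, "devices")])
```

The second document carries a nested devices list that the first one lacks. In a relational table that difference would require a schema change or a separate table; here it is simply a different document.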
High scalability, achieved through auto-sharding and a geographically dispersed scale-out architecture, makes a NoSQL database highly efficient at handling vast volumes of data while remaining cost-effective. Its ability to support complex analysis, offer a flexible system, and manage unstructured data that changes over time gives it an advantage over the RDBMS for these workloads. With a dynamic schema and high agility, NoSQL is well suited to big data and Internet of Things (IoT) usage. Use cases include real-time data analytics, fraud detection, and risk management systems that enable financial institutions to better consolidate and measure risk metrics. With its scale-out capability, a NoSQL database enables the high-speed data ingestion and analytics used in market data management. Other use cases, such as profile management, reference data management, and a customer 360° view, can also be unlocked with a NoSQL database.
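The sharding idea behind scale-out can be sketched as follows. This is a deliberately simplified illustration that assumes plain hash-modulo placement; real NoSQL systems typically use more elaborate schemes and rebalance data automatically, and the ShardedStore class and shard_for function are hypothetical names.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (simple modulo placement)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store whose data is spread across several in-memory shards."""

    def __init__(self, num_shards: int):
        # Each dict stands in for a separate node in a cluster.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
for i in range(1000):
    store.put(f"device:{i}", {"reading": i})

print([len(shard) for shard in store.shards])  # rough spread of keys across shards
print(store.get("device:7"))
```

Because the shard is computed from the key alone, any client can locate a record without consulting a central index, which is what lets capacity grow by adding nodes rather than by buying a bigger machine. A known limitation of this particular sketch is that changing the number of shards remaps most keys.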
The rise of the NoSQL database brings benefits to database management, but it is also accompanied by disadvantages: a lack of standardization, which can limit further expansion; limited community support; and problems with interfaces and interoperability. However, these issues are actively being addressed, which points to the future development of the NoSQL database. Next in store are improvements to the consistency model, spanning ACID and Basically Available, Soft state, Eventual consistency (BASE), as well as more standardization and benchmarking and the expansion of NoSQL to encompass functionality that other database platforms have. Companies like Aerospike, MongoDB, and Amazon Web Services are actively expanding and experimenting with the use of NoSQL databases, so it is exciting to witness the potential and future roadmap for NoSQL.