The business shift to digitization and customer experience has ushered enterprises into the digital revolution era. At the heart of the revolution are cloud technologies, mobile apps, big data applications, social and communication platforms, etc. Running and maintaining modern applications has created a new set of technology requirements that can accommodate unprecedented scalability, speed, and data variability. Relational databases cannot meet growing requirements, and enterprises are turning to NoSQL (Not Only SQL) database (DB) technology designed for distributed data stores for huge-scale data needs.
NoSQL encompasses a wide range of database technologies that can store structured, semi- and unstructured data. NoSQL is widely recognized for its functionality, agility, ease of development, performance at scale, and flexible schemas for building hi-tech applications.
Consider just a few examples of global businesses that deploy NoSQL for their applications: Twitter, Facebook, and Google, as well as Tesco, Ryanair, Marriott, Gannett, and other enterprises big and small. They accumulate terabytes of data every single day and use NoSQL for their big data and real-time web apps.
What are NoSQL Databases?
The easiest way to understand what a NoSQL database is, is to understand what it is not. Let’s start with SQL first.
SQL stands for Structured Query Language. People call it S.Q.L. or sequel. In short, it is the name of a standard language for communicating with relational databases that store the data. It can pull, edit, add, search, update, and delete information in the database records.
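To make these operations concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table and its sample rows are invented for illustration; any relational database would accept similar statements.

```python
import sqlite3

# In-memory database, so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A relational table has a fixed schema that is declared up front.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

# Add records.
cur.executemany(
    "INSERT INTO users (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Linus", "Helsinki"), (3, "Grace", "New York")],
)

# Update a record.
cur.execute("UPDATE users SET city = ? WHERE id = ?", ("Portland", 2))

# Delete a record.
cur.execute("DELETE FROM users WHERE id = ?", (3,))

# Search for and pull the remaining records.
rows = cur.execute("SELECT id, name, city FROM users ORDER BY id").fetchall()
print(rows)

conn.close()
```

Every row has the same three columns, which is exactly the strict schema that NoSQL databases relax.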
As the name suggests, NoSQL is not a Structured Query Language. A NoSQL database does not rely on SQL to query data and does not follow strict schemas like the relational model. It is also not a replacement for an RDBMS. Instead, it stores data in flexible structures, such as documents made up of named fields and their values.
NoSQL databases are designed to be used across large, distributed systems. They are typically more scalable and much faster at handling large data loads than traditional relational databases. This distributed design is the core feature that makes NoSQL an inexpensive solution for large datasets.
As your application grows and you start to add new fields, your schema evolves as needed. Your database is scaled horizontally. So, if you need to build something quickly, NoSQL is an excellent way to go.
An RDBMS, on the other hand, typically scales vertically, by getting faster hardware and more memory.
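The schema flexibility described above can be sketched in plain Python. This is only an illustration: the `users` list stands in for a collection, and the `find` and `with_field` helpers are hypothetical, not the API of any particular NoSQL product.

```python
# A "collection" is modelled here as a plain list of dict documents.
users = [
    {"_id": 1, "name": "Ada"},
    {"_id": 2, "name": "Linus", "email": "linus@example.com"},
]

# The application grows and a new field appears. No migration is needed:
# older documents simply do not have the field.
users.append(
    {"_id": 3, "name": "Grace", "email": "grace@example.com", "tags": ["admin"]}
)


def find(collection, field, value):
    """Return documents whose `field` equals `value`; missing fields never match."""
    return [doc for doc in collection if field in doc and doc[field] == value]


def with_field(collection, field):
    """Return the ids of the documents that actually carry `field`."""
    return [doc["_id"] for doc in collection if field in doc]


print(find(users, "name", "Ada"))
print(with_field(users, "email"))
```

In a relational table, adding `email` or `tags` would usually mean altering the table definition first.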
NoSQL databases have the following characteristics:
- NoSQL is schemaless.
- Most implement an aggregate pattern.
- Runs well on clusters.
- Available as open source or as a fully managed cloud service.
- Have high scalability.
- Use distributed computing.
- Simple API. Offers easy-to-use interfaces for storing and querying data.
- Able to process both unstructured and semi-structured data.
- No complex relationships, such as the ones between tables in an RDBMS.
NoSQL Emerged as a Leading New Data-Storage Technology
NoSQL databases have emerged as a leading new data-storage technology. They are getting a lot of traction and hype these days, but in reality the idea is not that new.
The first commercially available SQL relational database was released by Oracle in 1979 and was engineered to run on a single server. The only way to scale up these databases was to upgrade the servers: processors, memory, and storage.
In 1998, Carlo Strozzi created a file system based database and used the term NoSQL for his lightweight, open-source relational database.
Google published the Bigtable research paper in 2006, and Amazon published the Dynamo research paper in 2007. These databases were built to meet a new generation of business needs: developing with agility and operating at any scale.
In 2009, the term NoSQL resurfaced when Eric Evans used it to name the then-current surge in non-relational databases.
NoSQL databases emerged in the late 2000s, amid the exponential growth of web applications. As the cost of storage dramatically decreased, there was no longer a need to create a complex data model just to reduce data duplication. Developers, rather than storage, became the primary cost of software development, so NoSQL databases were optimized for developer productivity.
NoSQL was engineered to meet a new generation of business requirements:
⇒ Data Storage: Digital data is now measured in exabytes. One exabyte of information is equal to one billion gigabytes (GB). The amount of stored data added in 2006 was 161 exabytes. Just four years later, in 2010, the amount of data stored was almost 1,000 exabytes. 90% of the data on the internet has been created since 2016, according to an IBM Marketing Cloud study. In other words, there is a lot of data being stored in the world, and it’s just going to continue growing.
⇒ Interconnected Data: Major systems are built to be interconnected. The web fosters this through hyperlinks, pingbacks, and tags that tie things together.
⇒ Complex Data Structure: NoSQL can handle hierarchical nested data structures easily. To do the same in SQL, you need multiple relational tables with all kinds of keys.
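The point about complex data structures can be illustrated with a small sketch. The `order` document and the `flatten` helper below are invented for this example. They show one nested document on the NoSQL side and the three keyed row sets that a relational design would typically need for the same information.

```python
# One nested document holds the whole order.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada", "city": "London"},
    "items": [
        {"sku": "A-1", "qty": 2},
        {"sku": "B-7", "qty": 1},
    ],
}


def flatten(order):
    """Split one nested document into three relational-style row sets."""
    # customer_id is a made-up surrogate key; a real system would generate it.
    customer_row = {"customer_id": 1, **order["customer"]}
    order_row = {
        "order_id": order["order_id"],
        "customer_id": customer_row["customer_id"],  # foreign key
    }
    item_rows = [
        {"order_id": order["order_id"], "line_no": n, **item}  # foreign key + line no
        for n, item in enumerate(order["items"], start=1)
    ]
    return customer_row, order_row, item_rows


customer_row, order_row, item_rows = flatten(order)

# Reading the nested form needs no joins.
total_qty = sum(item["qty"] for item in order["items"])
print(total_qty)
```

The relational form is not wrong, it simply needs extra keys and joins to reassemble what the document keeps in one place.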
Advantages of NoSQL over Relational Databases
When compared to relational databases, NoSQL databases are often more scalable and provide better performance. Relational databases store data in highly structured tabular form, with multiple rows and columns. While these data stores are highly flexible, easy to maintain, and useful for data stored on a single server, they do not scale very well in a distributed system compared to NoSQL.
Distributed systems using inexpensive storage and processing power are becoming much more common and are often used in environments where there is a need for high availability and speed.
NoSQL databases work significantly better across this kind of distributed system.
Advantages of NoSQL
- Simple to implement.
- High scalability.
- High availability.
- Big data capability.
- Work with databases such as MongoDB and Cassandra.
- Store unstructured, semi-structured, or structured data with equal effect.
- Enable easy updates to schemas and fields.
- Is developer-friendly.
- No single point of failure.
- Easy replication.
- Provides fast performance and horizontal scalability.
- Often open-source and, therefore, lower cost. Can be an appealing solution for smaller organizations with limited budgets.
- Offers a flexible schema design which can easily be altered without downtime.
Disadvantages of NoSQL
- No standardization rules.
- Limited query capabilities.
- RDBMS databases are comparatively mature.
- Staffing for NoSQL can be more costly.
- Many NoSQL databases do not guarantee strong consistency when multiple transactions are performed simultaneously.
- Maintaining unique values becomes difficult as the number of keys grows.
Types of NoSQL Databases
The most important feature of a NoSQL database to consider is the data model it uses. Unlike SQL databases, which use a relational model, NoSQL databases use a variety of different models. They are categorized into four DB types: Key-value pair, Column-oriented, Graph-based, and Document-oriented. Let’s take a look at these four models and how they differ from one another.
⇒ Key-value databases
A Key-value DB stores data as a collection of key-value pairs, much like an associative array. Key-value stores are well suited to schema-less data and heavy loads. They allow horizontal scaling at scales that other types of databases cannot achieve.
Examples of Key-value model: Redis, Dynamo, Riak.
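A key-value store and its horizontal scaling can be sketched in a few lines. The `ShardedKeyValueStore` class below is a toy written for this article, not the API of Redis, Dynamo, or Riak. It assumes a simple hash-modulo placement of keys, whereas production systems typically use more robust schemes such as consistent hashing.

```python
import hashlib


class ShardedKeyValueStore:
    """Toy key-value store that spreads keys over several in-memory shards."""

    def __init__(self, shard_count=3):
        # Each dict stands in for a separate server.
        self.shards = [{} for _ in range(shard_count)]

    def _shard_for(self, key):
        # Use a stable hash: the built-in hash() of a str changes between runs.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.shards[int(digest, 16) % len(self.shards)]

    def put(self, key, value):
        # The value is opaque to the store: bytes, JSON text, anything.
        self._shard_for(key)[key] = value

    def get(self, key, default=None):
        return self._shard_for(key).get(key, default)

    def delete(self, key):
        self._shard_for(key).pop(key, None)


store = ShardedKeyValueStore()
store.put("session:42", b"opaque bytes")
store.put("user:7", '{"name": "Ada"}')
print(store.get("session:42"))
```

Because every operation touches exactly one shard, adding servers adds capacity, which is the property that makes this model scale out so well.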
⇒ Document model
Document-Oriented NoSQL DB stores and retrieves data as a Key-value pair, but the value part is stored as a document in JSON or XML formats. They are mostly used for CMS and content platforms, e-commerce applications, etc.
Examples of Document-oriented DBMS systems: Amazon SimpleDB, CouchDB, MongoDB, Riak, Lotus Notes.
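The document model can be sketched with Python's standard `json` module. The blog-post documents and the `get_path`, `find_ids`, and `find_by_tag` helpers below are assumptions made for illustration, not the query API of any of the products listed above.

```python
import json

# Key -> JSON document, as a document store would hold them.
raw_documents = {
    "post-1": '{"title": "Intro to NoSQL", "author": {"name": "Ada"}, "tags": ["db", "nosql"]}',
    "post-2": '{"title": "Scaling out", "author": {"name": "Linus"}, "tags": ["ops"]}',
    "post-3": '{"title": "Graphs 101", "author": {"name": "Ada"}, "tags": ["db", "graph"]}',
}

# Unlike a pure key-value store, the database parses and understands the value.
documents = {doc_id: json.loads(body) for doc_id, body in raw_documents.items()}


def get_path(doc, path):
    """Follow a dotted path such as 'author.name'; return None if a step is missing."""
    current = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find_ids(docs, path, value):
    """Ids of documents whose field at `path` equals `value`."""
    return sorted(d for d, doc in docs.items() if get_path(doc, path) == value)


def find_by_tag(docs, tag):
    """Ids of documents whose 'tags' list contains `tag`."""
    return sorted(d for d, doc in docs.items() if tag in doc.get("tags", []))


print(find_ids(documents, "author.name", "Ada"))
print(find_by_tag(documents, "graph"))
```

Querying inside the value, here by author name or tag, is what makes this model a good fit for content platforms and catalogs.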
⇒ Graph model
A Graph type database stores entities as well as the relations amongst those entities. Each entity is stored as a node, and each relationship as an edge that connects two nodes. Every node and edge has a unique identifier. They are mostly used for social networks, logistics, and spatial data.
Examples of Graph-based databases: Neo4J, Infinite Graph, OrientDB, FlockDB.
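A graph of nodes and edges with unique identifiers can be sketched as two dictionaries. The people, the `FOLLOWS` edge type, and the `reachable_within` helper below are all invented for illustration. Graph databases provide their own query languages for this kind of traversal.

```python
from collections import deque

# Every node and every edge has its own unique identifier.
nodes = {
    "n1": {"label": "Person", "name": "Ada"},
    "n2": {"label": "Person", "name": "Linus"},
    "n3": {"label": "Person", "name": "Grace"},
    "n4": {"label": "Person", "name": "Alan"},
}
edges = {
    "e1": {"from": "n1", "to": "n2", "type": "FOLLOWS"},
    "e2": {"from": "n2", "to": "n3", "type": "FOLLOWS"},
    "e3": {"from": "n3", "to": "n4", "type": "FOLLOWS"},
}


def neighbours(node_id, edge_type):
    """Nodes directly reachable from `node_id` over edges of `edge_type`."""
    return [
        e["to"]
        for e in edges.values()
        if e["from"] == node_id and e["type"] == edge_type
    ]


def reachable_within(start, edge_type, max_hops):
    """Breadth-first search: node ids reachable in at most `max_hops` edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node_id, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbours(node_id, edge_type):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    seen.discard(start)
    return sorted(seen)


# "Who does Ada follow, directly or through one intermediary?"
print([nodes[n]["name"] for n in reachable_within("n1", "FOLLOWS", 2)])
```

In a relational schema the same question would need one self-join per hop, which is why relationship-heavy data such as social networks is often a better fit for a graph model.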
⇒ Column-oriented model
Column-oriented databases work on columns and are based on the BigTable paper by Google. Column databases store each column separately, allowing for quicker scans when only a small number of columns are involved. They are mostly used to manage data warehouses, business intelligence, and CRM.
Examples of Column-based NoSQL databases are Accumulo, Cassandra, Druid, HBase, Vertica.
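The difference between a row layout and a column layout can be sketched with plain Python lists. The sales figures below are made up, and real column stores add compression and on-disk formats that this illustration leaves out. The idea is only that a column layout keeps the values of each column together.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"id": 1, "region": "EU", "amount": 120.0},
    {"id": 2, "region": "US", "amount": 80.0},
    {"id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one list per column, aligned by position.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# An aggregate over one column reads only that column's values...
total_from_columns = sum(columns["amount"])

# ...whereas the row layout has to walk through every full record.
total_from_rows = sum(row["amount"] for row in rows)

# A filter on a second column touches two lists and still skips "id".
eu_total = sum(
    amount
    for region, amount in zip(columns["region"], columns["amount"])
    if region == "EU"
)

print(total_from_columns, eu_total)
```

Analytical queries usually touch a few columns of very many records, which is why this layout suits data warehouses and business intelligence.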
Key-value and Document databases are similar. In a Key-value database, we can say that the value is a document, but the structure of that document is opaque to the database. In Document databases, we often have a Document ID as the key, but the Document’s structure is exposed and used for querying. None of the database types described above is the best at solving every problem; every category has its unique attributes and limitations.
Major NoSQL Players
Popular SQL relational databases and RDBMSs are Oracle, IBM DB2, Sybase, MS SQL Server, MariaDB, and PostgreSQL.
The major players in NoSQL have emerged primarily because of the organizations that have adopted them. Some of the largest NoSQL technologies include:
|Database|Description|
|---|---|
|DynamoDB|Dynamo was created by Amazon.com and is the most prominent Key-Value NoSQL database. Amazon required a highly scalable distributed platform for their e-commerce businesses, so they developed Dynamo. Offered through AWS, DynamoDB is a highly customizable NoSQL database that can scale as the use case demands.|
|MongoDB|One of the most famous names in NoSQL. MongoDB is a document DB that uses a JSON-like document schema to store data. Also, see CouchDB. MongoDB stands apart from its peers with its Nexus Architecture that incorporates the strengths of relational databases along with the innovations of NoSQL.|
|CosmosDB|Azure's offering for a globally distributed NoSQL database at scale. CosmosDB was built on the success of DocumentDB.|
|BigTable|The brainchild of Google. After Google released the whitepaper on BigTable, HBase was developed out of the research. BigTable is now available through Google Cloud Platform. BigTable is a columnar database.|
|HBase|The top open-source NoSQL choice on Hadoop. Facebook is both a heavy user of and contributor to HBase. HBase is a column store database written in Java.|
|Cassandra|A column-oriented, open-source database from Apache. A distributed database that excels at handling extremely large amounts of structured data. Cassandra was created at Facebook. It is used by Instagram, Comcast, Apple, and Spotify.|
Deeper Dive: DynamoDB
Amazon DynamoDB stores data in partitions. A partition is an allocation of storage for a table, backed by solid-state drives (SSDs) and automatically replicated across multiple (3) Availability Zones within an AWS Region. Partition management is handled entirely by DynamoDB—you never have to manage partitions yourself.
DynamoDB is optimized for uniform distribution of items across a table’s partitions, no matter how many partitions there may be. AWS recommends choosing a partition key that can have a large number of distinct values relative to the number of items in the table.
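Why the number of distinct partition key values matters can be shown with a rough simulation. This is a sketch under stated assumptions: DynamoDB's real hash function and partition count are internal to the service, so the SHA-256 hash, the `PARTITIONS` constant, and the `partition_for` helper below are stand-ins chosen only to illustrate the effect.

```python
import hashlib
from collections import Counter

PARTITIONS = 4  # hypothetical; DynamoDB manages the real number itself


def partition_for(partition_key, partitions=PARTITIONS):
    """Map a key to a partition with a stable hash (an assumed stand-in)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def spread(keys):
    """Count how many items land on each partition."""
    return Counter(partition_for(key) for key in keys)


# 10,000 items keyed by a unique user ID: many distinct values.
by_user = spread(f"user-{i}" for i in range(10_000))

# 10,000 items keyed by a status with only two possible values.
by_status = spread("active" if i % 2 else "inactive" for i in range(10_000))

print("keyed by user id:", dict(by_user))
print("keyed by status: ", dict(by_status))
```

With unique user IDs the items spread roughly evenly over all partitions. With a two-valued status key, at most two partitions receive any data, so those partitions carry the whole load.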
Core building blocks and the essential DynamoDB components:
⇒ Tables. DynamoDB stores data in tables. A table is a collection of data. For example, you might have a Users table to store data about users and their contact information, and an Orders table to store data about your orders. This concept is similar to a table in a relational DB or a collection in MongoDB.
⇒ Items. Each table contains zero or more items. An item is a group of attributes that is uniquely identifiable among all of the other items. In a User table, each item represents a user. For an Order table, each item represents one order. Items in DynamoDB are similar in many ways to rows, records, or tuples in other database systems. In DynamoDB, there is no limit to the number of items you can store in a table.
⇒ Attributes. Each item is composed of one or more attributes. An attribute is a fundamental data element, something that does not need to be broken down any further. For example, an item in a User table contains attributes called PersonID, LastName, FirstName, and so on. For an Order table, an item might have attributes such as ProductID, Name, Manager, and so on. Attributes in DynamoDB are similar in many ways to fields or columns in other database systems.
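The relationship between tables, items, and attributes can be sketched with a small in-memory model. The `Table` class below is hypothetical and is not the AWS SDK. Its method names merely echo DynamoDB's PutItem and GetItem operations, and the sample names and phone number are invented.

```python
class Table:
    """Hypothetical in-memory stand-in for a table, keyed by one attribute."""

    def __init__(self, name, key_attribute):
        self.name = name
        self.key_attribute = key_attribute
        self.items = {}  # primary key value -> item

    def put_item(self, item):
        # An item must carry the key attribute so it is uniquely identifiable.
        if self.key_attribute not in item:
            raise ValueError(f"item is missing key attribute {self.key_attribute!r}")
        self.items[item[self.key_attribute]] = dict(item)

    def get_item(self, key):
        return self.items.get(key)


# A table is a collection of items.
users = Table("Users", key_attribute="PersonID")

# An item is a group of attributes. Apart from the key, items in the same
# table do not need to share the same attributes.
users.put_item({"PersonID": 101, "LastName": "Lovelace", "FirstName": "Ada"})
users.put_item(
    {"PersonID": 102, "LastName": "Hopper", "FirstName": "Grace", "Phone": "555-0100"}
)

print(users.get_item(102))
```

Only the key attribute is mandatory here, which mirrors the way items resemble rows while attributes stay flexible.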
How Much Do NoSQL Databases Cost?
The total cost of ownership of a NoSQL DB is a crucial consideration, and customers often overlook many factors impacting it.
The cost of a NoSQL database itself can be low, or even zero, compared to traditional relational databases. After the initial costs, maintaining a NoSQL DB can be pricier, depending on its hosting solution. There are also a few hidden costs, including the added complexity of an endless choice of datastores and inefficient use of hardware that leads to server sprawl.
Consider that NoSQL specialists are less common than SQL specialists. Thus, staffing for NoSQL can be more costly.
Summary
- NoSQL is a non-relational DB that does not require a fixed schema, avoids joins, and is easy to scale.
- The concept of NoSQL databases became popular with Internet giants like Google, Facebook, and Amazon, which deal with huge volumes of data.
- Relational databases are expected to stay for many use cases.
- Consider NoSQL for applications with high scalability requirements.
- NoSQL has some advantages because data is precalculated ahead of time, based on known access patterns.
- An RDBMS has some tools for this too: materialized views, aggregate tables, redundant data, and flat tables.
- NoSQL removes size and speed limitations and makes this solution more natural.
- If you use NoSQL, expect to also use a relational engine for analytics and reporting.
- If you use RDBMS, you can still add NoSQL for applications such as config, events, blob/document applications, messaging, and audits.
- Four types of NoSQL Databases: Key-value, Column-oriented, Graph-based, Document-oriented.
- NoSQL can handle structured, semi-structured, and unstructured data with equal effect.
- NoSQL offers limited query capabilities.
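The point above about precalculating data for known access patterns can be sketched as follows. The order events and the `record_order` and `customer_summary` helpers are invented for illustration. The idea is simply to update a summary at write time so that the read is a single key lookup.

```python
from collections import defaultdict

events = []                               # raw writes, kept for reference
orders_per_customer = defaultdict(int)    # precalculated at write time
revenue_per_customer = defaultdict(float)


def record_order(customer_id, amount):
    """Store the raw event and update the precalculated summaries."""
    events.append({"customer_id": customer_id, "amount": amount})
    orders_per_customer[customer_id] += 1
    revenue_per_customer[customer_id] += amount


def customer_summary(customer_id):
    """Known access pattern: answered by key lookup, with no scan or join."""
    return {
        "orders": orders_per_customer.get(customer_id, 0),
        "revenue": revenue_per_customer.get(customer_id, 0.0),
    }


record_order("c1", 20.0)
record_order("c1", 5.0)
record_order("c2", 10.0)

print(customer_summary("c1"))
```

The trade-off is extra work on every write and redundant data, which is the same idea behind materialized views and aggregate tables in an RDBMS.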
Know When To Use the Right Databases for the Job
The ability to scale, thereby taking advantage of commodity storage hardware in the cloud, makes NoSQL preferred for eCommerce and retail, media and entertainment, fin-tech, logistics, and healthcare business applications.
But what matters most is having a clear sense of what you’re trying to accomplish and selecting the database that best suits your fundamental needs.
Every project is unique and needs careful technical analysis. Choosing an IT consulting partner is a wise investment to ensure the project’s success.
DevCom is focused on data-driven enterprise solutions and on database design, development, support, and maintenance. We cover popular relational databases such as PostgreSQL and Microsoft SQL Server, as well as NoSQL platforms such as MongoDB, DynamoDB, and CouchDB. We are committed to making your databases reliable, scalable, maintainable, and secure.
Reach us at firstname.lastname@example.org to talk about your custom database development needs.
Written by: Halyna Vilchynska, Head of Marketing at DevCom.