NoSQL Databases

NoSQL, usually expanded as "Not Only SQL" (and sometimes read simply as "No SQL"), is a class of database systems used to manage massive collections of unstructured and semi-structured data. NoSQL addresses two problems: scalability and flexibility. It is important to note that NoSQL and non-relational data are not the same thing. Historically, large databases ran on expensive machines or mainframes. Modern enterprises use cloud architectures to support their applications, and the distributed nature of NoSQL databases lets them be deployed and operated on clusters of servers in the cloud, which reduces cost.

Can be used for both relational and non relational data
Provides new ways of storing and querying the data
Designed to handle 'Big Data'
Types : Key-Value, Document, Graph and Column

History of how it all started :

Source : IBM , skills network

Many customers of Azure Cosmos DB use this NoSQL database to store relational data. The Cosmos DB SQL API also supports SQL-like queries. As far as scaling is concerned, scaling is usually cheaper for a NoSQL database because it can scale horizontally, whereas a relational database traditionally had to scale vertically.

Note : Vertical scaling means increasing the capacity of a single server by adding more RAM and CPU. Horizontal scaling means adding more servers to process requests. Vertical scaling has a limit: you can't keep adding RAM and CPU to a single machine indefinitely. Scaling is usually cheaper for a NoSQL database because you don't need to scale up; you can use many cheap machines and scale the database horizontally.
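As a rough illustration of horizontal scaling, the sketch below spreads keys across several "servers" by hashing each key. The names `shard_for` and `ShardedStore` are hypothetical and each Python dict merely stands in for one machine; real NoSQL databases typically use more elaborate schemes (for example consistent hashing) so that adding a server does not move most of the keys.

```python
import hashlib


def shard_for(key: str, num_servers: int) -> int:
    """Map a key to a server index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_servers


class ShardedStore:
    """Hypothetical store that spreads keys over several servers."""

    def __init__(self, num_servers: int) -> None:
        # Each dict stands in for one cheap machine in the cluster.
        self.servers = [dict() for _ in range(num_servers)]

    def put(self, key: str, value) -> None:
        self.servers[shard_for(key, len(self.servers))][key] = value

    def get(self, key: str):
        return self.servers[shard_for(key, len(self.servers))].get(key)


store = ShardedStore(num_servers=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

# Number of keys held by each server; the load is spread across all four.
sizes = [len(server) for server in store.servers]
print(sizes)
```

Because every request for a key goes to exactly one server, capacity can be added by adding machines rather than by buying a bigger one.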

NoSQL Use cases :

1. Big data and real time web applications

2. Relationship between data is not important

3. Data changes frequently

NoSQL Limitations:

1. Schema-less data can become inconsistent

2. Denormalized data means redundant data

3. Redundant data can lead to inaccuracies and conflicts

4. Does not support many of the useful features of relational databases

SQL vs NoSQL :

Types of NoSQL Databases :

Images : Google images

Document Store :

A document store database (also known as a document-oriented database, aggregate database, or simply document store or document database) is a database that uses a document-oriented model to store data.
Document store databases store each record and its associated data within a single document. Each document contains semi-structured data that can be queried against using various query and analytics tools of the DBMS.

Key characteristics:

Each document offers a flexible schema
Each piece of data is considered a document (typically XML or JSON)
Values are visible and can be queried
Content of document databases can be indexed and queried - key and value range lookups, search, and analytical queries with MapReduce
Horizontally scalable
Allows sharding across multiple nodes
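The characteristics above can be sketched with plain Python dictionaries standing in for JSON documents. This is only an illustration: `find` and `build_index` are hypothetical helpers, not the API of any particular document database, and the sample documents are made up.

```python
from collections import defaultdict
from typing import Any

Document = dict[str, Any]

# Documents in one collection do not need to share the same fields.
posts: list[Document] = [
    {"_id": 1, "type": "post", "author": "asha", "title": "Intro to NoSQL"},
    {"_id": 2, "type": "comment", "author": "ben", "post_id": 1, "text": "Nice overview"},
    {"_id": 3, "type": "like", "author": "asha", "post_id": 1},
]


def find(docs: list[Document], **criteria: Any) -> list[Document]:
    """Return documents whose fields match every given criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def build_index(docs: list[Document], field: str) -> dict[Any, list[int]]:
    """Build a simple index from field value to the ids of matching documents."""
    index: dict[Any, list[int]] = defaultdict(list)
    for d in docs:
        if field in d:
            index[d[field]].append(d["_id"])
    return dict(index)


by_author = build_index(posts, "author")
asha_docs = find(posts, author="asha")
comments_on_post_1 = find(posts, type="comment", post_id=1)
```

The point of the sketch is that the values inside each document are visible to the query, which is what separates a document store from a plain key-value store.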

Suitable Use cases:

Event logging for apps and processes - each event instance is represented by a new document
Online blogs - Each user, post, comment, like is represented by a document

Unsuitable Use cases:

When ACID transactions are required
If data naturally falls into normalized tabular model

Here are examples of some of the leading document store DBMSs.

Web Applications - Content management systems, Blogging platforms, eCommerce applications, Web analytics, User preferences data

User Generated Content - Chat sessions, Tweets, Blog posts, Ratings, Comments
Catalog Data - User accounts, Product catalogs, Device registries for Internet of Things, Bill of materials systems

Gaming - In-game stats, Social media integration, High-score leaderboards, In-game chat messages

Networking/computing - Sensor data from mobile devices, Log files, Realtime analytics, Various other data from Internet of Things

Key Value store :

A key-value store, or key-value database, is a simple database that uses an associative array (think of a map or dictionary) as the fundamental data model, where each key is associated with one and only one value in a collection. This relationship is referred to as a key-value pair. In each key-value pair the key is represented by an arbitrary string such as a filename, URI or hash. The value can be any kind of data, such as an image, a user preference file or a document. The value is stored as a blob, requiring no upfront data modeling or schema definition. In general, key-value stores have no query language. They provide a way to store, retrieve and update data using simple get, put and delete commands; the path to retrieve data is a direct request to the object in memory or on disk. The simplicity of this model makes a key-value store fast, easy to use, scalable, portable and flexible.
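The get, put and delete model described above can be sketched in a few lines. `KeyValueStore` is a hypothetical in-memory stand-in, not the client API of any real product; the session data is invented for the example.

```python
import json
from typing import Optional


class KeyValueStore:
    """In-memory stand-in for a key-value store; values are opaque bytes."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # Store or overwrite the blob held under this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        # Direct lookup by key; returns None if the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Returns True if a value was removed, False if the key was absent.
        return self._data.pop(key, None) is not None


store = KeyValueStore()
session = {"user_id": 42, "cart": ["sku-1", "sku-2"]}

# The application serialises the value; the store never looks inside it.
store.put("session:abc123", json.dumps(session).encode("utf-8"))

raw = store.get("session:abc123")
restored = json.loads(raw) if raw is not None else None
deleted = store.delete("session:abc123")
```

Note that there is no way to ask the store for "all sessions where user_id is 42": the only access path is the key.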

Key characteristics :

Least complex; data is stored with a key and a corresponding value blob
Represented as a hash map
Ideal for basic CRUD operations
Scales well
Shards easily
Not meant for complex queries

Some popular key-value stores are:

Amazon DynamoDB
Aerospike
Apache Cassandra
Cosmos DB Table API
Oracle NoSQL Database
Voldemort
Berkeley DB
Couchbase Server
Redis
Riak

Image and further reading - hazelcast.com

Suitable Use cases:

1. Quick basic CRUD operations on non-interconnected data
2. Storing and retrieving session information for web applications
3. Shopping cart data for online stores
4. Storing in-app user profiles and preferences

Unsuitable Use cases:

1. Data that is interconnected with many-to-many relationships, such as social networks and recommendation engines; consider a graph database in these cases

2. When a high level of consistency is required for multi-operation transactions involving multiple keys

3. When apps run queries based on the value rather than the key; consider the document category of NoSQL

Document Store vs Key-Value Databases

Document databases are similar to key-value databases in that there’s a key and a value. Data is stored as a value, and its associated key is the unique identifier for that value.

The difference is that, in a document database, the value contains structured or semi-structured data. This structured/semi-structured value is referred to as a document.

The structured/semi-structured data that makes up the document can be encoded in any of a number of formats, including XML, JSON, YAML and BSON. It could also be a binary format, such as a PDF or an MS Office document.

Column Store

A column store database is a type of database that stores data using a column oriented model.

Extremely quick to load and query.

Also referred to as -

Column database
Column family database
Column oriented database
Wide column store database
Wide column store
Columnar database
Columnar store

A column family consists of multiple rows.
Each row can contain a different number of columns to the other rows. And the columns don’t have to match the columns in the other rows (i.e. they can have different column names, data types, etc).
Each column is contained within its row. It doesn’t span all rows like in a relational database. Each column contains a name/value pair, along with a timestamp. Note that this example uses Unix/Epoch time for the timestamp.

Here’s a breakdown of each element in the row:

Row Key. Each row has a unique key, which is a unique identifier for that row.
Column. Each column contains a name, a value, and timestamp.
Name. This is the name of the name/value pair.
Value. This is the value of the name/value pair.
Timestamp. This provides the date and time that the data was inserted. This can be used to determine the most recent version of data.
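The row structure described above can be modelled as follows. This is a minimal sketch: `Column` and `ColumnFamily` are hypothetical names, the sample data is invented, and keeping only the write with the newest timestamp is an assumption of this example rather than a description of any specific product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int  # Unix epoch seconds


class ColumnFamily:
    """Rows keyed by row key; each row holds its own set of columns."""

    def __init__(self) -> None:
        self.rows: dict[str, dict[str, Column]] = {}

    def insert(self, row_key: str, name: str, value: str, timestamp: int) -> None:
        row = self.rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumption: the write with the most recent timestamp wins.
        if current is None or timestamp >= current.timestamp:
            row[name] = Column(name, value, timestamp)

    def get(self, row_key: str, name: str) -> Optional[str]:
        column = self.rows.get(row_key, {}).get(name)
        return None if column is None else column.value


cf = ColumnFamily()
cf.insert("user:1", "email", "a@example.com", 1700000000)
cf.insert("user:1", "city", "Pune", 1700000000)
cf.insert("user:2", "email", "b@example.com", 1700000050)    # this row has no "city" column
cf.insert("user:1", "email", "new@example.com", 1700000100)  # newer version replaces the old one
cf.insert("user:1", "email", "stale@example.com", 1600000000)  # older write is ignored
```

Here the two rows hold different sets of columns, and the timestamp decides which version of `email` is current.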

Benefits - Compression, aggregation queries, scalability, fast to load and query

Suitable use cases :

Can be used for event logging and blogs, counters, data with expiration values

Examples of Column Store DBMSs

Source : https://database.guide/what-is-a-column-store-database/

Graph Store

A graph database stores nodes (entities) and edges (relationships) instead of tables or documents. Data is stored just like you might sketch ideas on a whiteboard. Data is stored without restricting it to a pre-defined model, allowing a very flexible way of thinking about and using it.
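To make the node and edge model concrete, here is a small sketch of a graph held in memory, with a "friends of friends" traversal of the kind a recommendation engine might run. The `Graph` class, the `FRIEND` relationship name and the sample people are all hypothetical; real graph databases expose their own query languages for this.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph: nodes with properties, labelled edges."""

    def __init__(self) -> None:
        self.nodes: dict[str, dict] = {}
        # node id -> set of (relationship, target node id)
        self.edges: dict[str, set] = defaultdict(set)

    def add_node(self, node_id: str, **properties) -> None:
        self.nodes[node_id] = properties

    def add_edge(self, source: str, relationship: str, target: str) -> None:
        self.edges[source].add((relationship, target))

    def neighbours(self, node_id: str, relationship: str) -> set:
        return {t for (r, t) in self.edges.get(node_id, set()) if r == relationship}


def recommend_friends(graph: Graph, person: str) -> list:
    """Suggest friends of friends who are not already friends with the person."""
    friends = graph.neighbours(person, "FRIEND")
    candidates = set()
    for friend in friends:
        candidates |= graph.neighbours(friend, "FRIEND")
    return sorted(candidates - friends - {person})


g = Graph()
for name in ("alice", "bob", "carol", "dave"):
    g.add_node(name, kind="person")

# Friendship is mutual, so each edge is added in both directions.
for a, b in (("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("alice", "dave")):
    g.add_edge(a, "FRIEND", b)
    g.add_edge(b, "FRIEND", a)

suggestions = recommend_friends(g, "alice")
```

The traversal follows relationships directly from node to node, which is why highly connected data is a natural fit for this model.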

Graph databases are typically ACID compliant but don't shard well.

Suitable use cases

Highly connected and related data like social networking sites

Routing, spatial and maps

Recommendation engines

Unsuitable use cases

When application needs to scale horizontally

Graph Database use cases - Credit card fraud, social media analysis, money laundering

Examples - Cosmos DB Gremlin API, Neo4j, Blazegraph, OrientDB, JanusGraph, Dgraph

Summary :

NoSQL means Not only SQL.
NoSQL databases have their roots in the open source community.
NoSQL database implementations are technically different from each other.
NoSQL databases suit a range of use cases, including storing and retrieving session information and event logging for apps.
The four main categories of NoSQL database are Key-Value, Document, Wide Column, and Graph.
Key-Value NoSQL databases are the least complex architecturally.
Document-based NoSQL databases use documents to make values visible for queries.
In document-based NoSQL databases, each piece of data is considered a document, which is typically stored in either JSON or XML format.
Column-based databases spawned from the architecture of Google’s Bigtable storage system.
The primary use cases for column-based NoSQL databases are event logging and blogs, counters, and data with expiration values.
Graph databases store information in entities (or nodes) and relationships (or edges).
