Sharding Methods For MongoDB

Sharding Introduction

Sharding is a method for storing data across multiple machines. Sharding, or horizontal scaling, divides the data set and distributes the data over multiple servers, or shards. Each shard is an independent database, and collectively the shards make up a single logical database.

Sharding in MongoDB

MongoDB supports sharding through the configuration of sharded clusters.

Components of a Sharded Cluster:

Shards
Query Routers
Config Servers

Shards store the data. To provide high availability and data consistency, each shard in a production sharded cluster is a replica set.
Query Routers, or mongos instances, interface with client applications and direct operations to the appropriate shard or shards. The query router processes and targets operations to shards and then returns results to the clients. A sharded cluster can contain more than one query router to divide the client request load, and most sharded clusters have many query routers. A client sends requests to one query router.
Config servers store the cluster’s metadata. This data contains a mapping of the cluster’s data set to the shards. The query router uses this metadata to target operations to specific shards. In older MongoDB releases, production sharded clusters have exactly 3 config servers; from MongoDB 3.4 onward, config servers are deployed as a replica set.
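
As a rough illustration of how a query router might use that metadata, here is a minimal Python sketch. It is not MongoDB code: the chunk table, the shard names and the `target_shard` function are all hypothetical, and the real metadata format is more involved.

```python
from bisect import bisect_right

# Hypothetical cluster metadata, of the kind config servers hold:
# each chunk covers shard key values in [lower, upper) and lives on one shard.
CHUNKS = [
    (0, 100, "shardA"),
    (100, 200, "shardB"),
    (200, 300, "shardA"),
    (300, 400, "shardC"),
]
LOWER_BOUNDS = [lower for lower, _, _ in CHUNKS]


def target_shard(key):
    """Return the shard that owns the chunk containing `key`."""
    # Find the last chunk whose lower bound is <= key.
    index = bisect_right(LOWER_BOUNDS, key) - 1
    if index < 0:
        raise KeyError(key)
    lower, upper, shard = CHUNKS[index]
    if not (lower <= key < upper):
        raise KeyError(key)
    return shard
```

With this table, `target_shard(150)` resolves to `"shardB"`, so an operation on that key would be sent to a single shard instead of all of them.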

Data Partitioning

MongoDB distributes data, or shards, at the collection level. Sharding partitions a collection’s data by the shard key.

A shard key is either an indexed field or an indexed compound field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes the chunks evenly across the shards.

Range Based Sharding
Hash Based Sharding
Customized Data Distribution with Tag Aware Sharding

For range-based sharding, MongoDB divides the data set into ranges determined by the shard key values.
For hash-based sharding, MongoDB computes a hash of a field’s value, and then uses these hashes to create chunks.
For customized data distribution, MongoDB allows administrators to direct the balancing policy using tag aware sharding (called zones in MongoDB 3.4 and later).
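
The choice between a ranged and a hashed shard key is made when a collection is sharded. The sketch below only builds the documents for the `enableSharding` and `shardCollection` administrative commands as plain Python dicts. The helper functions, the `shop.orders` and `shop.users` namespaces and the field names are hypothetical. Sending the documents needs a driver connected to a mongos, which is not shown here.

```python
def enable_sharding_command(database):
    """Command document that enables sharding for a database."""
    return {"enableSharding": database}


def shard_collection_command(namespace, field, hashed=False):
    """Command document that shards `namespace` on `field`.

    A value of 1 requests a ranged shard key; "hashed" requests a hashed one.
    """
    return {"shardCollection": namespace, "key": {field: "hashed" if hashed else 1}}


# Hypothetical namespaces and fields, for illustration only.
ranged = shard_collection_command("shop.orders", "order_date")
hashed = shard_collection_command("shop.users", "user_id", hashed=True)
```

Here `ranged` asks for range-based partitioning on `order_date`, while `hashed` asks for hash-based partitioning on `user_id`.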

Characteristics of Data Partitioning

Range Based Sharding

Supports more efficient range queries
Given a range query on the shard key, the query router can easily determine which chunks overlap that range
The router sends the query to only those shards that contain these chunks
Can result in an uneven (unbalanced) distribution of data
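
Both sides of this trade-off can be seen in a small simulation. The following is a sketch under stated assumptions, not MongoDB's implementation: the chunk boundaries, shard names and helper functions are made up for illustration.

```python
from collections import Counter

# Hypothetical range chunks: (lower, upper, shard); the upper bound is exclusive.
RANGE_CHUNKS = [
    (0, 250, "shardA"),
    (250, 500, "shardB"),
    (500, 750, "shardC"),
    (750, 1000, "shardD"),
]


def shards_for_range(start, end):
    """Shards whose chunks overlap the query range [start, end)."""
    return sorted({shard for lower, upper, shard in RANGE_CHUNKS
                   if lower < end and start < upper})


def shard_for_key(key):
    """Shard owning the chunk that contains `key`."""
    for lower, upper, shard in RANGE_CHUNKS:
        if lower <= key < upper:
            return shard
    raise KeyError(key)


# A narrow range query touches only the shards holding overlapping chunks.
narrow_query = shards_for_range(200, 300)

# Keys that only ever increase (a counter, for example) all land in the last range.
recent_inserts = Counter(shard_for_key(key) for key in range(900, 1000))
```

In this setup the query for keys 200 to 299 involves only two of the four shards, while all 100 of the most recent inserts pile up on `shardD`.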

Hash Based Sharding

Provides an even distribution of data
Comes at the expense of efficient range queries
Hashed key values result in a random distribution of data across chunks and shards
A range query on the shard key usually cannot target a few shards
Instead, the router must query every shard in order to return a result
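
The same kind of simulation shows the opposite behaviour for hashed keys. This sketch makes two simplifying assumptions that should not be read as MongoDB's method: SHA-256 stands in for the real hash function, and a modulo over four hypothetical shards stands in for chunks of hash ranges.

```python
import hashlib
from collections import Counter

SHARDS = ["shardA", "shardB", "shardC", "shardD"]  # hypothetical shard names


def hashed_shard(key):
    """Map a key to a shard via a hash of its value (illustrative only)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# 100 consecutive keys are scattered across the shards ...
placement = Counter(hashed_shard(key) for key in range(900, 1000))

# ... so a range query over those same keys has to contact every shard it hit.
shards_hit_by_range = {hashed_shard(key) for key in range(900, 1000)}
```

Consecutive keys no longer sit next to each other, which spreads the write load but leaves the router unable to narrow a range query down to a few shards.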

Customized Data Distribution with Tag Aware Sharding

Administrators create tags and associate them with ranges of the shard key
Administrators then assign those tags to the shards
The balancer migrates tagged data to the appropriate shards
The cluster always enforces the distribution of data that the tags describe
Tag aware sharding serves to improve the locality of data for sharded clusters that span multiple data centers
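
The relationship between tag ranges, shard tags and chunk placement can be sketched as follows. The tag names, ranges, shard names and both functions are hypothetical, and the rule for untagged chunks is a simplification.

```python
# Hypothetical tag ranges on the shard key: [lower, upper) -> tag.
TAG_RANGES = [
    (0, 500, "EU"),
    (500, 1000, "US"),
]

# Hypothetical tag assignments per shard.
SHARD_TAGS = {
    "shardA": {"EU"},
    "shardB": {"EU"},
    "shardC": {"US"},
}


def tag_for_chunk(lower, upper):
    """Return the tag whose range fully contains the chunk, or None."""
    for tag_lower, tag_upper, tag in TAG_RANGES:
        if tag_lower <= lower and upper <= tag_upper:
            return tag
    return None


def eligible_shards(lower, upper):
    """Shards allowed to hold the chunk [lower, upper)."""
    tag = tag_for_chunk(lower, upper)
    if tag is None:
        # Simplification: a chunk outside every tag range may live on any shard.
        return sorted(SHARD_TAGS)
    return sorted(shard for shard, tags in SHARD_TAGS.items() if tag in tags)
```

A balancer following these rules would only ever move the chunk `[600, 700)` to `shardC`, which is how tagged data stays in the intended data center.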

Balancing In Data Distribution

Balancing matters when data becomes unevenly distributed within the cluster, for example when a particular shard contains significantly more chunks than another shard, or when a chunk is significantly larger than the other chunks. MongoDB keeps the cluster balanced with two background processes:

Splitting
Balancer

Splitting

Splitting is a background process that keeps chunks from growing too large
When a chunk grows beyond a specified chunk size, MongoDB splits the chunk in half
Splits are an efficient metadata change
To create splits, MongoDB does not migrate any data or affect the shards
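
To show why a split is only a metadata change, here is a minimal sketch. It assumes a chunk is described by its boundaries and owning shard, and it counts documents instead of measuring bytes as MongoDB does. The `split_chunk` function and the `max_docs` threshold are hypothetical.

```python
def split_chunk(chunk, keys, max_docs=4):
    """Split `chunk` at the median key when it holds more than `max_docs` keys.

    `chunk` is a dict with "min", "max" (exclusive) and "shard".
    Only the chunk boundaries change; both halves stay on the same shard.
    """
    in_chunk = sorted(k for k in keys if chunk["min"] <= k < chunk["max"])
    if len(in_chunk) <= max_docs:
        return [chunk]
    split_point = in_chunk[len(in_chunk) // 2]
    if split_point == chunk["min"]:
        # All keys share the lowest value, so no useful split point exists.
        return [chunk]
    left = {"min": chunk["min"], "max": split_point, "shard": chunk["shard"]}
    right = {"min": split_point, "max": chunk["max"], "shard": chunk["shard"]}
    return [left, right]
```

Splitting the chunk `[0, 100)` that holds the keys 5, 10, 20, 40, 60 and 80 yields `[0, 40)` and `[40, 100)`, both still on the original shard. No document moves.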

Balancer

The balancer is a background process that manages chunk migrations
In older MongoDB releases, the balancer can run from any of the query routers in a cluster; from MongoDB 3.4 onward, it runs on the primary of the config server replica set
If there’s an error during the migration, the balancer aborts the process, leaving the chunk unchanged on the origin shard
MongoDB removes the chunk’s data from the origin shard after the migration completes successfully
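
The idea behind the balancer can be sketched as a loop that moves one chunk at a time from the most loaded shard to the least loaded one. This is a toy model, not MongoDB's algorithm: it tracks only which shard owns each chunk, it skips the copy and cleanup steps of a real migration, and the `threshold` value and function names are hypothetical.

```python
def balance_once(chunk_owner, shards, threshold=2):
    """Move one chunk from the fullest shard to the emptiest one if needed.

    `chunk_owner` maps a chunk id to the shard that holds it.
    Returns the (chunk, origin, destination) moved, or None if balanced.
    """
    counts = {shard: 0 for shard in shards}
    for shard in chunk_owner.values():
        counts[shard] += 1
    origin = max(shards, key=lambda s: counts[s])
    destination = min(shards, key=lambda s: counts[s])
    if counts[origin] - counts[destination] < threshold:
        return None
    chunk = next(c for c, s in chunk_owner.items() if s == origin)
    chunk_owner[chunk] = destination  # ownership changes only when the move succeeds
    return chunk, origin, destination


def balance(chunk_owner, shards, threshold=2):
    """Repeat single-chunk moves until no pair of shards differs by `threshold`."""
    moves = []
    while (move := balance_once(chunk_owner, shards, threshold)) is not None:
        moves.append(move)
    return moves
```

Starting with four chunks on one shard, one on a second and none on a third, this loop performs two moves and stops once the chunk counts differ by at most one.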

More details on sharding are available in the official MongoDB documentation.
