September 1, 2015

What is MongoDB?

In this blog post, I’ll provide an overview of MongoDB and some of its key features. MongoDB is an open-source document database that provides high performance, high availability, and automatic scaling.

Its document-oriented data model makes it easier to split up data across multiple servers. MongoDB automatically takes care of balancing data and load across a cluster, redistributing documents automatically, and routing user requests to the correct machines.

What is Semi-Structured Data?

Semi-structured data is data that does not conform to the formal structure of the data models used by relational databases, such as fixed tables and columns. In the semi-structured model, there is no separation between the data and the schema: each record carries its own structure.
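
As a rough illustration, the sketch below uses plain JavaScript objects, the same notation the Mongo shell uses. The records, their field names, and the fieldNames helper are all made up for this example; the only point is that each record describes its own structure.

```javascript
// Two records in the same (hypothetical) collection. Each one carries its own
// field names, so there is no separate schema that both must conform to.
const people = [
  { name: "Ann", email: "ann@example.com" },
  { name: "Bob", phones: ["555-0100", "555-0101"], address: { city: "Pune" } }
];

// Hypothetical helper: collect the field names that are actually in use.
function fieldNames(records) {
  const names = new Set();
  for (const record of records) {
    Object.keys(record).forEach((key) => names.add(key));
  }
  return [...names].sort();
}

console.log(fieldNames(people)); // [ 'address', 'email', 'name', 'phones' ]
```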

What is a Document-Oriented Database?

A document-oriented database is software designed for storing and retrieving information in the form of semi-structured data, also known as documents. The document replaces the concept of a “row.” By allowing embedded documents and arrays, the document-oriented approach makes it possible to represent complex hierarchical relationships with a single record.

Key Features

  1. High Performance: MongoDB provides high-performance data persistence. It supports embedded data models that reduce I/O activity on the database system, as well as indexes that make queries faster and can include keys from embedded documents and arrays
  2. High Availability: To provide high availability, MongoDB’s replication facility (known as replica sets) provides both automatic failover and data redundancy. A replica set is a group of MongoDB servers that maintain the same data set
  3. Automatic Scaling: MongoDB provides horizontal scalability as part of its core functionality. Automatic sharding distributes data across a cluster of machines, while replica sets can provide eventually-consistent reads for low-latency deployments

Core Components of MongoDB

  • Mongod: As the primary daemon process for the MongoDB system, Mongod handles data requests, manages data access, and performs background management operations
  • Mongos: Mongos is the routing service for sharded MongoDB configurations. This component processes queries from the application layer and determines the data’s location in the sharded cluster in order to complete the requested operations. From the perspective of the application, a Mongos instance behaves identically to any other MongoDB instance
  • Mongo: Mongo is an interactive JavaScript shell interface for MongoDB and provides an interface to test queries and operations directly within the database. Mongo also provides a fully functional JavaScript environment to use with MongoDB

Document in MongoDB

A record in MongoDB is a document, which is a data structure composed of field and value pairs. The values of the fields may include other documents, arrays, and arrays of documents. A group of documents is a collection, which is analogous to a table in an RDBMS.

BSON and MongoDB

BSON is a binary-encoded serialization of JSON-like documents and is designed to be lightweight, traversable, and efficient. BSON, like JSON, supports the embedding of objects and arrays within other objects and arrays. MongoDB uses BSON as the data storage and network transfer format for its documents.
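
To make the binary layout more concrete, here is a minimal encoder sketch for Node.js. It is not how MongoDB or its drivers implement BSON: the encodeBson function is invented for this post and handles only string and 32-bit integer values, following the public BSON specification (a length prefix, typed elements, and a terminating zero byte).

```javascript
// Minimal BSON encoder sketch: supports only string and int32 field values.
function encodeBson(doc) {
  const parts = [];
  for (const [key, value] of Object.entries(doc)) {
    const name = Buffer.from(key + "\0", "utf8"); // field name as a C string
    if (typeof value === "string") {
      const text = Buffer.from(value + "\0", "utf8");
      const length = Buffer.alloc(4);
      length.writeInt32LE(text.length); // byte length, including the trailing NUL
      parts.push(Buffer.from([0x02]), name, length, text); // 0x02 = string
    } else if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
      const number = Buffer.alloc(4);
      number.writeInt32LE(value);
      parts.push(Buffer.from([0x10]), name, number); // 0x10 = int32
    } else {
      throw new TypeError("unsupported value type in this sketch: " + key);
    }
  }
  const body = Buffer.concat(parts);
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // size prefix + elements + terminator
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const bytes = encodeBson({ hello: "world" });
console.log(bytes.length); // 22
console.log(bytes.toString("hex"));
```

Because every document and string starts with its own length, a reader can skip over fields it does not need, which is what makes the format traversable.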

Sample Representation of a Document in MongoDB

[Figure: Document]

[Figure: Referenced Documents]

[Figure: Embedded Documents]
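
The three shapes named above can be approximated with plain JavaScript objects. This is only a sketch: the Title and Genre fields come from the examples later in this post, while reviews, movie_id, stars, and the reviewsFor helper are hypothetical names chosen for illustration.

```javascript
// A single document: field/value pairs, including an array.
const movie = { _id: 1, Title: "Titanic", Genre: ["Action", "Family"] };

// Referenced documents: the review stores only the _id of the movie it belongs to.
const reviews = [{ _id: 100, movie_id: 1, stars: 5 }];

// Embedded documents: the reviews live inside the movie document itself.
const movieWithReviews = {
  _id: 1,
  Title: "Titanic",
  reviews: [{ stars: 5 }]
};

// Resolving a reference takes a second lookup (done here in memory);
// the embedded form returns everything in one read.
function reviewsFor(movieDoc, allReviews) {
  return allReviews.filter((review) => review.movie_id === movieDoc._id);
}

console.log(reviewsFor(movie, reviews).length); // 1
console.log(movieWithReviews.reviews.length);   // 1
```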

 

DB Objects Comparison

SQL Objects        MongoDB Objects
Database (schema)  Database
Table              Collection
Index              Index
Row                Document
Column             Field
Joining            Linking & Embedding
Partition          Shard

Replication in MongoDB

Replication is the practice of keeping identical copies of data on multiple servers to keep applications running and data safe. There are two set-up designs within MongoDB: a Replica Set with Replication Cluster and a Replica Set with Arbiter. All replica set members send heartbeats (pings) to each other every two seconds. If a heartbeat does not return within ten seconds, the other servers mark the unresponsive server as inaccessible.

  • Replica Set with Replication Cluster: A cluster of MongoDB servers that implements primary-secondary replication with automated failover. When the primary server becomes unavailable, the remaining members hold an election to choose one of the secondary servers as the new primary server.

[Figure: Replica Set with Replication Cluster]

  • Replica Set with Arbiter: An arbiter is a Mongod instance that is a member of the replica set but does not hold data. The arbiter participates in elections as a tie-breaker if a replica set has an even number of servers

[Figure: Replica Set with Arbiter]
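
The heartbeat timeout and the arbiter’s tie-breaking role can be sketched in a few lines of JavaScript. This is a simplified model and not MongoDB’s actual election protocol: the function names and member objects are invented, and the rule that an election needs a strict majority of all voting members is an assumption added here that the text above does not spell out.

```javascript
const HEARTBEAT_TIMEOUT_MS = 10000; // from the text: ten seconds without a reply

// Members whose last heartbeat reply is older than the timeout count as inaccessible.
function reachableMembers(members, nowMs) {
  return members.filter((m) => nowMs - m.lastHeartbeatMs <= HEARTBEAT_TIMEOUT_MS);
}

// Assumption: a new primary needs votes from a strict majority of all members.
function canElectPrimary(members, nowMs) {
  const votes = reachableMembers(members, nowMs).length;
  return votes > members.length / 2;
}

const now = 60000;
const twoMembers = [
  { name: "primary", lastHeartbeatMs: 40000 }, // silent for 20 s: considered down
  { name: "secondary", lastHeartbeatMs: 59000 }
];
const withArbiter = [...twoMembers, { name: "arbiter", lastHeartbeatMs: 59500 }];

console.log(canElectPrimary(twoMembers, now));  // false: 1 of 2 votes is not a majority
console.log(canElectPrimary(withArbiter, now)); // true: 2 of 3 votes is a majority
```

With only two servers, the surviving secondary cannot reach a majority on its own, which is why an arbiter is added to an even-sized set.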

 

Sharding in MongoDB

MongoDB’s sharding is the ability to break up a collection into subsets of data and store them across multiple shards. This allows the application to grow beyond the resource limits of a standalone server or replica set.

Basic Recommended Sharded Cluster for MongoDB

[Figure: sharded cluster]

 

  • Sharded cluster: The router (Mongos process) acts as a gatekeeper for all requests coming from the client. It resolves the requests by using the configuration servers to get the metadata information of each shard, then routes the query to the appropriate shards, and finally combines the results given by the shard(s) to return to the client
  • Router in sharded cluster: As shown above, more than one Mongos process can run as a router. When more than one router is used, all of them should be placed behind a load balancer for an efficient production environment
  • Configuration server in sharded cluster: A Mongod process runs as a configuration server that stores the cluster’s metadata. The router uses this metadata to decide which shard each piece of data should be read from
  • Shard in sharded cluster: Each shard is essentially a replica set that holds a portion of data. To avoid losing any data, it maintains a copy of its data on all secondary servers. In this sense, a collection in MongoDB is stored in multiple parts across multiple shards; to access the full collection, the data must be retrieved from all shards, as shown below

[Figure: collection stored across multiple shards]
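
The routing steps described in the list above can be modelled in memory. The sketch below is an illustration under simplifying assumptions, not how Mongos is implemented: the chunk ranges, the shard names, the lowercase Title shard key, and both functions are hypothetical.

```javascript
// Hypothetical metadata, as a configuration server might hold it: each chunk
// covers a range of shard key values [min, max) and lives on one shard.
const chunks = [
  { min: "a", max: "m", shard: "shard0" },
  { min: "m", max: "{", shard: "shard1" } // "{" sorts just after "z"
];

// Per-shard data, standing in for the replica set behind each shard.
const shards = {
  shard0: [{ Title: "focus" }, { Title: "fright night" }],
  shard1: [{ Title: "titanic" }]
};

// Targeted query: use the metadata to pick the one shard that can hold the key.
function findByTitle(title) {
  const chunk = chunks.find((c) => title >= c.min && title < c.max);
  return chunk ? shards[chunk.shard].filter((d) => d.Title === title) : [];
}

// Query without the shard key: ask every shard and merge the results.
function findAll() {
  return Object.values(shards).flat();
}

console.log(findByTitle("titanic")); // [ { Title: 'titanic' } ]
console.log(findAll().length);       // 3
```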

 

Difference Between Shard and Replication

Sharding partitions the data and stores it across multiple servers. This increases the hardware capacity of the cluster, because resources are no longer limited to a single machine. Replication, on the other hand, keeps full duplicate copies of the data so that the data remains available in the event of a hardware failure.

Default Ports for MongoDB

Default Port  Description
27017         The default port for Mongod and Mongos instances. Change this port with the port setting or the --port option
27018         The default port when running with the --shardsvr runtime option or the shardsvr value for the clusterRole setting in a configuration file
27019         The default port when running with the --configsvr runtime option or the configsvr value for the clusterRole setting in a configuration file
28017         The default port for the web status page. The web status page is always accessible at a port number that is 1000 greater than the port determined by the port setting
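
The relationship between a process port and its web status port is simple arithmetic. As a small sketch (the DEFAULT_PORTS object, its key names, and the webStatusPort function are names chosen for this example):

```javascript
// Default ports from the table above, keyed by how the process is started.
const DEFAULT_PORTS = { standalone: 27017, shardsvr: 27018, configsvr: 27019 };

// The web status page listens 1000 above the port the process itself uses.
function webStatusPort(port) {
  return port + 1000;
}

console.log(webStatusPort(DEFAULT_PORTS.standalone)); // 28017
console.log(webStatusPort(DEFAULT_PORTS.shardsvr));   // 28018
```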

Install/Uninstall MongoDB on Red Hat Linux

Use the following steps to set up a single-machine MongoDB database server for a developer/test environment.

Mongo Installation on Linux

  1. Copy the following command and run it on a Linux shell to create a yum repository for Mongo V3.0
    echo '[mongodb-org-3.0]
    name=MongoDB Repository
    baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/3.0/x86_64/
    gpgcheck=0
    enabled=1' >/etc/yum.repos.d/mongodb-org-3.0.repo
  2. Run the following command to install Mongo
    sudo yum install -y mongodb-org
  3. Run the following command to open the mongod.conf file in the nano editor. Set the bind_ip attribute to the Linux machine’s IP address, then press [Ctrl+X], then [Y], then [Enter] to save the change and exit
    nano /etc/mongod.conf
  4. Run the following commands to start the Mongod service and connect to MongoDB
    sudo service mongod start
    mongo <IPADDRESS OF LINUX MACHINE>:27017
    

To uninstall Mongo, stop the service and remove the installed mongodb-org packages:

sudo service mongod stop
sudo yum erase $(rpm -qa | grep mongodb-org)

Important Files/Paths in Mongo

/var/lib/mongo Default Mongo database path
/etc/mongod.conf Mongo Configuration file
/var/run/mongodb/mongod.pid Mongo pid file path
/var/log/mongodb/mongod.log Mongo log path

Enable Authentication

Authentication is the process of proving the identity of a user. Use the following script to create two users, one with the root role and one with the readWrite role. The root role gives the user top-level permissions in MongoDB, while the readWrite role allows the user to perform CRUD operations on collections. Access control is only enforced once Mongod is started with authentication enabled (for example, with the --auth option). For more details on built-in roles in Mongo, visit the official reference manual.

  1. Root role user: Apply the following command in the Mongo shell to create a new user (“rootuser” with a password “12345”) who has a root role on the admin database
    use admin
    db.createUser({user:"rootuser",pwd:"12345", roles:[{role:"root",db:"admin"}]})
  2. readWrite user: Apply the following command in the Mongo shell to create a new user (“webuser” with password “12345”) who has the readWrite role on the application database (“companydb” used in this example)
    use companydb
    db.createUser({user:"webuser", pwd:"12345", roles:[ "readWrite" ]})
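
Once such a user exists, an application typically supplies the credentials in a connection string of the form mongodb://user:password@host:port/database. The helper below is a hedged sketch: buildMongoUri is a hypothetical function, and the host is just the example address used elsewhere in this post. It percent-encodes the credentials so that characters such as “@” do not break the URI.

```javascript
// Build a MongoDB connection string, percent-encoding the credentials.
function buildMongoUri({ user, password, host, port = 27017, database }) {
  const credentials = encodeURIComponent(user) + ":" + encodeURIComponent(password);
  return "mongodb://" + credentials + "@" + host + ":" + port + "/" + database;
}

console.log(buildMongoUri({
  user: "webuser", password: "12345", host: "192.168.1.10", database: "companydb"
})); // mongodb://webuser:12345@192.168.1.10:27017/companydb
```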

Basic MongoDB Commands

Run the following command in the Linux shell to connect to MongoDB with the Mongo shell:

mongo IPAddress:PORT (e.g. mongo 192.168.1.10:27017)

  1. Show the current database
    db
  2. Show all databases
    show dbs
    show databases
  3. Switch to a database, or create it if it does not exist
    use <databasename>
  4. Create a collection
    db.createCollection("Movies")
  5. Get collection names
    db.getCollectionNames()
    show collections
  6. Insert a single document or multiple documents into a collection
    db.Movies.insert({"Title": "Titanic", "LeadActor": "Leonardo", "LeadActress": "Kate Winslet", "Genre": ["Action", "Family"]})
    db.Movies.insert([
      {"Title": "Focus", "Genre": ["Action"]},
      {"Title": "Fright Night", "Genre": ["Horror"]}
    ])
  7. Update document(s) in the “Movies” collection. By default, Mongo updates only the first document that matches the search criteria. To update every matching document, pass {multi: true} as in the second command
    db.Movies.update({"Title": "Focus"}, {$set: {"Genre": ["Action", "Drama"]}})
    db.Movies.update({"Title": "Focus"}, {$set: {"Genre": ["Action", "Drama"]}}, {multi: true})
  8. Insert or update a document
    db.COLLECTION_NAME.save()
  9. Remove documents that fit certain search criteria
    db.COLLECTION_NAME.remove()
  10. Drop a collection
    db.COLLECTION_NAME.drop()
  11. Find documents that fit specific search criteria
    db.COLLECTION_NAME.find()
    db.Movies.find({"Title": "Titanic"})
  12. Find the first document that fits specific search criteria
    db.COLLECTION_NAME.findOne()
  13. Limit the number of documents displayed that fit certain search criteria
    db.COLLECTION_NAME.find().limit(1)
  14. Skip a certain number of documents in the output
    db.COLLECTION_NAME.find().limit(10).skip(1)
  15. Display formatted output
    db.COLLECTION_NAME.find().limit(1).pretty()
  16. Return a document that reports the arguments or configuration options used to start the Mongod or Mongos instance
    db.serverCmdLineOpts()
  17. Sort documents by any field. Use 1 for ascending order and -1 for descending order
    db.COLLECTION_NAME.find().sort({"FIELDNAME": 1})
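
To show how the update and find commands behave without needing a running server, the sketch below emulates them on a plain array. It is a simplified model rather than MongoDB’s implementation: the update and find functions and their option names are invented, queries match on top-level fields by simple equality only, and the evaluation order (sort, then skip, then limit) is stated as an assumption about how the shell applies these cursor methods.

```javascript
// In-memory stand-in for a collection (titles taken from the examples above).
const movies = [
  { Title: "Titanic", Genre: ["Action", "Family"] },
  { Title: "Focus", Genre: ["Action"] },
  { Title: "Focus", Genre: ["Action"] },
  { Title: "Fright Night", Genre: ["Horror"] }
];

// True when every field in the query equals the same field in the document.
const matches = (doc, query) => Object.keys(query).every((k) => doc[k] === query[k]);

// Emulates update(query, {$set: ...}, {multi}): first match only unless multi is true.
// Returns how many documents this sketch wrote to.
function update(docs, query, set, { multi = false } = {}) {
  let touched = 0;
  for (const doc of docs) {
    if (!matches(doc, query)) continue;
    Object.assign(doc, set);
    touched += 1;
    if (!multi) break;
  }
  return touched;
}

// Emulates find(query).sort(...).skip(n).limit(m): sort first, then skip, then limit.
function find(docs, query, { sortBy, direction = 1, skip = 0, limit = Infinity } = {}) {
  const found = docs.filter((doc) => matches(doc, query));
  if (sortBy) {
    found.sort((a, b) => (a[sortBy] < b[sortBy] ? -1 : a[sortBy] > b[sortBy] ? 1 : 0) * direction);
  }
  return found.slice(skip, skip + limit);
}

console.log(update(movies, { Title: "Focus" }, { Genre: ["Action", "Drama"] })); // 1
console.log(movies[2].Genre); // [ 'Action' ]  (second "Focus" document is untouched)
console.log(update(movies, { Title: "Focus" }, { Genre: ["Action", "Drama"] }, { multi: true })); // 2
console.log(find(movies, {}, { sortBy: "Title", skip: 1, limit: 2 }).map((m) => m.Title));
// [ 'Focus', 'Fright Night' ]
```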

HTTP Status Page

MongoDB provides a web interface that exposes diagnostic and monitoring information on a simple webpage. By default, the web interface is available on port 28017: http://IP-Address:28017. If the Mongod process is not running on its default port, add 1000 to the Mongod process port.

Configuration to Enable HTTP Status Page

Run the following command in the Linux shell to open mongod.conf. Then find the httpinterface and rest keys and set both values to “true.”

nano /etc/mongod.conf

.NET Interaction with MongoDB (3.0.5)

MongoDB drivers for .NET can be downloaded here. Below is basic sample C# code for interacting with MongoDB. It writes a document to a Mongo collection and then reads matching documents back from that collection.

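using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using MongoDB.Bson;
using MongoDB.Driver;

class Program
    {
        static void Main(string[] args)
        {
            // Block until each asynchronous operation completes
            MongoSample mongo = new MongoSample();
            mongo.WriteToCollection().Wait();
            mongo.ReadFromCollection().Wait();
        }
    }

    // Named MongoSample so that the class does not clash with the driver's MongoDB namespace
    public class MongoSample
    {
        IMongoDatabase DB;
        public MongoSample()
        {
            // Connect to the server and select the "dnc" database
            string MongoConnectionString = "mongodb://192.168.85.128:27017/dnc";
            IMongoClient _client = new MongoClient(MongoConnectionString);
            DB = _client.GetDatabase("dnc");
        }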

        // Inserts one document with two fields into the "Employee" collection
        public async Task WriteToCollection()
        {
            IMongoCollection<BsonDocument> _collection = DB.GetCollection<BsonDocument>("Employee");
            BsonDocument _document = new BsonDocument();
            BsonElement _elementEmpCode = new BsonElement("EmpCode", "1001");
            BsonElement _elementEmpName = new BsonElement("EmpName", "Goofy");
            _document.Add(_elementEmpCode);
            _document.Add(_elementEmpName);
            await _collection.InsertOneAsync(_document);
        }

        // Finds all documents whose EmpCode is "1001" and prints every field
        public async Task ReadFromCollection()
        {
            IMongoCollection<BsonDocument> _collection = DB.GetCollection<BsonDocument>("Employee");
            FilterDefinition<BsonDocument> filter = Builders<BsonDocument>.Filter.Eq("EmpCode", "1001");
            List<BsonDocument> Result = await _collection.Find(filter).ToListAsync();
            foreach (var document in Result)
            {
                List<BsonElement> element = document.Elements.ToList();
                element.ForEach(delegate(BsonElement obj)
                {
                    Console.WriteLine(obj.Name + ":" + obj.Value);
                });
            }
        }
    }