C100DBA MongoDB exam sample questions and answers

C100DBA is the MongoDB DBA certification track. To prepare for the C100DBA examination, start with the following MongoDB certification exam questions


1. What is the use of mongodump and mongorestore tools?

a. replicate mongodb deployments

b. performance tune mongodb deployment

c. backup mongodb deployment

d. audit mongodb deployment

Answer : c

Explanation: These tools are used to back up and restore simple MongoDB deployments.

2. In what format does mongodump create backup files?

a. XML

b. JSON

c. BSON

d. SOAP

Answer : c

Explanation: mongodump is a simple backup utility that reads data from a MongoDB database and writes the backup as BSON files

3. Which mongodb tool is used to report details on number of database operations in MongoDB?

a. mongodump

b. mongostat

c. mongorestore

d. mongotop

Answer : b

Explanation : MongoDB comes with a set of reporting tools. Among these, mongostat reports counts of database operations such as inserts, updates, deletes and queries. This shows how load is distributed across operation types on the server

4. Which command is used to determine storage capacity of specific database?

a. mongotop

b. mongostat

c. dbstats

d. collstats

Answer : c

Explanation : dbStats is a MongoDB database command (db.stats() in the mongo shell) used to report the state and storage usage of a particular database

5. Mongodb does provide high availability via which option?

a) Sharding

b) Replication

c) Indexing

d) Journaling

Answer : b

Explanation: In MongoDB high availability is achieved using replica sets

6. What is the on-premise solution having functionality equivalent to cloud manager?

a. Journaling

b. Ops Manager

c. Service Manager

d. Replica Manager

Answer : b

Explanation: Ops Manager has functionality similar to Cloud Manager. Unlike MongoDB Cloud Manager, which is hosted in the cloud, Ops Manager is an on-premise solution that subscribers can install

7. Which mongodb tools allow us to work with our data in a human readable format?

a. mongodump

b. mongostat

c. mongoexport

d. mongoimport

Answer: c,d

Explanation : mongoexport and mongoimport tools help in working with data in extended JSON or CSV format

8. Which of the following nodes is used during an election in a replica set?

a. hidden

b. arbiter

c. primary

d. secondary

Answer: b

Explanation: Let us first understand what an arbiter is. An arbiter is a mongod instance that is part of a replica set but does not hold data. Because it holds no data, its hardware requirements are minimal. It does not need a separate server; it can be deployed alongside application servers, monitoring hosts and so on. A replica set should always have an odd number of voting members, so an arbiter is added to a set with an even number of data-bearing members to break ties. Arbiters vote in elections but can never be elected primary themselves

8. Which node in a replica set accepts write operations?

a. primary

b. arbiter

c. secondary

d. hidden

Answer : a

Explanation : In a replica set primary is the only member that can accept write operations

9. When is an election needed in a replica set?

a. secondary becomes unavailable

b. arbiter is unavailable

c. primary is unavailable

d. hidden node

Answer : c

Explanation : An election in a replica set happens when the primary becomes unavailable. Elections let a replica set recover without manual intervention. In a replica set the primary is the only member that can accept write operations. If the primary becomes unavailable, the replica set is read only until a new primary is elected
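
The arithmetic behind an election can be sketched in a few lines of plain JavaScript. This is only an illustration of the majority rule (a candidate needs the votes of a strict majority of voting members); the function names are made up for this example and are not MongoDB APIs.

```javascript
// Votes needed to elect a primary: a strict majority of the voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be lost while an election can still succeed.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

// Print members, majority and fault tolerance for a few set sizes.
for (const n of [3, 4, 5]) {
  console.log(n, majorityNeeded(n), faultTolerance(n));
}
```

Going from 3 to 4 voting members raises the majority but not the fault tolerance, which is why an odd number of voting members is preferred.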

10. You are in a sharded cluster. What will you do prior to initiating backup in sharded cluster?

a. db.stopserver()

b. sh.stopserver()

c. db.stopBalancer()

d. sh.stopBalancer()

Answer : d

Explanation : sh.stopBalancer(timeout, interval) disables the balancer in a sharded cluster. timeout is the time limit for disabling the balancer and defaults to 60000 milliseconds. interval is how often to check whether the current balancing round has stopped, also in milliseconds

11. You have a replicated cluster with 1 primary, 3 secondary, 1 arbiter. One of the secondary is hidden. What is the replication factor of this replicated cluster?

a. 6

b. 7

c. 4

d. 3

Answer : c

12. In mongodb how do you update a document partially?

a. $project

b. $update

c. $set

d. $modify

Answer : c

Explanation : The $set operator replaces the value of a field with the specified value

13) Which operations add new documents to a collection?

a) Create

b) insert

c) update

d) delete

Answer: a,b

14) You have designed a web application with MongoDB. You have configured replication. The replica set is in place and functioning properly. What happens in case of failure?

a) Failover happens automatically

b) Failover needs to be done manually

c) Switchover happens automatically

d) Switchover needs to be done manually

Answer: a

15) Which mongodb tool is used to report details on number of database operations in MongoDB?

a) mongodump

b) mongorestore

c) mongostat

d) mongotop

Answer: c

16) Which mongodb tools allow us to work with our data in a human readable format?

a) mongodump

b) mongostat

c) mongoexport

d) mongoimport

Answer: c,d

17) What is the on-premise solution having functionality equivalent to cloud manager?

a) Journaling

b) Ops Manager

c) Service Manager

d) Replica Manager

Answer: b

Explanation : Ops Manager has functionality similar to Cloud Manager. Unlike MongoDB Cloud Manager, which is hosted in the cloud, Ops Manager is an on-premise solution that subscribers can install

18) You are comparing values of different BSON types in mongodb. You want to compare from lowest to highest. Which comparison order is used?

a) MinKey, Null, Numbers,Symbol, String,Object,Array,BinData

b) MinKey, Null, Numbers,Object,Array,BinData,Symbol, String

c) Object,Array,BinData,Symbol, String,MinKey, Null, Numbers

d) Object,Array,BinData,Symbol,MinKey, Null, Numbers,String

Answer: a
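
The comparison order can be illustrated with a small plain-JavaScript sketch that ranks BSON type names and sorts by that rank. The type names are plain strings chosen for this example (not driver classes), Symbol is treated as sharing a rank with String, and values of the same type are not compared further.

```javascript
// BSON types in MongoDB's cross-type comparison order, lowest first.
const BSON_ORDER = [
  "MinKey", "Null", "Number", "String", "Object", "Array", "BinData",
  "ObjectId", "Boolean", "Date", "Timestamp", "RegExp", "MaxKey",
];

// Rank of a type name; Symbol sorts together with String.
function typeRank(typeName) {
  const name = typeName === "Symbol" ? "String" : typeName;
  const rank = BSON_ORDER.indexOf(name);
  if (rank < 0) throw new Error("unknown type: " + typeName);
  return rank;
}

// Return a new array of type names sorted from lowest to highest.
function sortByBsonType(typeNames) {
  return [...typeNames].sort((a, b) => typeRank(a) - typeRank(b));
}

const sorted = sortByBsonType(
  ["BinData", "String", "MinKey", "Array", "Number", "Null", "Object"]
);
console.log(sorted.join(", "));
```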

Mongodb certification exam questions will help you prepare and crack the mongodb certification

1) How do you monitor mongodb instances?

a) mongodb utilities

b) Ops manager

c) database commands

d) All of the above

Answer : d

Explanation : A MongoDB instance should first be monitored with the set of utilities that come pre-packaged with MongoDB. These are mainly used for reporting purposes. Database commands come in handy for details on current database statistics. In addition, MongoDB Cloud Manager (a cloud monitoring GUI) and Ops Manager (an on-premise install with features equivalent to Cloud Manager) help with visualization and real-time alerts from the database

2) How do you start mongod and mongos instances using config file?

a) mongod -f /etc/mongod.conf; mongos -f /etc/mongos.conf

b) mongod -a /etc/mongod.conf; mongos -a /etc/mongos.conf

c) mongod -h /etc/mongod.conf; mongos -h /etc/mongos.conf

d) mongod -s /etc/mongod.conf; mongos -s /etc/mongos.conf

Answer : a

Explanation : We can start mongod and mongos instances from command-line as well as config files. To make use of config file, we specify option -f

In MongoDB the mongod.lock file has more than zero bytes. What does this mean?

mongod.lock is the lock file found in the /data/db directory, the default dbpath of MongoDB. In a normal state this file is used internally by the mongod daemon. At times, when we try to start mongod, this file may already be non-zero in size.

This simply means that mongod had an abnormal shutdown, and an attempt to start the database in normal mode may leave it in an inconsistent state


How to fix this?

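Start mongod with the --repair option, optionally combined with --repairpath

mongod --repair copies the contents of the datafiles onto temporary files, repairs them and copies them back over the datafiles

mongod --repair --repairpath <directory> does the same but keeps the temporary repair files in the given directory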

Can we delete mongod.lock file?

The lock file can be removed as part of recovery, but the data may be inconsistent after an abnormal shutdown, so run mongod with the repair options afterwards

Give details on Mongodb professional paid support on community edition :-

MongoDB the most popular nosql database is gaining popularity and customer base. They are planning to go public real soon. They have recently changed their CEO who is an expert with IPO. This shows the vision of this company.

Currently MongoDB is an open-source software that can be downloaded for free. MongoDB Enterprise is catered towards enterprises

MongoDB has recently announced a bold move on supporting community edition on paid basis called as mongodb professional

Give details on MongoDB Write operation protocol enhancement :-

MongoDB modifies documents via write operations. This includes the usual operations like insert, update, delete and bulk insert. In versions before MongoDB 2.6, a client that wanted the status of a write had to request it separately after issuing the write. Reporting the status is made possible using write concern. Starting with MongoDB 2.6, write concern is integrated with the write operation as part of the enhanced write protocol. This eliminates the need for clients connecting to the MongoDB server to call functions like db.getLastError() after each write

How does MongoDB make use of Journaling to handle instance crash?

MongoDB journaling is a write-ahead redo log written to journal files, which are used to recover the database datafiles after a mongod daemon crash. When journaling is enabled, MongoDB creates a journal sub-folder within the dbPath folder, the default location for storing datafiles.

MongoDB lets a journal file grow up to 1GB in size

After hitting this size limit a new journal file is created

The journal directory holds journal files and a last-sequence-number file

Journal files are append-only files. Their names start with j._

Using mongod's storage.smallFiles runtime option we can limit the size of a journal file to 128MB

After a crash the mongod instance uses the journal files and the last-sequence-number file to perform crash recovery

Give details on mongodb index types a quick overview :-

MongoDB the most popular NoSQL database offers performance improvement via index usage. As with an RDBMS, indexes in MongoDB are of different types, and it makes good sense to choose the optimal index during the application design phase based on the needs of the application, data pattern, data distribution etc

By default the _id primary key in every collection has an internal index created on it. All indexes created by users are secondary indexes in a MongoDB environment. Here are some useful secondary indexes:

1) TTL Index - A Time to Live index is created with the expireAfterSeconds setting, given in seconds. Documents are purged from the collection once the indexed date field is older than this setting. For clickstreams and logs these indexes come in handy, as there may be no need to store the data indefinitely

2) Geospatial index - For applications using geographical latitude and longitude co-ordinates these indexes come in handy

3) Hashed index - A hash of the field value is generated and stored to reference the documents

4) Sparse index - A unique advantage of the document data model in MongoDB is that the schema of a collection can be altered on the fly. Hence not all documents contain all fields. For fields that are present in only a few documents a sparse index comes in handy

5) Compound indexes - Indexes can be created on a single field or on a combination of fields, and can be sorted based on the combination of values. They provide optimal performance particularly for filtered searches

6) Unique indexes - The value in a field can be made unique using unique indexes. This can be used to enforce unique constraints as well

7) Array indexes - Collections can contain fields that hold arrays. Each value in an array is stored as an index entry for optimal search

8) Text-search index - This is an advanced index option in MongoDB that uses linguistic rules for stemming, tokenization and stop words. A text search index can be created on a single field as well as on more than one field

A unique advantage of MongoDB over databases like MySQL is that all the index types can be created with a single storage engine and there is no need to switch storage engines
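
To make the TTL rule from item 1 concrete, here is a minimal plain-JavaScript sketch of the expiry check, assuming the expiry setting is expireAfterSeconds and the indexed field holds a date. The function isExpired and the sample document are hypothetical; in a real deployment the server performs the purge in the background, not application code.

```javascript
// Returns true when the document's date field is older than the TTL window.
// Documents whose field is missing or not a Date are never treated as expired.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false;
  return value.getTime() + expireAfterSeconds * 1000 < now.getTime();
}

// Hypothetical log entry created at midnight UTC, with a one hour TTL.
const logEntry = { msg: "click", createdAt: new Date("2014-01-01T00:00:00Z") };

console.log(isExpired(logEntry, "createdAt", 3600, new Date("2014-01-01T00:30:00Z")));
console.log(isExpired(logEntry, "createdAt", 3600, new Date("2014-01-01T02:00:00Z")));
```

The first call is inside the one hour window and the second is outside it.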

Give details on MongoDB _id primary key :-

MongoDB the leading NoSQL database stores data in collections, which are the equivalent of tables in an RDBMS. A row in an RDBMS becomes a document in MongoDB. Every document in MongoDB is uniquely identified by a primary key held in the _id field. Let us take a quick overview of what the _id is, how it is made and what it means

1) _id is the primary key in any collection that uniquely identifies a document

2) This can be inserted by user (or) system generated

3) When system generated, this is a 12-byte ObjectId, displayed as a 24-character hexadecimal string

4) First 4 bytes represent the current timestamp, next 3 bytes represent the machine id, next 2 bytes represent the mongod server process id, and the last 3 bytes are a system-generated counter that simply increments

5) A unique index is internally created on the _id field [column equivalent in RDBMS]
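
The byte layout described above can be checked with a short plain-JavaScript sketch that splits a 24-character hexadecimal _id into its parts. The function decodeObjectId is a hypothetical helper written for this example and the sample value is just an illustrative id; the split follows the 4/3/2/3 byte layout given in the list.

```javascript
// Split a 24-character hex ObjectId string into the parts described above.
function decodeObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hexadecimal characters");
  }
  const seconds = parseInt(hex.slice(0, 8), 16); // first 4 bytes: Unix timestamp
  return {
    seconds: seconds,
    createdAt: new Date(seconds * 1000),
    machineId: hex.slice(8, 14),                  // next 3 bytes
    processId: parseInt(hex.slice(14, 18), 16),   // next 2 bytes
    counter: parseInt(hex.slice(18, 24), 16),     // last 3 bytes
  };
}

const parts = decodeObjectId("507f1f77bcf86cd799439011");
console.log(parts.createdAt.toISOString()); // 2012-10-17T21:13:27.000Z
console.log(parts.machineId, parts.processId, parts.counter);
```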

What is the significance of ulimit settings in Linux environment?

In environments running Linux/UNIX-like operating systems, system resources including files, threads and network connections need to be properly regulated. It is essential to avoid over-usage of resources by restricting resource usage at user level.

This is possible using ulimit commands. Resources must be allocated appropriately: limits set too high can lead to slow system performance, while limits set too low might lead to issues when users connect to the environment.

How is an OS resource utilized in MongoDB environment?

MongoDB processes including mongod and mongos use threads and file descriptors to track connections and manage internal operations. Here are general recommendations for allocating system resources to MongoDB instances:

1) 1 file descriptor for each datafile in use by the mongod instance

2) 1 file descriptor for each journal file in use by the mongod instance. This applies while storage.journal.enabled is set to true

3) While a replica set configuration is in place, each mongod maintains a connection to all other members of the set to replicate data from the primary to the secondaries

Give details on Mongodb syncdelay an advanced tuning parameter :-

syncdelay, a parameter set by default to 60 seconds in MongoDB, should not be changed in most cases. Before we proceed further, let us first see the prime purpose of the syncdelay parameter.

syncdelay determines the time interval at which MongoDB flushes data to the datafiles. By default it is set to 60 seconds and MongoDB recommends not changing it. Flushing of dirty pages from memory to the datafiles on disk happens via the fsync() function, which is called at the interval set for syncdelay

Say, if syncdelay=90 then the fsync() function is called once every 90 seconds

It is interesting to note that fsync is also used to write-lock a MongoDB instance while performing a backup. If syncdelay is set to 0, MongoDB does not flush data from memory onto disk

Give details on Mongodb remove() function to purge collection :-

Most database developers and DBAs from a typical RDBMS background are familiar with DML - Insert, Update, Delete. The function to insert into a collection is db.collectionname.insert({}). With that in mind, if you try to purge an existing document from a collection using a delete() function, it will not work

> db

test

> db.mycollection.delete({hello:"world12"})

Sat Jan 02 12:26:40.474 TypeError: Property 'delete' of object test.mycollection is not a function

> db.mycollection.remove({hello:"world12"})

> db

test

> db.mycollection.find()

>

The above output shows that to purge a document from a collection we make use of the remove() function




What is mongoDB?

MongoDB is a NoSQL database that does not use the RDBMS concept. MongoDB uses the document database concept

What operations are permitted in mongodb?

CRUD - Create, read/select, update and delete operations are permitted in MongoDB

Can we make use of mongodb in heavy transactional environment?

It is not recommended. Mongodb is primarily used in web based environments

What is a table called in mongoDB?

Table in mongodb terms is a collection

Why is MongoDB not used for heavy transactional environment?

MongoDB, the NoSQL database that is rapidly gaining popularity, is a document based database. Many sectors, primarily healthcare, financial and retail, make use of MongoDB. Popular MongoDB case studies show that it is widely used in call centre customer feed systems.

So, what is primary use of mongoDB?

MongoDB is document oriented and does not follow the normalization concept of relational databases. MongoDB as such has no support for concurrency, the primary criterion that permits transactional consistency in RDBMS systems

Hence MongoDB is primarily used in web based systems. However, MongoDB does permit CRUD, the create, read/select, update and delete operations

What is row in table called in mongodb?

A row in mongodb is usually referred as document

Give details on MongoDB announces paid support on community edition :

MongoDB the most popular nosql database is gaining popularity and customer base. They are planning to go public real soon. They have recently changed their CEO who is an expert with IPO. This shows the vision of this company.

Currently MongoDB is an open-source software that can be downloaded for free. MongoDB Enterprise is catered towards enterprises

MongoDB has recently announced a bold move on supporting community edition on paid basis

For full details visit :http://www.mongodb.com/lp/contact/mongodb-rescue-2014

What is rowid oracle equivalent in mongodb?

In mongodb the oracle rowid equivalent is referred as _id

What is reason behind TypeError: Property 'findone' of object dbname.collection is not a function :

This error occurs when findone() is called in the mongo shell. The findOne() function is used to return the first matching document in a collection

db.collection.findOne() - correct format

db.collection.findone() - the call that causes the above error

Remember that the mongo shell is JavaScript based, and JavaScript is case sensitive
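
The same behaviour can be reproduced in plain JavaScript without a database. In this sketch the collection object is a hypothetical stand-in that defines only findOne, so the lower-case name is simply an undefined property, and calling it raises a TypeError.

```javascript
// Hypothetical stand-in for a collection object; only findOne is defined.
const collection = {
  findOne() {
    return { hello: "world" };
  },
};

console.log(typeof collection.findOne); // "function"
console.log(typeof collection.findone); // "undefined"

// Calling the misspelled name fails because undefined is not a function.
let errorKind = "none";
try {
  collection.findone();
} catch (err) {
  errorKind = err instanceof TypeError ? "TypeError" : "other";
}
console.log(errorKind); // "TypeError"
```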

Give MongoDB Replication A quick Look:

In MongoDB the client app talks to the MongoDB server using drivers in the client application. These drivers are aware of the replica set configuration in place and are prepared to fail over automatically

When the primary fails, a secondary becomes the new primary. In this scenario the client app drivers are smart enough to recognize the new primary and route connections to it.

Usually the failover is not instantaneous and involves a lag of tens of seconds before the failover occurs

Writes always go to the primary, and replication to the secondaries introduces lag. Reads can happen from secondaries as well, and client apps configured to read from them, such as reporting apps, are not aware of this lag. This depends on the Read Preference configuration

Is MapR MongoDB Connector certified officially?

MapR, a big data solution provider that ships its own distribution of Apache Hadoop, has partnered with MongoDB, the premier NoSQL database. The MongoDB connector is officially certified for use with MapR. This makes big data migration, data mining, user segmentation and similar business tasks easier

What is a NoSQL database? Can we use it without SQL?

NoSQL is the term coined for the new generation of non-relational databases. They can be used without SQL: MongoDB, for example, is queried through JavaScript-style methods and JSON-like query documents rather than SQL statements

What SQL concepts form basics of Mongo DB?

CRUD - Create, Read, Update, Delete are the basic concepts on which this database operates, just like any other database

What is needed to learn mongodb?

It is a new concept and needs no background as such. As the mongo shell is based on JavaScript, good knowledge of BSON, JSON and JavaScript helps

1. Give details on iterating cursor in mongoDB?

A cursor is obtained with the db.collection.find() command, which returns a cursor over the matching documents. By default the mongo shell prints the first 20 documents. To fetch and process or print more documents, iterate the cursor with the hasNext() and next() functions

2. When does a cursor get closed in mongoDB?

MongoDB server by default closes a cursor after 10 minutes of inactivity. A cursor is also closed when a client exhausts it

3. How to avoid the cursor inactivity and timeout behaviour?

This default behaviour can be overridden using the noTimeout flag

3. What are the options available in mongo shell that tweak cursor behaviour?

Mongo shell provides many options that tweak cursor behaviour. These are the cursor flags:

DBQuery.Option.tailable

DBQuery.Option.slaveOk

DBQuery.Option.oplogReplay

DBQuery.Option.noTimeout

DBQuery.Option.awaitData

DBQuery.Option.exhaust

DBQuery.Option.partial

4. How can we increment the _id sequence number?

Adopt one of the following two procedures :

1) Make use of counters() collection

2) Make use of user coded looping

5. What is the maximum size of mongoDB document?

A MongoDB document can grow up to 16MB, the maximum size of a BSON document. When creating a record MongoDB allocates some extra space that can be used in future. This is a proactive measure to leave room for a growing document. This is called record padding

6) What is the default primary key field in mongoDB?

By default every document in a collection comes with an _id field

7) What is CRUD in mongoDB?

Basic database operations - create, Read, Update, Delete

1) What command in mongodb tells us more details on mongoDB cursor?

Use the following db.runCommand() to get details on cursor

db.runCommand({cursorInfo:1})

2) What are the possible CRUD operations in mongodb?

As with any other databases mongodb offers Create, Update, Delete, Projection/retrieve/select options

3) When you specify 0 against a field what does that mean?

This is used in the projection document when we query documents in a collection: 0 excludes a field and 1 includes it

db.collection.find({},{field1:0});

The above command shows every field except field1 in the output. Apart from _id, inclusions and exclusions cannot be mixed in one projection

4) Is it possible to hide _id in projection output?

In MongoDB, whenever we issue a query to get details on documents in a collection, _id is included in the output by default

However, it is possible to hide it in the output as follows

db.collection.find({},{_id:0,field1:1});

5) What is the user interface to interact with mongoDB?

Mongo shell is the command line client to interact with mongoDB.

6) How does mongoDB process cursor?

MongoDB processes cursors in batch. The first batch size is 1MB. Subsequent batch sizes will be around 4MB

7) What are the advanced concepts that make mongoDB more scalable?

Following two interesting concepts make mongoDB scalable and make it a candidate to store Big Data:

1) Sharded clusters

2) Aggregation framework

8) What are data models supported by mongoDB?

MongoDB is a popular NoSQL database. Here are the data models supported by it

1) References are supported - This is a relational concept but still supported by MongoDB

2) Embedded documents - This is a denormalised form in which references are avoided. At higher level this improves query performance as it avoids joins

9) What is used to import json file onto mongoDB?

Mongoimport is the command line utility from MongoDB that helps to import collections into database. Store file in json format and perform import

What is the difference between compact and repairdatabase command in mongodb?

MongoDB stores data in datafiles. When a datafile is created, OS disk space is allocated to it. As information is stored, deleted and processed, the datafiles become fragmented and unused space remains inside them. To de-fragment the datafiles one of the following commands can be used

1) compact - This command compacts the datafiles, but unused space remains in the datafiles and is not released to the OS

2) repairDatabase - This command de-fragments the datafiles. In addition unused space is released to the operating system

What is the necessity behind mongodb log rotation?

MongoDB has the capability to rotate the current logfile by archiving it [saving it as a copy] and creating a new logfile.

How is log rotation achieved?

The mongod (or) mongos instance renames the current logfile by appending a timestamp to its name. The date is by default in ISODate format.

The old logfile is closed, a new logfile is opened, and all further entries are sent to the new logfile.
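
The renaming step can be sketched in plain JavaScript as follows. The helper rotatedLogName and the log path are hypothetical, and the exact suffix format (ISO date with the colons replaced by dashes) is an assumption made for this illustration rather than a statement of what the server emits.

```javascript
// Build the archived name for a logfile by appending a UTC timestamp.
function rotatedLogName(logPath, when) {
  // ISO 8601 timestamp without milliseconds; colons are replaced so the
  // result is safe to use as a file name on most filesystems.
  const stamp = when.toISOString().slice(0, 19).replace(/:/g, "-");
  return logPath + "." + stamp;
}

const archived = rotatedLogName(
  "/var/log/mongodb/mongod.log",
  new Date(Date.UTC(2014, 5, 1, 12, 30, 0)) // 1 June 2014, 12:30:00 UTC
);
console.log(archived); // /var/log/mongodb/mongod.log.2014-06-01T12-30-00
```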

When does a mongodb log rotation occur?

MongoDB log rotation happens under following circumstances:

1) Whenever a mongod (or) mongos process receives a SIGUSR1 signal from the operating system

2) In response to logrotate command

Can we take advantage of operating system syslog to log the mongod logs?

Yes. Mongod makes it possible to log the information onto syslog

Give details on gridfs in mongodb:-

MongoDB 2.6.1, the latest version of MongoDB at the time of writing, offers GridFS support to store large files.

MongoDB does support BSON documents. What is specific need of GridFS?

BSON documents have a size limitation of 16MB. Files larger than 16MB, which is the case with most video files, image files etc that get uploaded onto twitter, facebook, youtube on a daily basis, need to be stored as document chunks in GridFS instead of as a single document

What is basic architecture of GridFS?

It consists of two collections stored in fs namespace

1) fs.chunks

2) fs.files

chunks - stores the binary chunks of the file contents

files - stores metadata about each file

What is size of chunk in GridFs?

255KB is the default chunk size as of MongoDB 2.6.1 (it was 256KB before version 2.4.10)
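
How a file is split into chunks can be sketched with simple arithmetic in plain JavaScript. The helpers chunkCount and chunkRanges are hypothetical names for this example; the chunk size is passed in explicitly, and n is used for the chunk sequence number as in the fs.chunks collection.

```javascript
// Number of chunks needed to store a file of the given size.
function chunkCount(fileSizeBytes, chunkSizeBytes) {
  if (fileSizeBytes < 0 || chunkSizeBytes <= 0) {
    throw new RangeError("sizes must be positive");
  }
  return Math.ceil(fileSizeBytes / chunkSizeBytes);
}

// Byte ranges [start, end) covered by each chunk, numbered from n = 0.
function chunkRanges(fileSizeBytes, chunkSizeBytes) {
  const ranges = [];
  for (let n = 0; n < chunkCount(fileSizeBytes, chunkSizeBytes); n++) {
    const start = n * chunkSizeBytes;
    const end = Math.min(start + chunkSizeBytes, fileSizeBytes);
    ranges.push({ n: n, start: start, end: end });
  }
  return ranges;
}

const sixteenMB = 16 * 1024 * 1024;
console.log(chunkCount(sixteenMB, 256 * 1024)); // 64
console.log(chunkCount(sixteenMB, 255 * 1024)); // 65
console.log(chunkRanges(600, 256)); // last chunk is shorter than the others
```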


Give details on Use of Map reduce Framework in MongoDB:-

MongoDB the most popular NoSQL database supporting Big data offers many ways to aggregate data:

1) Aggregation Framework - This is the default. New in version 2.2. Helps us do aggregation easily

2) Map Reduce facility which is built-in - This is a programming model wherein the problem is expressed as a map step and a reduce step. It is used when the aggregation is not possible using the aggregation framework

3) Hadoop connector - This flavour is available as part of the Hadoop framework - Done using Java code


10) Give details on projection operators in mongodb:-

MongoDB has some inbuilt projection operators that are primarily used for data retrieval. Let us take a quick look at the MongoDB projection operators and their usage:

1) $ - When a query is issued on a collection, documents are retrieved. $ projects the first element in an array that matches the query condition

2) $elemMatch - This operator projects the first element in an array that matches the specified $elemMatch condition

3) $meta - This operator is used to project the document's score assigned during a $text operation

4) $slice - Limits the number of elements projected from an array. Supports skipping elements and limiting the number returned. Similar to a limit condition in RDBMS

10) How to separate tables from indexes at physical level in mongodb?

11) What is the memory recommendation of mongodb?

While installing and configuring mongodb databases in production it would be appropriate to follow best recommendation values. Here are the mongodb best recommendation during production deployment

Give details on mongodb index filters :-

MongoDB lets us create an index filter for a query shape. A query shape is a combination of query, sort and projection criteria. In simple words, an index filter determines which indexes may be used for a query. If an index filter is in place for a query it overrides hint(). Say we provide a hint as part of a query, but an index filter exists for that query shape. In this scenario the index filter is used and the hint is ignored.

Say, we try to run the query with hint after a server restart. Will hints be used? Yes. Reason being index filters don't exist after a server restart

I don't want a server restart. I do have index filters. What can I do?

Make use of planCacheClearFilters using db.runCommand({planCacheClearFilters:collectionname,...})

12) What are the database commands to monitor activity of mongodb?

13) What is flexibility of mongodb over rdbms?

14) How does MapR work with mongodb?

MapR, a big data solution provider that ships its own distribution of Apache Hadoop, has partnered with MongoDB, the premier NoSQL database. The MongoDB connector is officially certified for use with MapR. This makes big data migration, data mining, user segmentation and similar business tasks easier

15) In MongoDB how will you avoid pre-allocation lag during journaling ?

When journaling is enabled there is a pre-allocation lag while the journal files are created. Depending on the file system this could take minutes, during which no one can connect to the database. As a best practice this can be avoided by copying pre-allocated journal files from another instance of MongoDB

Here are some important points on the pre-allocated files

1) They do not contain data

2) These files can be removed later

3) If an instance is started with journaling enabled [the default in 2.6] these files are created once again by mongod

16) How does MongoDB store a javascript function on the server ?

Though it is not a recommended practice to store application code within the database, as there could be a performance impact, JavaScript functions can be stored inside MongoDB

MongoDB hosts a special system collection called system.js. This stores JavaScript functions.

Save the JavaScript function as follows

db.system.js.save({ _id:"functionname",value: function(a,b){return a*b;}});

_id - stores function name. This should be unique

value - stores function definition

Once stored this function can be used from any javascript context
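
The shape of such a stored document, and how its value is later looked up by name and called, can be mimicked in plain JavaScript. In this sketch a Map stands in for the system.js collection and the name multiply is hypothetical; it only illustrates the _id/value pairing, not the server-side mechanism.

```javascript
// Document in the same shape as the one saved to system.js above.
const storedFunction = {
  _id: "multiply", // function name, must be unique
  value: function (a, b) { return a * b; }, // function definition
};

// Local stand-in for the system.js collection, keyed by _id.
const registry = new Map();
registry.set(storedFunction._id, storedFunction.value);

// Look the function up by name and call it.
const product = registry.get("multiply")(6, 7);
console.log(product); // 42
```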

17) How to perform MongoDB Upgrade by replacing binaries?

MongoDB server software can be upgraded using OS specific package manager (or) by replacing software binary

1) Perform safe backup of mongoDB data before upgrade

2) Bring down mongodb instance

3) Download the latest binary onto temp location

4) Rename old binary

5) Copy new binary onto old binary location

6) Start the mongoDB instance

18) How does MongoDB record write operations in the journal?

MongoDB makes use of journal files, which are physical files stored on disk, to perform crash recovery after an unexpected shutdown of the mongod daemon. Journal files store the following information, all of which counts as a write operation in a MongoDB environment

1) Changes to collections, including updated and inserted documents

2) Changes to index structures

3) Changes made to the namespace files, including metadata changes. Say we have a database named lr: the data directory contains files of the form lr.0, lr.1 ... lr.n, journal [journal files] and lr.ns [namespace datafile]

4) Information on database changes, including creation and dropping of databases and their associated datafiles

How does MongoDB record a write operation in the journal file?

1) Whenever there is a write operation, MongoDB copies it to the private view, which is a storage view in physical RAM

2) MongoDB then writes these operations to the journal files in batches called group commits. The batching interval is tuned by setting the commitIntervalMs parameter. While a group commit happens, all writers are blocked

3) From the journal file, MongoDB applies the information to the shared view, which is the operating system's virtual memory view. This leaves the shared view in a state different from the datafiles. By default this state is retained for 60 seconds; under memory pressure the interval can be somewhat shorter. After this interval the information in the shared view is written to the datafiles on disk

4) At this point all the journal writes have been flushed

5) MongoDB keeps track of which write operations have been flushed to the datafiles. That information in the journal file is no longer necessary, so the journal file is recycled or deleted
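The five steps can be sketched as a small in-memory model in plain JavaScript. This is only an illustration under stated assumptions: createJournaledStore and its methods are hypothetical names, timers are replaced by explicit calls, and nothing here is the server's actual implementation.

```javascript
// Hypothetical in-memory model of the journaling flow; not a MongoDB API.
function createJournaledStore() {
  const state = {
    privateView: [], // writes accepted but not yet journaled (RAM only)
    journal: [],     // journaled writes not yet flushed to the datafiles
    sharedView: {},  // state after applying journaled writes
    dataFiles: {},   // state persisted on disk
  };

  return {
    state,
    // Step 1: a write lands in the private view.
    write(key, value) {
      state.privateView.push({ key, value });
    },
    // Steps 2-3: a group commit moves the batch to the journal,
    // then applies it to the shared view.
    groupCommit() {
      const batch = state.privateView.splice(0);
      state.journal.push(...batch);
      for (const { key, value } of batch) {
        state.sharedView[key] = value;
      }
      return batch.length;
    },
    // Steps 4-5: flush the shared view to the datafiles and recycle the journal.
    flush() {
      Object.assign(state.dataFiles, state.sharedView);
      state.journal.length = 0;
    },
    // Crash: everything held only in memory is lost, then the journal is replayed.
    crashAndRecover() {
      state.privateView.length = 0;
      state.sharedView = { ...state.dataFiles };
      for (const { key, value } of state.journal) {
        state.dataFiles[key] = value;
        state.sharedView[key] = value;
      }
      state.journal.length = 0;
    },
  };
}

const store = createJournaledStore();
store.write("a", 1);
store.groupCommit();     // "a" is now in the journal
store.write("b", 2);     // "b" exists only in the private view
store.crashAndRecover(); // "a" is replayed from the journal, "b" is lost
console.log(store.state.dataFiles);
```

The write that never reached a group commit is the one that disappears, which is why the commit interval bounds how much can be lost in a crash.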

19) MongoDB Deployment Planning Checklist:-

MongoDB deployments are still new to most organizations. A deployment goes better with proper planning, so it is best to come up with a checklist before deploying MongoDB

1) MongoDB schema is dynamic in nature. Make sure this kind of schema is appropriate for your business

2) The application needs to be properly optimized. In a database application, optimization starts with indexes. MongoDB offers different types of indexes

3) Sharding, a key feature of MongoDB

Give details on MongoDB Shell Methods :-

MongoDB is a document-based database originally written in C++. The commands used to interact with it in the mongo shell are JavaScript based. Mongo shell methods are the set of JavaScript commands that can be used in the mongo shell. Applications that use MongoDB call equivalent methods through a driver for the language the application is written in

A few interesting mongo shell methods and their usage are detailed below

1) db.collection.count() - Equivalent to SELECT COUNT(*) FROM table; in an RDBMS. It returns the number of documents in a collection [the RDBMS equivalent of a collection is a table]

2) db.collection.find() - Searches for and returns documents in a collection

3) db.collection.findOne() - Returns a single document

4) db.collection.insert() - Creates a new document in a collection

5) db.collection.remove() - Deletes documents from a collection

6) db.collection.reIndex() - Index rebuild operation. The _id index is always rebuilt in the foreground. Other indexes are rebuilt in the background if they were originally created with the background option. The operation affects only a single mongod instance: in a replica set it is not propagated from the primary to the secondaries

2) What is a parent references pattern in MongoDB?

MongoDB is known for its flexible schema design. As a part of data modeling it is worth learning about the parent references pattern.

What is a parent references pattern?

Put simply, it determines the way tree-structured information is stored in a collection. In this pattern

1) Each node in the tree is stored in its own document. This involves some denormalization of data

2) Each document holds information on its parent

Consider the following case - lr Training is the parent of MongoDB Training and PMP Training

In a typical parent references pattern we create three documents

db.lr.insert({_id:"MongoDB Training",parent:"lr Training"})

db.lr.insert({_id:"PMP Training",parent:"lr Training"})

db.lr.insert({_id:"lr Training",parent: null})

If we want to create an index on the parent field to improve performance, we can do so

db.lr.ensureIndex({parent:1});

To retrieve the parent of a node, say MongoDB Training, is simple

db.lr.findOne( { _id: "MongoDB Training" } ).parent

We can also find the immediate child nodes of a parent

db.lr.find( { parent: "lr Training" } )
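The same lookups can be tried without a server. The sketch below keeps the three documents in a plain JavaScript array; parentOf, childrenOf and ancestorsOf are hypothetical helper names, and here each parent value is assumed to be the _id of the parent document.

```javascript
// In-memory stand-in for the lr collection; not a MongoDB API.
const lr = [
  { _id: "MongoDB Training", parent: "lr Training" },
  { _id: "PMP Training", parent: "lr Training" },
  { _id: "lr Training", parent: null },
];

// Parent of a node: look the node up by _id and read its parent field.
function parentOf(collection, id) {
  const doc = collection.find((d) => d._id === id);
  return doc ? doc.parent : undefined;
}

// Immediate children: every document whose parent field names the node.
function childrenOf(collection, id) {
  return collection.filter((d) => d.parent === id).map((d) => d._id);
}

// Ancestors: follow parent links upward until the root (parent === null).
function ancestorsOf(collection, id) {
  const path = [];
  let current = parentOf(collection, id);
  while (current) {
    path.push(current);
    current = parentOf(collection, current);
  }
  return path;
}

console.log(parentOf(lr, "MongoDB Training"));
console.log(childrenOf(lr, "lr Training"));
console.log(ancestorsOf(lr, "PMP Training"));
```

Finding a parent or the immediate children is a single lookup, while ancestorsOf needs one lookup per level of the tree, which is the usual trade-off of this pattern.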

3) Give details on schema differences, NoSQL vs RDBMS :-

NoSQL, the new generation of databases, is gaining popularity for storing and processing big data. Traditional data stores have been relational database management systems. Here is a quick overview of the difference in schema design between RDBMS and NoSQL databases

1) RDBMS schema design - Fixed schema. Once a table is created with columns of specified datatypes, information must be stored in those columns using the designated datatypes

2) NoSQL schema design - Dynamic schema. There is no need for every field to contain data. Also, the type of data stored in a field can vary from document to document
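To make the contrast concrete, here is a small plain JavaScript sketch. The fixedSchema object and the fitsFixedSchema helper are hypothetical and simply model the fixed-schema rule as described in point 1; they are not part of any database API.

```javascript
// Hypothetical fixed schema: every column must be present with its designated type.
const fixedSchema = { name: "string", age: "number" };

// Returns true only when the row has exactly the schema's columns and types.
function fitsFixedSchema(row, schema) {
  const columns = Object.keys(schema);
  return (
    Object.keys(row).length === columns.length &&
    columns.every((c) => typeof row[c] === schema[c])
  );
}

// All three documents could sit in the same collection under a dynamic schema.
const people = [
  { name: "Ann", age: 30 },        // matches the fixed schema
  { name: "Bob" },                 // age field missing
  { name: "Cy", age: "unknown" },  // age stored as a string
];

const fits = people.map((doc) => fitsFixedSchema(doc, fixedSchema));
console.log(fits);
```

Only the first document would pass the fixed-schema check, yet a document store accepts all three as they are.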

4) What is the need for the MongoDB 2.6 write operation protocol enhancement?

MongoDB modifies document information via write operations. These include the usual operations such as insert, update, delete and bulk insert. In versions before MongoDB 2.6, a write operation did not itself report whether it succeeded; the client learned the status by issuing a separate getLastError call, which also carried the write concern. Starting with MongoDB 2.6, write concern is integrated with the write operation as part of the enhanced write protocol, and the write returns its own status. This eliminates the need for clients connecting to the MongoDB server to call functions like db.getLastError()

5) Does MongoDB support big data?

MongoDB offers plenty of features and options to support big data, high-throughput applications, data availability and so on. Sharding is the MongoDB feature that supports scaling of big data across machines; high availability comes from replica sets.

6) What is sharding?

Before talking about sharding, let's talk about two types of scalability

Vertical scalability - Adding more RAM, CPU and disk space to cater to growing needs

Horizontal scalability - Using more than one physical machine, distributing data across the machines and clubbing them together as a single logical database

MongoDB offers horizontal scalability through its sharding feature.

7) What components make up a sharded cluster?

In the real world a shard runs on a machine (in production, usually a replica set of machines). Many physical machines are clustered together as a single sharded cluster.

1) Query router - This is the mongos instance, which interacts with the client and routes each query to the appropriate shards

2) Config servers - In a typical production environment there are 3 config servers. They store metadata on the cluster configuration

3) Shards - The machines storing the logically partitioned information

8) What is the key behind partitioning in shard?

The shard key determines how the documents of a collection are partitioned across shards
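A minimal sketch of range-based partitioning by shard key, in plain JavaScript. The chunk table, the shard names and routeByShardKey are hypothetical; they only illustrate the kind of lookup a query router performs against the metadata held by the config servers, not how mongos is actually implemented.

```javascript
// Hypothetical chunk table: each chunk owns shard-key values in [min, max).
const chunks = [
  { min: -Infinity, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard1" },
  { min: 200, max: Infinity, shard: "shard2" },
];

// Picks the shard whose chunk range contains the document's shard-key value.
function routeByShardKey(doc, shardKeyField, chunkTable) {
  const value = doc[shardKeyField];
  const chunk = chunkTable.find((c) => value >= c.min && value < c.max);
  if (!chunk) {
    throw new Error("no chunk covers shard key value " + value);
  }
  return chunk.shard;
}

const placements = [{ userId: 42 }, { userId: 150 }, { userId: 999 }].map(
  (doc) => routeByShardKey(doc, "userId", chunks)
);
console.log(placements);
```

Because placement depends only on the shard-key value, a query that includes the shard key can be sent to a single shard, while one without it has to be sent to all of them.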

9) Give details on MongoDB commands?

As with any other database, MongoDB stores, retrieves and manipulates information. This is accomplished using CRUD operations - create, read, update and delete. In addition to these, MongoDB offers advanced features in the form of MongoDB commands

To implement functionality that falls outside the scope of CRUD, MongoDB makes use of commands. They assist with many different operations on MongoDB databases and collections

Some popular MongoDB commands include :

1) getLastError

2) db.runCommand({"drop":"collectionname"}); - This is a good example of the drop command

MongoDB supports many such commands

10) How do I get the list of commands from MongoDB?

To see the list of commands supported by a particular version of MongoDB run :

db.listCommands()

11) Why is MongoDB the database of 21st century?

Relational databases, the most popular of which is Oracle, date back to the 1970s and were designed around the data and information storage needs of that era. That design went hand in hand with the traditional waterfall project management methodology and with Enterprise Architecture 1.0

In the 21st century, however, Enterprise 2.0 has evolved into a totally different arena in which the internet is prominent in the day-to-day life of almost everybody. This has led to an explosion of data growth. To illustrate, let's take a quick look at the amount of information stored and processed by two of the most popular websites in the world

1) Facebook on average stores and processes 500 TB of data every day

2) Google, the big data master, processes about 20 petabytes of data every day

To handle such large amounts of information, which comes in both structured and unstructured formats, relational databases modeled in the 1970s may not be adequate. Hence there is a compelling need to redesign the database with current and future trends in mind

This has led to the evolution of more sophisticated, non-relational NoSQL databases. Though there are plenty of NoSQL databases in the market trying to catch the momentum, only a handful of them are really popular, owing to their robust architecture, cost-effectiveness and resource availability. Among them the most popular happens to be MongoDB

A variety of sectors including healthcare, finance and aerospace have adopted this database owing to its cost-effectiveness, speed and agility. MongoDB is an agile database that meets the needs of the 21st century's growing internet space

12) Give an overview of MongoDB journaling :-

MongoDB stores information as data in datafiles. The journaling mechanism protects writes against loss in a crash. Journal files are physical files stored on disk

1) Data changed in RAM is written to the journal file on disk

2) Information from the journal file is applied to the data files

3) If the mongod daemon crashes, the information is retained in the journal file

4) After a crash, the information is applied from the journal file

5) By default up to 100 ms worth of writes can be lost, as 100 ms is the default commitIntervalMs. This setting determines the interval between writes from memory to the journal files. It can be changed

6) Without a journal in place, when a mongod crashes we need to perform a repair or resync from a clean member of the replica set, if one is in place

7) From MongoDB 2.0 onwards journaling is enabled by default in 64-bit environments

Give details on MongoDB process structure :-

mongod - mongod is the process that runs on a server and manages MongoDB-related activities. mongod is a server process; it can also be forked to run as a daemon. By default the mongod process listens on port 27017, and a different port can be specified using the --port option. The default location of the datafiles is the /data/db path, which can be changed using the --dbpath option.

mongos - short for "MongoDB Shard", mongos is the routing service in a sharded MongoDB environment.

How to create the MongoDB Enterprise service manually in a Windows environment?

MongoDB Enterprise can be installed on Windows Server. This requires a minimum of Windows Server 2008 R2. After installing MongoDB Enterprise, the service has to be created manually in the Windows environment using the Windows utility sc.

sc.exe is a tool that ships with Windows and can be used to create and start a Windows service.

What is the use of the MongoDB mongostat utility?

mongostat, the monitoring tool from MongoDB, can be considered equivalent to the vmstat tool that a Linux OS provides. It provides status and run-time information on processes such as mongod, the typical MongoDB instance, and mongos, the MongoDB routing instance for sharded clusters. It is a reporting tool that provides information on system performance and activity

So, what does mongostat typically do?

The mongostat utility mainly captures and returns the counts of database operations by type. Typical database operations include insert, query, update and delete. These counts indicate the load on the server and the way the load is distributed, which helps with capacity planning of a MongoDB database. mongostat also captures and returns the percentage of time the database lock is held

mongostat can be used with both mongod and mongos instances

Why did MongoDB acquire WiredTiger?

On Tuesday, December 16, 2014, MongoDB announced its acquisition of WiredTiger, the storage engine company whose technology helps applications with write-intensive workloads perform well.

Starting with MongoDB 2.8 (the release later renamed 3.0), WiredTiger comes as a native storage engine with MongoDB

Give details on what Vormetric encryption is and how this can be used with MongoDB?

Enterprise security is gaining importance day by day. With the advent of standards like HIPAA and PCI DSS, security of data at every level becomes important. The cost of losing data is much higher than the cost of securing it. Addressing such needs is now possible using enterprise-grade security solutions. One such solution is Vormetric Transparent Encryption. It meets enterprise security, contractual and compliance requirements by applying encryption and access policies. It is used with databases, big data and Platform as a Service (PaaS) to secure data at rest on disk.

One interesting application is MongoDB, where document- and field-level encryption can be provided. Vormetric Application Encryption is a library that simplifies integrating application-level encryption into existing corporate applications. The Vormetric Data Security Platform can be used with MongoDB to provide OS, file and application level encryption

What causes a TypeError while using the shutdown command in MongoDB and how to fix this?

While calling shutdownserver() after connecting to the admin database we found the following error:

use admin

> db.shutdownserver();

2014-12-24T21:30:47.593-0500 TypeError: Property 'shutdownserver' of object admin is not a function

The fix is really simple: the method name is shutdownServer, with a capital S

use admin;

db.shutdownServer();

The above command properly closed all the files, removed the file locks [fs lock] and exited the process gracefully

Function names are case sensitive in a MongoDB environment. We can also make use of the mongod --shutdown command to perform the same operation
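The case sensitivity comes from JavaScript itself, so it can be reproduced without a server. In the sketch below fakeDb is a hypothetical stand-in for the shell's db object and tryCall is an illustrative helper; neither is part of MongoDB.

```javascript
// Hypothetical stand-in for the shell's db object; only the method name matters.
const fakeDb = {
  shutdownServer() {
    return "server shut down";
  },
};

// Calls a method by name and reports the error type if the call fails.
function tryCall(obj, methodName) {
  try {
    return obj[methodName]();
  } catch (err) {
    return err.name;
  }
}

const wrongCase = tryCall(fakeDb, "shutdownserver"); // lower-case s: property is undefined
const rightCase = tryCall(fakeDb, "shutdownServer"); // exact name: the method runs
console.log(wrongCase, rightCase);
```

Looking up the misspelled name yields undefined, and calling undefined is what raises the TypeError.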

What are the three installation setup types in MongoDB?

While installing MongoDB using the installer we can choose one of the following three setup types

1) Typical - Includes the most common features

2) Custom - Users can choose the program features to be installed and the location in which they are installed. Components include server, client, monitoring tools, import/export tools, router and miscellaneous tools. Miscellaneous tools include mongodump/mongorestore, mongoexport and mongoimport. On average the features consume close to 300 MB of hard disk space

3) Complete - This is resource intensive and installs all program features

What is the unique advantage of the custom setup type?

We can choose the location as well as the components to be included in the installation

What makes SlamData so unique with MongoDB?

When business analytics (BI) tools like Pentaho are integrated with JSON-based, document-oriented NoSQL databases such as MongoDB, the MongoDB structure is first transformed into equivalent relational tables and the analysis is done on those.

Tools similar to Pentaho, including Apache Drill, involve an ETL [Extract, Transform, Load] process. These steps make analysis time consuming and complex. The major reason ETL could not be bypassed is that these tools were originally built for RDBMS, so they transform NoSQL data into relational form before analytics.

SlamData, which is built on SlamEngine, is an open-source tool. It is interesting to note that SlamData makes it possible to speak directly to MongoDB. As it is an open-source tool that can talk directly to JSON-based databases, a SlamData BI solution for the Cassandra database may follow.

In a real-world scenario, big data drives healthcare analytics. Put simply, healthcare organizations are looking for a way to pull legacy information from patients' health records. HIE (health information exchange) and the effort to make healthcare systems interoperable make it possible to pull information about patients from all participating hospitals. Once the data grows big it needs to be stored, pulled and processed to predict the health of the patient based on past facts. Say the record shows that the patient is a diabetic. If blood samples drawn from the patient for a different purpose then show elevated blood sugar levels, this indicator will help the physician choose the correct treatment.

What is the MongoDB upsert() equivalent of MERGE in Oracle database?

Oracle database offers the MERGE statement, which performs insert and update operations in a single statement. Similarly, starting with version 2.6, the upsert() method of the MongoDB Bulk API makes it possible to perform the following

Consider the following commands, where bulk is a bulk operation object for the collection:

var bulk = db.collection.initializeUnorderedBulkOp();

bulk.find(query criteria).upsert().update(update document);

bulk.find(query criteria).upsert().updateOne(update document);

bulk.find(query criteria).upsert().replaceOne(replacement document);

bulk.execute();

When a document matching the query criteria is found, it is updated or replaced. If no matching document is found, a new document is inserted. This works the same way as Oracle MERGE
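The update-or-insert behaviour can be sketched in plain JavaScript against an in-memory array. upsertOne is a hypothetical helper that supports only simple equality matching; it models the semantics and is not a MongoDB API.

```javascript
// Hypothetical in-memory upsert; models the semantics only, not a MongoDB API.
function upsertOne(collection, query, fields) {
  // A document matches when every field in the query has an equal value.
  const matches = (doc) => Object.keys(query).every((k) => doc[k] === query[k]);
  const existing = collection.find(matches);
  if (existing) {
    Object.assign(existing, fields); // match found: update in place
    return "updated";
  }
  collection.push({ ...query, ...fields }); // no match: insert a new document
  return "inserted";
}

const courses = [{ name: "MongoDB Training", seats: 10 }];
const firstResult = upsertOne(courses, { name: "MongoDB Training" }, { seats: 12 });
const secondResult = upsertOne(courses, { name: "PMP Training" }, { seats: 5 });
console.log(firstResult, secondResult, courses);
```

The inserted document is built from the query fields plus the new fields, so a later upsert with the same query finds it and updates it rather than inserting a duplicate.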

What is the way to pull protected fields using upsert?

It is possible as follows

{upsert:true,

select:{"protectedfield":1} }

From MongoDB version 2.7.7 onwards, explain() can be used to determine whether upsert() uses the correct indexes

Give details on control scripts in MongoDB:-

Control scripts are the scripts that get created in the /etc/init.d (or) /etc/rc.d directory. They are typically used to start services and daemons during system startup. MongoDB mainly consists of two daemons/processes, namely mongod and mongos

mongod - This is the main MongoDB daemon that runs the MongoDB server

mongos - This is the MongoDB routing daemon for sharded clusters

Until MongoDB version 2.6.2 there were control scripts for both the mongod and mongos daemons.

Starting with 2.6.3 there is a control script only for mongod. If needed, a mongos control script can be created and added to the system

How and when to use indexes in MongoDB?

As with relational databases, MongoDB supports indexes as part of application design. Here are some tips and practices that can be taken as a strategy when deciding which indexes to create

1) Look at the size of the indexes. To see the current size, issue db.collection.totalIndexSize(), which returns the total index size for a collection. For efficient access, indexes must reside in RAM. In some cases an index only needs to keep its most recent or most frequently accessed entries in RAM rather than the whole index. This conserves the RAM needed to store working sets

2) For covered queries, indexing is the most efficient way to go. A query is considered covered only if all the fields in the query, and all the fields it returns, are part of the same index. In such cases only the index is used to return information and document access is skipped entirely. However, there is a limitation on fields that contain arrays: an index cannot cover a query on such fields

3) Frequency at which a query is used in an application - If a query uses a field on a frequent basis, that field is a good candidate for an index. If the query clause involves the $or operator, multiple indexes can be used; otherwise only one index is used. If more than one field is used in the query, use a compound index

4) Indexes help with returning sorted results
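The covered-query rule from point 2 can be written as a small check in plain JavaScript. isCoveredQuery is a hypothetical helper, and the field lists are illustrative; the caller is assumed to pass every field the query actually returns (so _id must be listed unless the query excludes it).

```javascript
// Hypothetical helper: decides whether a single index can cover a query.
// indexFields: fields in the index
// queryFields: fields used in the query criteria
// returnedFields: fields the query returns
function isCoveredQuery(indexFields, queryFields, returnedFields) {
  const inIndex = (field) => indexFields.includes(field);
  return queryFields.every(inIndex) && returnedFields.every(inIndex);
}

const indexOnStatusUser = ["status", "user"];

// Filters on status and returns only user: the index alone can answer it.
const covered = isCoveredQuery(indexOnStatusUser, ["status"], ["user"]);

// Also returns age, which is not in the index: documents must be read.
const notCovered = isCoveredQuery(indexOnStatusUser, ["status"], ["user", "age"]);

console.log(covered, notCovered);
```

The check ignores the array-field limitation mentioned above, so it should be read as the field-membership part of the rule only.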

Give details on MongoDB client operation termination:-

From the mongo shell it is possible to terminate a running MongoDB client operation. Use db.killOp() to terminate a running client operation

1) Find the current operation id as follows

db.currentOp()

2) Terminate this operation as follows

db.killOp(opid)

This should be handled with care: make sure you are terminating the correct operation

With MongoDB it is also possible to use maxTimeMS, which limits the processing time of operations on a cursor. If the time limit set by maxTimeMS is exceeded, error code 50 is returned. This is safe, as MongoDB terminates the operation at the earliest interrupt point, the point at which an operation can be safely aborted

Give details on integration of MongoDB with monitoring tools:-

MongoDB monitoring involves tools that come as an integral part of the MongoDB database to monitor the activity, state and status of the different operations happening in MongoDB. These tools can be used from the mongo shell command line as well as from a browser. MongoDB can also be integrated with external monitoring tools such as Munin, Cacti and Nagios

How does Narrative Science use MongoDB for Quill?

Narrative Science, the firm that owns the artificial intelligence based software Quill, provides narrative reports based on the data stored in a database. To support big data, Narrative Science has been using MongoDB since 2010. Based on an article from Narrative Science on the official MongoDB blog, here are some interesting features of MongoDB in use

1) TTL indexes - This is an interesting MongoDB feature. A TTL index is created using the normal db.collection.createIndex() shell method with an additional expireAfterSeconds parameter. Once created, a background job purges expired records once every 60 seconds. Such indexes are usually used to expire and remove documents from collections holding logs, user sessions etc. Quill makes use of TTL indexes

2) Big data storage - As per Narrative Science's information, Quill processes over 5 TB of data stored in MongoDB

3) Schema flexibility - It is possible to make changes to collections on the fly. This is advantageous for Narrative Science as they can tweak the collection structure depending on the client

4) Nested documents - These are used by the natural language generation (NLG) tool Quill
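The expiry rule behind the TTL indexes in point 1 can be sketched in plain JavaScript. purgeExpired is a hypothetical function modeling one pass of the background purge over an in-memory array; the field name createdAt and the 300 second lifetime are illustrative, and this is not the server's implementation.

```javascript
// Hypothetical model of one TTL purge pass; keeps only unexpired documents.
// A document expires once its date field is expireAfterSeconds or more in the past.
function purgeExpired(collection, dateField, expireAfterSeconds, now) {
  const cutoff = now.getTime() - expireAfterSeconds * 1000;
  return collection.filter((doc) => doc[dateField].getTime() > cutoff);
}

const purgeTime = new Date("2015-01-01T00:10:00Z");
const sessions = [
  { user: "a", createdAt: new Date("2015-01-01T00:00:00Z") }, // 600 seconds old
  { user: "b", createdAt: new Date("2015-01-01T00:09:30Z") }, // 30 seconds old
];

// With a 300 second lifetime, only the recent session survives the pass.
const remaining = purgeExpired(sessions, "createdAt", 300, purgeTime);
console.log(remaining.map((doc) => doc.user));
```

Since the real purge runs periodically rather than continuously, an expired document can linger in the collection until the next pass.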


What causes the SyntaxError: Unexpected token { error in MongoDB and how to fix it?

MongoDB documents are modeled on JSON (JavaScript Object Notation) and stored in BSON, the binary JSON format.

As a first step with MongoDB, it is necessary to practice commands, which are written in JavaScript rather than as SQL queries, and this error is very common

> db.mycollection.insert{{hello:"lr.com"}}

Jan 03 SyntaxError: Unexpected token {

In case of above error, check if we are in correct database

> db

test

This is a typical syntax error: the argument to insert is wrapped in braces instead of parentheses. We can simply fix this as follows

> db.mycollection.insert({hello:"lr.com"})
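Because the shell parses input as JavaScript, the two statements can be compared without a server. In the sketch below syntaxCheck is a hypothetical helper that uses the standard Function constructor to parse a statement without running it, so no db object is needed.

```javascript
// Hypothetical helper: parses a statement without executing it.
function syntaxCheck(source) {
  try {
    new Function(source); // throws SyntaxError if the source cannot be parsed
    return "ok";
  } catch (err) {
    return err.name;
  }
}

const brokenStatement = syntaxCheck('db.mycollection.insert{{hello:"lr.com"}}');
const fixedStatement = syntaxCheck('db.mycollection.insert({hello:"lr.com"})');
console.log(brokenStatement, fixedStatement);
```

The first statement fails at parse time, before any database is involved, which is why checking the current database does not help with this error.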