Lecture Notes of Day 22: MongoDB Replication
Objective:
This lesson teaches how MongoDB replication works and how to set up replica sets in MongoDB. By the end of the session, students should be able to configure and manage replica sets to ensure high availability and data redundancy in MongoDB.
Outcome:
Students will learn how MongoDB replication works and how to set up replica sets, which will allow them to achieve high availability and fault tolerance in their MongoDB deployments.
What is MongoDB Replication?
MongoDB replication is the process of synchronizing data across multiple servers, or nodes, to ensure data availability and durability. Each node keeps its own copy of the data, and the group of nodes that maintain the same data set is called a replica set.
A replica set in MongoDB consists of two or more nodes that hold synchronized copies of the same data. At least three members are recommended, so that a majority can still elect a primary when one node fails. This allows MongoDB to continue operating even if some nodes fail. Replica sets provide redundancy and help with failover (automatic switching to a backup in case of primary node failure), ensuring high availability for your database.
Key Features of MongoDB Replication:
- Primary-Secondary Architecture: A replica set contains one primary node and multiple secondary nodes. The primary node receives all write operations, and the secondary nodes replicate data from the primary node.
- Automatic Failover: If the primary node becomes unavailable, one of the secondary nodes is automatically promoted to become the new primary.
- Data Redundancy: Multiple copies of the same data ensure that if one node fails, the data can still be accessed from other nodes.
How Does MongoDB Replication Work?
Replication in MongoDB works by having one primary node that handles all the write operations. The secondary nodes replicate the primary's data and do not accept writes. By default, reads also go to the primary; secondaries serve reads only when the client specifies a read preference that allows it.
The Replication Process:
1. Primary Node:
   o The primary node is the node where all the write operations happen.
   o All data changes made on the primary node are recorded in an operation log (also called the oplog).
   o The oplog is what keeps the secondary nodes up to date: each secondary copies new oplog entries from the primary asynchronously.
2. Secondary Nodes:
   o The secondary nodes replicate the oplog from the primary node.
   o They apply the operations in the oplog to their own data set, making them replicas of the primary node (possibly lagging slightly behind it).
   o If the primary node fails, one of the secondaries can be promoted to primary to ensure continuity of operations.
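The replication process described above can be illustrated with a small toy model in plain JavaScript. This is only a sketch of the idea, not MongoDB's actual oplog format: the names createNode, writeToPrimary, syncFrom, and applyEntry, and the simple counter used as a timestamp, are all assumptions made for illustration.

```javascript
// Toy model of oplog-based replication (not MongoDB's real data structures).
// Each oplog entry records one change to a document keyed by _id.
function applyEntry(data, entry) {
  if (entry.op === "insert" || entry.op === "update") {
    // Merge the fields of the entry into the stored document.
    data[entry.doc._id] = { ...data[entry.doc._id], ...entry.doc };
  } else if (entry.op === "delete") {
    delete data[entry.doc._id];
  }
}

function createNode() {
  return { data: {}, oplog: [] };
}

// A write on the primary changes its data and appends an entry to its oplog.
function writeToPrimary(primary, op, doc) {
  const entry = { ts: primary.oplog.length + 1, op, doc };
  applyEntry(primary.data, entry);
  primary.oplog.push(entry);
}

// A secondary pulls every entry newer than the last one it has applied.
function syncFrom(secondary, source) {
  const lastTs = secondary.oplog.length
    ? secondary.oplog[secondary.oplog.length - 1].ts
    : 0;
  for (const entry of source.oplog) {
    if (entry.ts > lastTs) {
      applyEntry(secondary.data, entry);
      secondary.oplog.push(entry);
    }
  }
}

const primary = createNode();
const secondary = createNode();
writeToPrimary(primary, "insert", { _id: 1, name: "John", age: 30 });
writeToPrimary(primary, "update", { _id: 1, age: 31 });
writeToPrimary(primary, "insert", { _id: 2, name: "Alice", age: 25 });
writeToPrimary(primary, "delete", { _id: 2 });

syncFrom(secondary, primary);
console.log(JSON.stringify(secondary.data));
// prints {"1":{"_id":1,"name":"John","age":31}}
```

Because the secondary remembers the last entry it applied, calling syncFrom again copies nothing new, which mirrors how a secondary only fetches the part of the oplog it has not seen yet.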
Replica Set Configuration
To set up replication in MongoDB, you need to configure a replica set. A replica set consists of:
- One primary node: This is where all write operations happen.
- One or more secondary nodes: These nodes replicate data from the primary node.
Steps to Set Up a Replica Set:
1. Install MongoDB on Multiple Servers: You need at least three MongoDB instances running on different machines (or different ports on the same machine if you're working in a local environment).
   Example of setting up MongoDB on three nodes:
   o Node 1: mongodb://localhost:27017
   o Node 2: mongodb://localhost:27018
   o Node 3: mongodb://localhost:27019
2. Start MongoDB Instances with Replica Set Configuration: On each of the MongoDB instances, start the mongod process with the --replSet option to specify the replica set name. When the instances share one machine, each one also needs its own port and its own data directory, and the directory must exist before mongod starts.
   Example commands to start MongoDB with replica set configuration (one per terminal):
       mongod --port 27017 --dbpath /data/db1 --replSet rs0
       mongod --port 27018 --dbpath /data/db2 --replSet rs0
       mongod --port 27019 --dbpath /data/db3 --replSet rs0
   In the above commands, rs0 is the name of the replica set.
3. Initiate the Replica Set: Once the MongoDB instances are started, connect to one of the nodes (usually the first one) using the MongoDB shell, mongosh (the shell was named mongo in releases before MongoDB 6.0):
       mongosh --port 27017
   Initiate the replica set using the rs.initiate() command:
       rs.initiate()
   This will initiate the replica set with the current node as its only member, which then becomes the primary node.
4. Add Secondary Nodes to the Replica Set: After initiating the replica set, you need to add the other nodes (secondary nodes) to the replica set. This can be done with the rs.add() command:
       rs.add("localhost:27018")
       rs.add("localhost:27019")
   This will add the secondary nodes to the replica set, and they will begin replicating data from the primary node.
5. Check Replica Set Status: To check the status of the replica set, you can run:
       rs.status()
   This will show you the status of all the nodes in the replica set, including whether they are primary or secondary.
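As an alternative to calling rs.initiate() and then rs.add() for each node, rs.initiate() also accepts a configuration document that lists all members up front. The sketch below builds such a document, plus a connection string that names the replica set, for the three local nodes used in these notes. The helper names buildReplicaSetConfig and buildConnectionUri are hypothetical and exist only for this example.

```javascript
// Build a replica set configuration document of the shape rs.initiate() accepts:
// the set name goes in _id, and each member gets a numeric _id and a host.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

// Build a connection string that lists every member and names the replica set,
// so a client can find the current primary even after a failover.
function buildConnectionUri(setName, hosts) {
  return `mongodb://${hosts.join(",")}/?replicaSet=${encodeURIComponent(setName)}`;
}

const hosts = ["localhost:27017", "localhost:27018", "localhost:27019"];
const config = buildReplicaSetConfig("rs0", hosts);
const uri = buildConnectionUri("rs0", hosts);

console.log(JSON.stringify(config, null, 2));
console.log(uri);
// prints mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=rs0
```

In the shell, the resulting object could be passed as rs.initiate(config) on the first node, in place of steps 3 and 4.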
Configuring Read and Write Operations
MongoDB allows you to configure read and write operations in a replica set to meet your application’s needs.
1. Write Concern: Write concern specifies the level of acknowledgment requested from MongoDB for write operations. It can be set to:
   o w: 1: Acknowledgment from the primary node only.
   o w: "majority": Acknowledgment from the majority of nodes in the replica set (higher durability).
2. Read Concern: Read concern defines the consistency and isolation level for reading data. Options include:
   o local: Returns the most recent data available on the node being queried, with no guarantee that the data has been written to a majority of nodes (so it could later be rolled back).
   o majority: Returns only data that has been acknowledged by the majority of nodes.
3. Read Preferences: You can configure read preferences to determine where MongoDB reads data from in a replica set. The options include:
   o primary: Always read from the primary node (the default).
   o secondary: Always read from a secondary node.
   o nearest: Read from the node with the least network latency, whether primary or secondary.
   The modes primaryPreferred and secondaryPreferred fall back to the other kind of node when the preferred one is unavailable.
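To make the difference between w: 1 and w: "majority", and between the local and majority read concerns, more concrete, here is a minimal model in plain JavaScript. It assumes every member is a voting, data-bearing node and tracks only how many operations each one has applied; the field lastApplied and all function names are invented for this sketch and are not part of MongoDB.

```javascript
// Number of members that form a majority of an n-member set.
function majorityOf(n) {
  return Math.floor(n / 2) + 1;
}

// Highest operation number that a majority of members has already applied.
function majorityCommitPoint(members) {
  const sorted = members.map((m) => m.lastApplied).sort((a, b) => b - a);
  return sorted[majorityOf(members.length) - 1];
}

// Would a write with operation number opIndex be acknowledged under write concern w?
function isAcknowledged(opIndex, w, members) {
  const holders = members.filter((m) => m.lastApplied >= opIndex).length;
  const needed = w === "majority" ? majorityOf(members.length) : w;
  return holders >= needed;
}

// Newest operation a read on `node` may reflect under the given read concern.
function visibleUpTo(readConcern, node, members) {
  if (readConcern === "majority") {
    return Math.min(node.lastApplied, majorityCommitPoint(members));
  }
  return node.lastApplied; // "local": whatever this node has, committed or not
}

// The primary has applied 5 operations; the secondaries lag behind.
const members = [
  { name: "localhost:27017", lastApplied: 5 }, // primary
  { name: "localhost:27018", lastApplied: 4 },
  { name: "localhost:27019", lastApplied: 3 },
];

console.log(isAcknowledged(5, 1, members));              // true
console.log(isAcknowledged(5, "majority", members));     // false
console.log(visibleUpTo("local", members[0], members));    // 5
console.log(visibleUpTo("majority", members[0], members)); // 4
```

In this model, operation 5 is acknowledged for w: 1 as soon as the primary has it, but a w: "majority" write keeps waiting until a second member catches up, and a majority read on the primary does not show it yet.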
Handling Failover and Elections
In a replica set, if the primary node fails, one of the secondary nodes will automatically be elected as the new primary. This is known as failover.
- Automatic Failover: If the primary node becomes unavailable, an election process is triggered where the remaining members vote to elect a new primary. A candidate needs votes from a majority of the voting members to win.
- Election Process: With default settings, an election typically completes within roughly 12 seconds. During this time, the replica set cannot accept writes, but once a new primary is elected, the system resumes operations.
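Because winning an election takes a majority of the voting members, the size of the set determines how many failures it can survive. The following sketch works through that arithmetic; the function names are hypothetical, and it assumes every member has one vote.

```javascript
// Votes a candidate needs to win an election.
function votesNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// How many members can be lost while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - votesNeeded(votingMembers);
}

// Can the members that are still reachable elect a primary?
function canElectPrimary(votingMembers, reachable) {
  return reachable >= votesNeeded(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} members: ${votesNeeded(n)} votes needed, tolerates ${faultTolerance(n)} failure(s)`
  );
}

console.log(canElectPrimary(3, 2)); // true: one of three nodes is down
console.log(canElectPrimary(3, 1)); // false: the lone survivor cannot win
```

This is why three members is the usual minimum: a two-member set needs both votes, so losing either node leaves it without a primary, and a fourth member adds no extra fault tolerance over three.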
Monitoring and Managing Replica Sets
MongoDB provides tools to monitor and manage replica sets:
- rs.status(): Returns the status of the replica set, including the state of each node.
- rs.printReplicationInfo(): Displays the size of the oplog and the time range of the operations it contains, for the member you are connected to. The related rs.printSecondaryReplicationInfo() reports how far each secondary lags behind the primary.
- rs.stepDown(): Forces the current primary to step down and trigger a new election.
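One thing these tools help you watch is replication lag. As a rough sketch, the lag of each secondary can be estimated from the optimeDate field that rs.status() reports for every member. The function name replicationLagSeconds and the sample values below are made up for illustration; in the shell you would pass the real result of rs.status() instead.

```javascript
// Estimate how many seconds each secondary is behind the primary,
// using the name, stateStr, and optimeDate fields of rs.status() members.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null; // no primary right now, e.g. during an election
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lag;
}

// Hypothetical, heavily trimmed stand-in for the output of rs.status().
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "localhost:27017", stateStr: "PRIMARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "localhost:27018", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:10Z") },
    { name: "localhost:27019", stateStr: "SECONDARY", optimeDate: new Date("2024-01-01T00:00:07Z") },
  ],
};

console.log(replicationLagSeconds(sampleStatus));
// localhost:27018 is 0 seconds behind, localhost:27019 is 3 seconds behind
```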
Summary
- MongoDB Replication ensures data redundancy, fault tolerance, and high availability through the use of replica sets.
- A replica set consists of a primary node and multiple secondary nodes that replicate data.
- Automatic failover ensures that when the primary node fails, one of the secondary nodes is promoted to primary.
- Proper configuration of write concern, read concern, and read preferences allows for fine-grained control over the behavior of the replica set.
- Regular monitoring and management of the replica set ensure the health and availability of your MongoDB deployment.
Practice Exercise:
1. Set up a MongoDB replica set with three nodes on your local machine.
2. Perform some write operations on the primary node and check the data on the secondary nodes.
3. Simulate a failover by stopping the primary node and observing the automatic promotion of a secondary node.
4. Experiment with different write and read concerns to understand their impact on operations.
By understanding and setting up MongoDB replication, students will be able to ensure that their MongoDB databases remain highly available and resilient to failures.
Practice Exercise: MongoDB Replica Set
Let's go step by step to complete the tasks mentioned in the exercise.
Step 1: Set Up a MongoDB Replica Set with Three Nodes on Your Local Machine
1. Install MongoDB: Make sure you have MongoDB installed on your machine. If not, you can follow the installation guide from the official MongoDB website.
2. Start MongoDB Instances for the Replica Set: You will need to start three separate MongoDB instances, each running on a different port with its own data directory (create /data/db1, /data/db2, and /data/db3 first). Open three different terminal windows and run the following commands:
   o For Node 1 (Primary):
         mongod --port 27017 --dbpath /data/db1 --replSet rs0
   o For Node 2 (Secondary):
         mongod --port 27018 --dbpath /data/db2 --replSet rs0
   o For Node 3 (Secondary):
         mongod --port 27019 --dbpath /data/db3 --replSet rs0
   This will start three MongoDB instances on ports 27017, 27018, and 27019, and each instance is part of the rs0 replica set.
3. Initiate the Replica Set: Now, connect to Node 1, which will become the primary, using the MongoDB shell, mongosh (named mongo in releases before MongoDB 6.0):
       mongosh --port 27017
   Run the following command to initiate the replica set:
       rs.initiate()
   This will initialize the replica set. To add the secondary nodes to the replica set, run the following commands in the same shell:
       rs.add("localhost:27018")
       rs.add("localhost:27019")
   After this, the replica set is configured with one primary node and two secondary nodes.
4. Check the Status of the Replica Set: To verify the replica set status, use the following command:
       rs.status()
   This should show the primary node and two secondary nodes. The primary node should have the state 1, while the secondary nodes should have state 2.
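The output of rs.status() is long, so it can help to reduce it to one line per member. The sketch below does this using the name, state, stateStr, and health fields that rs.status() reports for each member. The helper names and the sample object are hypothetical; in the shell you would call summarizeMembers(rs.status()) instead.

```javascript
// One readable line per member of the replica set.
function summarizeMembers(status) {
  return status.members.map(
    (m) => `${m.name}: ${m.stateStr} (state ${m.state}, health ${m.health})`
  );
}

// Name of the current primary, or null if the set has none.
function findPrimary(status) {
  const primary = status.members.find((m) => m.state === 1);
  return primary ? primary.name : null;
}

// Hypothetical, heavily trimmed stand-in for the output of rs.status().
const healthyStatus = {
  set: "rs0",
  members: [
    { name: "localhost:27017", health: 1, state: 1, stateStr: "PRIMARY" },
    { name: "localhost:27018", health: 1, state: 2, stateStr: "SECONDARY" },
    { name: "localhost:27019", health: 1, state: 2, stateStr: "SECONDARY" },
  ],
};

summarizeMembers(healthyStatus).forEach((line) => console.log(line));
// localhost:27017: PRIMARY (state 1, health 1)
// localhost:27018: SECONDARY (state 2, health 1)
// localhost:27019: SECONDARY (state 2, health 1)

console.log(findPrimary(healthyStatus)); // localhost:27017
```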
Step 2: Perform Some Write Operations on the Primary Node and Check the Data on the Secondary Nodes
1. Insert Data into the Primary Node: Now, perform some write operations on the primary node. You can connect to the primary node using the shell:
       mongosh --port 27017
   Create a database and collection by inserting a document:
       use mydb
       db.myCollection.insertOne({ name: "John", age: 30 })
   This inserts a document into the myCollection collection in the mydb database.
2. Check Data on Secondary Nodes: To check if the data has been replicated to the secondary nodes, connect to one of the secondary nodes:
       mongosh --port 27018
   Reads on a secondary must be allowed explicitly, so set a read preference first and then run the query to verify the data:
       db.getMongo().setReadPref("secondary")
       use mydb
       db.myCollection.find()
   You should see the document { name: "John", age: 30 } that was inserted on the primary node, indicating that the data was successfully replicated.
Step 3: Simulate a Failover by Stopping the Primary Node and Observing the Automatic Promotion of a Secondary Node
1. Stop the Primary Node: To simulate a failure, stop the primary node (Node 1) by pressing Ctrl+C in the terminal where it is running, or by killing the process:
       kill <PID_of_Node1>
2. Observe the Failover Process: After stopping the primary node, MongoDB will automatically trigger a failover process and elect a new primary from the secondary nodes.
   Connect to one of the remaining nodes:
       mongosh --port 27018
   Then check the status of the replica set to see which node has been promoted to primary:
       rs.status()
   The secondary node that is promoted to primary should have the state 1. The stopped node is listed as not reachable (state 8, health 0) until you restart it, after which it rejoins the set as a secondary node with state 2.
3. Verify the New Primary: After the failover, connect to the new primary (secondaries reject writes) and perform a write operation to check if the system is functioning correctly:
       use mydb
       db.myCollection.insertOne({ name: "Alice", age: 25 })
   Then, check the data on the new primary and the remaining secondary node to confirm that the write operation is replicated.
Step 4: Experiment with Different Write and Read Concerns to Understand Their Impact on Operations
1. Write Concern: Write concern specifies the level of acknowledgment requested from MongoDB for write operations. You can set the write concern when performing a write operation.
   Example:
   o Write Concern: w: 1 (acknowledgment from primary)
         db.myCollection.insertOne({ name: "Bob", age: 35 }, { writeConcern: { w: 1 } })
   o Write Concern: w: "majority" (acknowledgment from the majority of replica set members)
         db.myCollection.insertOne({ name: "Charlie", age: 40 }, { writeConcern: { w: "majority" } })
   You can check the impact by observing the acknowledgment behavior, for example how long each write takes to return and whether it still succeeds while one secondary is stopped.
2. Read Concern: Read concern specifies the consistency level of the data being read.
   Example:
   o Read Concern: local (return the most recent data on the node being queried, even if it has not yet been replicated to a majority)
         db.myCollection.find().readConcern("local")
   o Read Concern: majority (return data that has been acknowledged by the majority of the replica set members)
         db.myCollection.find().readConcern("majority")
   The read concern affects the data you see, especially during a failover or if the primary has not yet replicated the changes to secondary nodes.
3. Read Preferences: MongoDB allows you to configure read preferences to choose where to read the data from. By default, reads are directed to the primary node.
   Example:
   o Read from Secondary Nodes:
         db.myCollection.find().readPref("secondary")
   o Read from Nearest Node:
         db.myCollection.find().readPref("nearest")
   This controls where you read the data from in a replica set, and it’s useful for load balancing and reducing latency.
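To see how a read preference mode translates into a choice of node, the sketch below models the selection in plain JavaScript. It is a simplification under stated assumptions: the latencyMs values are invented, the function selectMember is hypothetical, and it always picks the single lowest-latency candidate, whereas real drivers choose among all suitable members within a latency window.

```javascript
// Pick the member a read would go to under the given read preference mode.
function selectMember(mode, members) {
  const primary = members.find((m) => m.stateStr === "PRIMARY") || null;
  const secondaries = members.filter((m) => m.stateStr === "SECONDARY");
  const lowestLatency = (list) =>
    list.length
      ? list.reduce((best, m) => (m.latencyMs < best.latencyMs ? m : best))
      : null;

  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary || lowestLatency(secondaries);
    case "secondary":
      return lowestLatency(secondaries);
    case "secondaryPreferred":
      return lowestLatency(secondaries) || primary;
    case "nearest":
      return lowestLatency(primary ? [primary, ...secondaries] : secondaries);
    default:
      throw new Error(`unknown read preference mode: ${mode}`);
  }
}

// Hypothetical view of the three local nodes, with made-up latencies.
const nodes = [
  { name: "localhost:27017", stateStr: "PRIMARY", latencyMs: 12 },
  { name: "localhost:27018", stateStr: "SECONDARY", latencyMs: 3 },
  { name: "localhost:27019", stateStr: "SECONDARY", latencyMs: 8 },
];

console.log(selectMember("primary", nodes).name);   // localhost:27017
console.log(selectMember("secondary", nodes).name); // localhost:27018
console.log(selectMember("nearest", nodes).name);   // localhost:27018

// During a failover there may be no primary for a short time.
const withoutPrimary = nodes.filter((m) => m.stateStr !== "PRIMARY");
console.log(selectMember("primary", withoutPrimary));               // null
console.log(selectMember("primaryPreferred", withoutPrimary).name); // localhost:27018
```

The last two lines show why the mode matters during Step 3: reads with the default primary mode have nowhere to go until the election finishes, while primaryPreferred can fall back to a secondary.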
Summary of Results
- Replication ensures high availability and data redundancy.
- Automatic failover promotes a secondary node to primary if the primary node fails.
- Write Concern (w: 1 or w: majority) controls the acknowledgment behavior for writes.
- Read Concern (local or majority) controls the consistency level for reads.
- Read Preference (secondary, nearest) helps optimize where data is read from in a replica set.
This exercise should help you understand how replication works, how failover is handled, and how to configure MongoDB for different consistency and performance needs.
