Lecture Notes of Day 19:
MongoDB Data Import and Export

Objective:
Learn how to import and export data to/from MongoDB using the mongoimport and mongoexport commands.

Outcome:
By the end of this session, students will be able to use the mongoimport and mongoexport commands to handle data migration between MongoDB and other data formats such as JSON, CSV, and TSV.
1. Introduction to MongoDB Data Import and Export

In MongoDB, data migration is a key task when moving data between different databases, environments, or file formats. The MongoDB tools mongoimport and mongoexport help us import and export data to/from MongoDB in various formats such as JSON, CSV, and TSV.

- mongoimport: imports data into a MongoDB collection from JSON, CSV, or TSV files.
- mongoexport: exports data from a MongoDB collection into JSON, CSV, or TSV files.

These tools are essential when you need to migrate data, create backups, or work with MongoDB data in other applications.
2. Using mongoimport

mongoimport imports data from an external file into a MongoDB database. The basic syntax for the mongoimport command is as follows:

```bash
mongoimport --db <database> --collection <collection> --file <filename> --type <data_format> --jsonArray
```

Parameters:
- --db: The database into which the data will be imported.
- --collection: The collection into which the documents will be inserted.
- --file: Path to the input file that contains the data to be imported.
- --type: The format of the data file (json, csv, or tsv).
- --jsonArray: Tells mongoimport that the input is a single JSON array rather than newline-delimited documents (optional, but required for array-based JSON files).
Example 1: Importing JSON Data

Assume we have a file named students.json containing student data in JSON format:

```json
[
  { "name": "John", "age": 21, "major": "Computer Science" },
  { "name": "Jane", "age": 22, "major": "Mathematics" }
]
```

To import this data into a MongoDB collection called students in the university database, we would use the following command:

```bash
mongoimport --db university --collection students --file students.json --type json --jsonArray
```

- This command reads the students.json file and imports the data into the students collection of the university database.
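Because --jsonArray is only needed when the file holds a single JSON array (rather than one document per line), a quick local check can catch a mismatch before running the import. A minimal sketch, using the students.json contents from the example above and assuming python3 is available for JSON validation:

```shell
# recreate the sample file from the example above
cat > students.json <<'EOF'
[
  { "name": "John", "age": 21, "major": "Computer Science" },
  { "name": "Jane", "age": 22, "major": "Mathematics" }
]
EOF

# confirm the file is valid JSON at all
python3 -m json.tool students.json > /dev/null && echo "valid JSON"

# if the first non-whitespace character is '[', the file is an
# array and the import needs --jsonArray
first_char=$(tr -d '[:space:]' < students.json | head -c 1)
[ "$first_char" = "[" ] && echo "array: use --jsonArray"
```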
Example 2: Importing CSV Data

If we have a CSV file named courses.csv:

```csv
course_name,credits,department
"Math 101",3,"Mathematics"
"CS 101",4,"Computer Science"
```

To import this CSV data into MongoDB:

```bash
mongoimport --db university --collection courses --file courses.csv --type csv --headerline
```

- The --headerline flag tells mongoimport to use the first line of the CSV file as the field names (headers).
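Before importing, it can help to confirm which field names --headerline will pick up: they are simply the first line of the file, and every following line becomes one document. A minimal sketch using the courses.csv contents from the example above:

```shell
# recreate the sample file from the example above
cat > courses.csv <<'EOF'
course_name,credits,department
"Math 101",3,"Mathematics"
"CS 101",4,"Computer Science"
EOF

# --headerline takes the field names from the first line of the file
head -n 1 courses.csv            # prints: course_name,credits,department

# each remaining line becomes one document
tail -n +2 courses.csv | wc -l   # prints: 2
```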
3. Using mongoexport

mongoexport is used to export data from MongoDB collections into a file in various formats such as JSON, CSV, or TSV. The basic syntax for mongoexport is as follows:

```bash
mongoexport --db <database> --collection <collection> --out <output_file> --type <data_format>
```

Parameters:
- --db: The database to export data from.
- --collection: The collection to export data from.
- --out: Path to the output file where the exported data will be saved.
- --type: The desired output file format (json, csv, or tsv).
Example 1: Exporting Data to JSON

Suppose we want to export data from the students collection in the university database into a JSON file. We use the following command:

```bash
mongoexport --db university --collection students --out students_export.json --type json
```

- This exports the entire students collection into a JSON file named students_export.json. By default, mongoexport writes one document per line (newline-delimited JSON); add --jsonArray if you want the output wrapped in a single JSON array instead.
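The default newline-delimited output is convenient because ordinary line tools can process it directly. A minimal sketch, using a hand-written stand-in for students_export.json so no running database is needed:

```shell
# stand-in for mongoexport's default output: one JSON document per line
cat > students_export.json <<'EOF'
{"name":"John","age":21,"major":"Computer Science"}
{"name":"Jane","age":22,"major":"Mathematics"}
EOF

# each line is one document, so wc -l counts exported documents
wc -l < students_export.json                           # prints: 2

# grep can filter documents without any JSON parser
grep -c '"major":"Mathematics"' students_export.json   # prints: 1
```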
Example 2: Exporting Data to CSV

If you want to export the courses collection to a CSV file, you can use the following command:

```bash
mongoexport --db university --collection courses --out courses_export.csv --type csv --fields course_name,credits,department
```

- The --fields parameter specifies which fields to export; it is required for CSV output. In this case, we're exporting the fields course_name, credits, and department.
4. Advanced Options for Data Import and Export

Both mongoimport and mongoexport come with several useful options to tailor the import/export process.

Using --upsert in mongoimport

If you want to update existing records during import rather than inserting duplicates, you can use the --upsert option. Documents that match on the _id field are updated instead of duplicated; in recent versions of the MongoDB Database Tools, the same behavior is spelled --mode upsert.

```bash
mongoimport --db university --collection students --file students.json --type json --jsonArray --upsert
```
Using --query in mongoexport

To export only specific documents from a collection, use the --query option to define a filter:

```bash
mongoexport --db university --collection students --out filtered_students.json --type json --query '{"age": {"$gte": 22}}'
```

- This will export only students whose age is 22 or greater.
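A malformed --query string is a common source of export errors, since the filter must be valid JSON. A quick local sanity check before running the export (python3 is assumed to be available as the JSON validator):

```shell
# the filter we intend to pass to --query
QUERY='{"age": {"$gte": 22}}'

# validate that the filter parses as JSON before handing it to mongoexport
printf '%s' "$QUERY" | python3 -m json.tool > /dev/null \
  && echo "query is valid JSON"
```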
Specifying Field Types for CSV Import

When importing from CSV, mongoimport infers field types automatically (numeric-looking values become numbers, for example). If you need explicit control, the --columnsHaveTypes option lets you declare a type for each column in the header line or --fields list (e.g., course_name.string(),credits.int32()).
5. Handling Large Datasets

When dealing with large datasets, keep in mind that imports and exports can run into performance issues. Here are some tips for handling large imports/exports:

- Chunking: Break large files into smaller parts and import/export them in chunks.
- Indexes: Consider dropping secondary indexes temporarily during large imports to speed up the process, then recreate them afterward.
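The chunking tip can be sketched with standard shell tools. This splits a CSV into fixed-size chunks while copying the header line into each chunk, so every chunk can still be imported with --headerline (the file names big.csv and chunk_* are illustrative):

```shell
# build a sample CSV with a header and 10 data rows
printf 'name,age\n' > big.csv
for i in $(seq 1 10); do
  printf 'student%d,%d\n' "$i" $((17 + i)) >> big.csv
done

# split the data rows (everything after the header) into chunks of 4 rows
tail -n +2 big.csv | split -l 4 - chunk_

# prepend the header to each chunk so --headerline works on every piece
for f in chunk_*; do
  { head -n 1 big.csv; cat "$f"; } > "$f.csv"
  rm "$f"
done

ls chunk_*.csv   # chunk_aa.csv chunk_ab.csv chunk_ac.csv
```

Each chunk_*.csv file can then be fed to a separate mongoimport run.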
6. Practical Exercises

1. Import Data: Create a books.json file with data about books (title, author, year, genre) and import it into a MongoDB database named library.
2. Export Data: Export the students collection to a CSV file that contains only the name and age fields.
3. Filter Data on Export: Use the --query option to export only the students who are above the age of 20.
7. Conclusion

The mongoimport and mongoexport tools are powerful utilities for importing and exporting data in MongoDB. By using these tools, you can efficiently manage data migration, backups, and data integration between MongoDB and other systems.

Key Takeaways:
- mongoimport: used to import data into MongoDB from JSON, CSV, or TSV files.
- mongoexport: used to export data from MongoDB collections to JSON, CSV, or TSV files.
- Use options like --upsert and --query to customize the import/export process.
- Handling large datasets requires careful attention to chunking and indexing.