MongoDB using PyMongo in Python
MongoDB, Inc. officially supports drivers for many widely used programming languages, Python among them. PyMongo is the official Python driver for MongoDB. Because MongoDB, Inc. maintains it directly, it receives regular updates and maintenance. The PyMongo API follows Python idioms and has a straightforward design, which makes it easy and productive to use. It also closely mirrors the MongoDB shell, so queries move easily from the shell into application code. PyMongo represents documents as Python dictionaries.
Installation of PyMongo
Installing PyMongo on your computer is the first step in using it.
For Linux
- pip, Python’s package installer, can be used to install PyMongo. If pip is not already available, you may need to install your distribution’s python-pip package first (named python3-pip on many current distributions).
- Once pip is available, install PyMongo with sudo pip install pymongo, or, without root privileges, with python -m pip install --user pymongo.
- You can also download the source tarball manually from the Python Package Index (e.g., pypi.python.org/pypi), extract it, and run sudo python setup.py install from the extracted directory.
For Windows
- PyMongo installs easily on Windows.
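- On older Python installations, PyMongo may be installed with setuptools’ easy_install, although easy_install is deprecated and pip is now the recommended installer (python -m pip install pymongo works on Windows as well). If setuptools isn’t installed, download it from the Python Package Index.
- After installing setuptools, open a command prompt, change to Python’s Scripts subdirectory (e.g., C:\Python27\Scripts), and run easy_install pymongo.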
- The Python Package Index also offers precompiled PyMongo binaries (wheels) for Windows, so pip can usually install the driver without a C compiler.
After installation, run import pymongo in a Python interpreter to test PyMongo. If no error is raised, the installation was successful.
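The import check can also be scripted. The following is a minimal sketch; the helper name pymongo_installed is hypothetical and not part of PyMongo, and it relies on the standard library so that it runs whether or not the driver is present.

```python
import importlib.util


def pymongo_installed() -> bool:
    """Return True if the pymongo package can be found on this interpreter."""
    # find_spec locates the package without importing it
    return importlib.util.find_spec("pymongo") is not None


if pymongo_installed():
    import pymongo

    # pymongo.version holds the installed driver version as a string
    print("PyMongo version:", pymongo.version)
else:
    print("PyMongo is not installed; try: python -m pip install pymongo")
```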
Connecting to MongoDB with PyMongo
Connections to MongoDB are made through the MongoClient class. Unlike earlier driver versions, which did not acknowledge writes by default, MongoClient is now the standard connection class and ensures that writes are acknowledged by default.
Example Code for Connection:
from pymongo import MongoClient
# Connect to MongoDB running on localhost at default port 27017
uri = "mongodb://localhost:27017/"
client = MongoClient(uri)
# Select a database (e.g., 'test_db')
# If the database doesn't exist, MongoDB will create it automatically upon first data insertion
db = client['test_db'] # or db = client.test_db
# Select a collection (e.g., 'test_collection')
collection = db['test_collection'] # or collection = db.test_collection
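# MongoClient connects lazily, so run a cheap command to confirm the server is reachable
client.admin.command("ping")
print("Connected to MongoDB and selected collection successfully!")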
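When the server requires authentication, the connection string also carries credentials, and the PyMongo documentation recommends percent-escaping the username and password. A minimal sketch, assuming a hypothetical helper build_mongo_uri and placeholder credentials that do not come from the example above:

```python
from urllib.parse import quote_plus


def build_mongo_uri(host="localhost", port=27017, username=None, password=None):
    """Build a mongodb:// connection string, escaping any credentials."""
    if username is None:
        return f"mongodb://{host}:{port}/"
    # Characters such as '@', ':' and '/' must be percent-escaped in credentials
    user = quote_plus(username)
    pwd = quote_plus(password or "")
    return f"mongodb://{user}:{pwd}@{host}:{port}/"


# Placeholder values for illustration only
print(build_mongo_uri())
print(build_mongo_uri(username="app_user", password="p@ss/word"))
```

The resulting string can be passed to MongoClient in place of a hard-coded URI.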
Basic Data Operations
Once connected, you can perform the standard CRUD (Create, Read, Update, Delete) operations. The PyMongo API largely mirrors the syntax of the MongoDB shell, with method names written in Python’s snake_case style.
- Inserting Documents (Create): Use insert_one() or insert_many() to add documents to a collection.
- Querying Documents (Read): find() returns a cursor over all matching documents, while find_one() returns a single document (or None if nothing matches).
- Updating Documents (Update): Use update_one() or update_many() to change documents; the older update() method was removed in PyMongo 4.
- Deleting Documents (Delete): Use delete_one() or delete_many() to delete documents.
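The four operations can be strung together in a single round trip. The sketch below is illustrative rather than definitive: crud_demo is a hypothetical helper, the sample document is made up, and the function expects to be handed a PyMongo collection object, so it opens no connection itself.

```python
def crud_demo(collection):
    """Run one create/read/update/delete cycle on a PyMongo collection."""
    # Create: insert_one returns a result whose inserted_id is the new _id
    inserted_id = collection.insert_one({"name": "Ada", "visits": 1}).inserted_id

    # Read: find_one returns the document as a dict, or None if nothing matches
    found = collection.find_one({"_id": inserted_id})

    # Update: the $inc operator adds to a numeric field
    collection.update_one({"_id": inserted_id}, {"$inc": {"visits": 1}})
    updated = collection.find_one({"_id": inserted_id})

    # Delete: deleted_count reports how many documents were removed
    deleted = collection.delete_one({"_id": inserted_id}).deleted_count

    return found, updated, deleted
```

With a server running, it could be called as crud_demo(MongoClient()["test_db"]["test_collection"]); the returned documents should show visits going from 1 to 2, followed by a deleted count of 1.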
GridFS with Python
GridFS is a specification, built on top of standard MongoDB documents, for storing large files in MongoDB databases, including files that exceed the 16 MB limit on a single BSON document. The MongoDB server gives GridFS no “special-case” treatment; all of the work is handled by client-side tools and drivers. PyMongo offers a Pythonic API for interacting with GridFS.
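Drivers implement GridFS by splitting each file into chunks, 255 KiB each by default, that are stored as ordinary documents in a chunks collection, with one metadata document per file in a files collection. As a rough illustration of that arithmetic only (the helper gridfs_chunk_count is hypothetical and does not call PyMongo):

```python
DEFAULT_CHUNK_SIZE = 255 * 1024  # GridFS default chunk size in bytes (255 KiB)


def gridfs_chunk_count(file_size: int, chunk_size: int = DEFAULT_CHUNK_SIZE) -> int:
    """Return how many chunk documents a file of file_size bytes would need."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    # Ceiling division: the last chunk may be only partly filled
    return (file_size + chunk_size - 1) // chunk_size


for size in (37, 255 * 1024, 20 * 1024 * 1024):
    print(f"{size} bytes -> {gridfs_chunk_count(size)} chunk(s)")
```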
Example Code for GridFS operations:
import gridfs
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.test # Selects the 'test' database
fs = gridfs.GridFS(db) # Initializes GridFS for the 'test' database
# Putting a file into GridFS
# put() accepts bytes or a file-like object and returns the new file's _id
file_content = b"This is some sample text for my file."
file_id = fs.put(file_content, filename="my_sample_file.txt")
print(f"File stored with ID: {file_id}")
# Listing files in GridFS
print("Files in GridFS:", fs.list())
# Retrieving a file from GridFS
retrieved_file = fs.get(file_id)
print(f"Retrieved file content: {retrieved_file.read().decode()}")
Internal Workings and Nuances
MongoDB drivers such as PyMongo carry out several important tasks behind the scenes:
- BSON Conversion: Drivers convert Python dictionaries to and from BSON (Binary JSON), the binary format MongoDB uses for storage and network communication. BSON is designed to be fast to encode and decode, and it supports data types that JSON lacks, such as dates and binary data.
- Object ID Generation: When a new document has no _id field, the driver automatically generates a unique ObjectId for it before sending it to the server.
- Network Communication: Drivers communicate with the MongoDB server over a TCP socket using a lightweight wire protocol.
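To make the ObjectId step more concrete: an ObjectId is 12 bytes, made up of a 4-byte big-endian timestamp in seconds, a 5-byte random value fixed per process, and a 3-byte counter that starts at a random value. The sketch below imitates that layout with the standard library only. It is an illustration, not PyMongo’s implementation, and the names make_object_id_like and creation_time are hypothetical; real code would use the ObjectId class from the bson package that ships with PyMongo.

```python
import os
import struct
import time
from datetime import datetime, timezone

_counter = int.from_bytes(os.urandom(3), "big")  # counter starts at a random value
_process_random = os.urandom(5)  # stays fixed for the lifetime of the process


def make_object_id_like() -> bytes:
    """Return 12 bytes laid out like an ObjectId (illustration only)."""
    global _counter
    _counter = (_counter + 1) % 0x1000000  # wrap around at 3 bytes
    timestamp = struct.pack(">I", int(time.time()))  # 4-byte big-endian seconds
    return timestamp + _process_random + _counter.to_bytes(3, "big")


def creation_time(oid: bytes) -> datetime:
    """Decode the timestamp stored in the first 4 bytes."""
    (seconds,) = struct.unpack(">I", oid[:4])
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


oid = make_object_id_like()
print(oid.hex(), creation_time(oid).isoformat())
```

Because the timestamp comes first, identifiers generated this way sort roughly by creation time, which is also why a creation time can be recovered from a real ObjectId.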
Because the MongoDB API is consistent across languages, learning the fundamentals of one driver’s API makes it easy to learn the others.