What is a Capped Collection in MongoDB?

Capped Collection in MongoDB

Capped collections are fixed-size collections that preserve the insertion order of their documents. A capped collection behaves like a circular buffer: once it reaches its allotted maximum size, it automatically overwrites its oldest entries to make room for new documents. This automatic “age-out” behavior is its defining feature.
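
The circular-buffer behavior can be illustrated with a small in-memory model. This is only a sketch: the class name ToyCappedCollection is hypothetical, document size is approximated by JSON length rather than BSON size, and none of it reflects how MongoDB actually stores data.

```javascript
// Toy in-memory model of a capped collection. This is NOT how MongoDB
// stores data; it only illustrates the age-out behavior.
class ToyCappedCollection {
  constructor(maxBytes) {
    this.maxBytes = maxBytes;
    this.docs = []; // oldest document first
    this.usedBytes = 0;
  }

  // Assumption: approximate a document's size by its JSON length.
  // MongoDB itself measures BSON size.
  static sizeOf(doc) {
    return JSON.stringify(doc).length;
  }

  insert(doc) {
    const size = ToyCappedCollection.sizeOf(doc);
    if (size > this.maxBytes) {
      throw new Error("document larger than the collection cap");
    }
    // Evict the oldest documents until the new one fits.
    while (this.usedBytes + size > this.maxBytes) {
      const oldest = this.docs.shift();
      this.usedBytes -= ToyCappedCollection.sizeOf(oldest);
    }
    this.docs.push(doc);
    this.usedBytes += size;
  }

  // Returns documents in insertion (natural) order.
  find() {
    return [...this.docs];
  }
}

const logs = new ToyCappedCollection(40);
for (let i = 1; i <= 6; i++) {
  logs.insert({ n: i }); // each document serializes to 7 characters
}
console.log(logs.find().map((d) => d.n)); // [ 2, 3, 4, 5, 6 ]
```

Only five 7-character documents fit under the 40-byte cap, so the sixth insert pushes out the oldest one while the rest keep their insertion order.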

Ideal Use Cases: Capped collections are well suited to the following:

  • Logging: High-volume systems produce log entries quickly. Storing them in a capped collection keeps inserts fast, preserves event order, and bounds storage use through the built-in first-in-first-out (FIFO) behavior.
  • Caching: Capped collections can cache small amounts of data. Because caches are read-heavy rather than write-heavy, either make sure the collection stays in RAM or accept a write penalty for the indexes that reads require.
  • Time-Series Data: Suited to data that is retrieved in insertion order, such as the most recent events.
  • Internal Replication Oplog: In replica sets, MongoDB keeps a rolling record of all data changes in the oplog (operation log), which is stored in the capped collection oplog.rs.

Creating Capped Collections: A capped collection must be created explicitly with the db.createCollection() command. You must specify capped: true and the collection's maximum size in bytes. You can also set an optional max parameter to limit the number of documents, but the size limit takes precedence: if the collection reaches its size limit first, older documents are removed even though the document count is below max. If size is 4096 bytes or less, the collection is capped at 4096 bytes; otherwise MongoDB raises the size to an integer multiple of 256.

Example: Create a 10,000-byte capped collection named myLogCollection:

db.createCollection("myLogCollection", {capped: true, size: 10000}) 

You can also limit the number of documents, for instance to a maximum of 1000. The size parameter is still required:

db.createCollection("myLogCollection", {capped: true, size: 100000, max: 1000})
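
As a rough sketch of the size rule (requests of 4096 bytes or less become 4096, larger requests are raised to the next multiple of 256), the hypothetical helper below estimates the effective cap for a requested size. The function name is an assumption made for illustration, and the exact behavior may vary by server version.

```javascript
// Hypothetical helper: estimates the size MongoDB would allocate for a
// requested capped-collection size, assuming the documented rounding rule.
function effectiveCappedSize(requestedBytes) {
  if (requestedBytes <= 4096) {
    return 4096; // small requests are raised to the 4096-byte minimum
  }
  // Larger requests are raised to the next integer multiple of 256.
  return Math.ceil(requestedBytes / 256) * 256;
}

console.log(effectiveCappedSize(10000));  // 10240
console.log(effectiveCappedSize(100000)); // 100096
console.log(effectiveCappedSize(1000));   // 4096
```

Under this rule, the 10,000-byte collection in the first example would end up with a cap of 10,240 bytes.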

To convert an ordinary collection to a capped one, use db.runCommand({"convertToCapped": "collectionName", size: <sizeInBytes>}).

Properties and Limitations

  1. Fast Writes: New documents are always appended to the end of the collection, so MongoDB never has to search for free space. This makes inserts very fast.
  2. Guaranteed Order: Queries without a sort return documents in insertion order (natural order), which corresponds to their order on disk and makes such reads efficient. To retrieve documents in reverse insertion order, use db.collection.find().sort({"$natural": -1}).
  3. Updates and Deletes:
    • Documents cannot grow, so an update that would make a document exceed its originally allotted space fails.
    • Individual documents cannot be removed from a capped collection. To remove all documents, drop the collection and recreate it.
  4. Indexing: In MongoDB 2.2 and later, capped collections have an _id index by default. If you plan to update documents in a capped collection, create an index that supports those updates so they do not require collection scans.
  5. Sharding: Capped collections cannot be sharded.
  6. Tailable Cursors: These special cursors are designed for capped collections. Like the Unix tail -f command, they stay open after the initial results are exhausted and keep returning new documents as they are inserted. Tailable cursors do not use indexes and return documents in natural order, so the initial scan can be expensive, but retrieving newly added documents afterwards is cheap.
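
A tailable cursor might be consumed from application code roughly as follows. This is a minimal sketch that assumes the official mongodb Node.js driver, whose find() accepts the tailable and awaitData options; the function name tailCollection and the onDocument callback are hypothetical.

```javascript
// Sketch: follow a capped collection with a tailable cursor.
// Assumption: `collection` is a Collection object from the official
// `mongodb` Node.js driver; `onDocument` is a hypothetical callback.
async function tailCollection(collection, onDocument) {
  const cursor = collection.find({}, { tailable: true, awaitData: true });
  // The loop keeps waiting for newly inserted documents. It ends only
  // when the cursor is closed or becomes invalid.
  for await (const doc of cursor) {
    onDocument(doc);
  }
}
```

With a connected client, this could be called as tailCollection(db.collection("myLogCollection"), console.log), where db is a driver Db handle.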

MongoDB TTL Collections

TTL collections in MongoDB automatically delete documents after a set period of time. This is implemented with a TTL index, a special kind of index.

Ideal Use Cases: TTL indexes are especially helpful for:

  • Session Management: Automatically removing a user session after it has been inactive for a set period.
  • Event Logs: Storing machine-generated logs or event data that is relevant only for a limited time.
  • Caching: Building a cache with time-based expiration, which is more flexible than a capped collection.

Creating TTL Indexes: Create a TTL index with the db.collection.createIndex() method, applying the expireAfterSeconds option to a field whose value is a BSON date or an array of BSON dates.

Example: To remove documents one hour (3600 seconds) after the time stored in createdAt, create a TTL index on the createdAt field of a log_events collection:

db.log_events.createIndex( { "createdAt": 1 }, { expireAfterSeconds: 3600 } ) 

Make sure the createdAt field holds a BSON date value when you insert documents; this is usually new Date() for the current time. If you want documents to expire at a specific clock time instead, store the desired expiration time in the indexed date field and set expireAfterSeconds to 0.
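
The expiry rule can be sketched as plain date arithmetic. The helpers expiresAt and isExpired below are hypothetical and only model the rule; the real check runs inside mongod. The sketch assumes that a document without a date in the indexed field never expires and that, for an array of dates, the earliest one decides.

```javascript
// Toy model of the TTL expiry rule; the real check runs inside mongod.
function expiresAt(doc, field, expireAfterSeconds) {
  const value = doc[field];
  const dates = (Array.isArray(value) ? value : [value]).filter(
    (v) => v instanceof Date
  );
  if (dates.length === 0) {
    return null; // no date in the indexed field: the document never expires
  }
  // For an array of dates, the earliest one determines expiry.
  const earliest = Math.min(...dates.map((d) => d.getTime()));
  return new Date(earliest + expireAfterSeconds * 1000);
}

function isExpired(doc, field, expireAfterSeconds, now = new Date()) {
  const deadline = expiresAt(doc, field, expireAfterSeconds);
  return deadline !== null && deadline.getTime() <= now.getTime();
}

// One hour after createdAt, as in the log_events example.
const event = { createdAt: new Date("2024-01-01T00:00:00Z") };
console.log(expiresAt(event, "createdAt", 3600).toISOString());
// 2024-01-01T01:00:00.000Z

// expireAfterSeconds: 0 pattern: the field stores the expiry time itself.
const session = { expireAt: new Date("2024-06-01T12:00:00Z") };
console.log(isExpired(session, "expireAt", 0, new Date("2024-06-01T11:59:59Z")));
// false
```

The field name expireAt in the second example is an illustrative choice; any date field carrying the TTL index would behave the same way.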

Behavior and Restrictions

  1. Background Deletion: A background thread in the mongod process reads the date values in the TTL index and removes expired documents. The task runs about once every 60 seconds, so deletion is not immediate and documents may remain in the collection for a while after they pass their expiration threshold.
  2. Updating Expiration: To keep a document (such as an active user session) from expiring, update the indexed date field to a more recent time.
  3. Index Type: TTL indexes must be single-field indexes. Compound indexes do not support the TTL property and ignore expireAfterSeconds.
  4. _id Field Incompatibility: The _id field cannot have a TTL index created on it.
  5. Capped Collection Incompatibility: Because capped collections do not permit the deletion of individual documents, TTL indexes cannot be built on them.
  6. Modifying expireAfterSeconds: You cannot change the expireAfterSeconds value of an existing TTL index with createIndex(). Instead, use the collMod database command or drop and recreate the index.
  7. Replica Sets: Only documents on the primary member are deleted by the TTL background thread in a replica set. These deletion procedures from the primary are subsequently replicated by secondary members.
  8. Performance: Expired documents are removed by walking the index, much like a user-issued delete operation. For that reason, TTL indexes, despite their flexibility, may not suit collections with exceptionally high write volumes.
  9. Storage Allocation: To reduce fragmentation caused by frequent deletions, collections with TTL indexes use a “power of 2 sized allocations” strategy for disk space, which reserves more space than each document strictly needs. This behavior belongs to the legacy MMAPv1 storage engine.
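
For item 6, the collMod command takes the index's key pattern together with the new expireAfterSeconds value. The helper below is a hypothetical convenience that only builds the command document; the 7200-second value is an arbitrary example.

```javascript
// Hypothetical helper that builds a collMod command document for changing
// the expireAfterSeconds value of an existing TTL index.
function buildTtlCollMod(collectionName, keyPattern, expireAfterSeconds) {
  return {
    collMod: collectionName,
    index: { keyPattern, expireAfterSeconds },
  };
}

// Extend the log_events TTL from one hour to two hours (example value).
const cmd = buildTtlCollMod("log_events", { createdAt: 1 }, 7200);
console.log(JSON.stringify(cmd));
// {"collMod":"log_events","index":{"keyPattern":{"createdAt":1},"expireAfterSeconds":7200}}

// In mongosh, the command would be sent with: db.runCommand(cmd)
```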

In conclusion, capped collections and TTL collections offer two distinct ways of managing data that must be read in insertion order or that has a limited lifespan. Choose a capped collection when you need a fixed-size buffer with strict insertion order, and a TTL index when you need automatic, time-based expiration.
