$ lexprog.com

// notes from an old coder -- php, databases, and the occasional rant

[March 08, 2026] MongoDB

MongoDB GridFS for File Storage

MongoDB GridFS: Tips & Tricks

────────────────────────────────────────────────────────

Tip: GridFS for Files Over 16MB

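MongoDB caps a single BSON document at 16 MB. GridFS gets around that by splitting a file into chunks (255 KB each by default) and storing every chunk as its own document.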
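To get a feel for how many chunk documents a file turns into, here is a minimal sketch of the arithmetic. The function and constant names are mine, not part of any driver, and the sketch assumes the default 255 KB chunk size with 1 KB = 1024 bytes.

```php
<?php
// Default GridFS chunk size: 255 KB (255 * 1024 bytes).
const GRIDFS_DEFAULT_CHUNK_BYTES = 255 * 1024;

/**
 * Number of fs.chunks documents needed for a file of $fileBytes bytes.
 * Hypothetical helper for illustration only.
 */
function gridfsChunkCount(int $fileBytes, int $chunkBytes = GRIDFS_DEFAULT_CHUNK_BYTES): int
{
    if ($fileBytes < 0 || $chunkBytes < 1) {
        throw new InvalidArgumentException('fileBytes must be >= 0 and chunkBytes >= 1');
    }

    // Integer ceiling division: the last chunk may be only partly filled.
    return intdiv($fileBytes + $chunkBytes - 1, $chunkBytes);
}

echo gridfsChunkCount(10 * 1024 * 1024), "\n"; // 41
echo gridfsChunkCount(16 * 1024 * 1024), "\n"; // 65
```

An empty file needs zero chunks, and a file of exactly 255 KB fits in one.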

$bucket = DB::connection('mongodb')->getMongoDB()->selectGridFSBucket();

Gotcha: GridFS Uses Two Collections

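GridFS keeps file metadata in fs.files and the binary data in fs.chunks. Go through the bucket API instead of modifying these collections directly, so the two stay consistent.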

Tip: Upload a File

// Open in binary mode so the bytes are stored unchanged on every platform.
$stream = fopen('/path/to/file.pdf', 'rb');
$fileId = $bucket->uploadFromStream('file.pdf', $stream, [
    'metadata' => ['author' => 'John', 'type' => 'pdf'],
]);
fclose($stream);

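Gotcha: GridFS Partial Reads Happen in Whole Chunks

GridFS can read part of a file without loading the rest, but the smallest unit it fetches is a chunk. Reading a few bytes from the middle of a large file still pulls at least one full chunk (255 KB by default) from fs.chunks.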
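Chunks are numbered from zero (the n field in fs.chunks), so you can work out which chunks a byte range overlaps. A rough sketch, assuming the default 255 KB chunk size. The helper name is hypothetical, and whether a given driver version fetches only those chunks on a seek is something to check for your setup.

```php
<?php
/**
 * Chunk indices (the `n` field in fs.chunks) that cover a byte range.
 * Returns [firstChunk, lastChunk], both inclusive.
 * Hypothetical helper for illustration only.
 */
function gridfsChunksForRange(int $offset, int $length, int $chunkBytes = 255 * 1024): array
{
    if ($offset < 0 || $length < 1 || $chunkBytes < 1) {
        throw new InvalidArgumentException('offset must be >= 0, length and chunkBytes >= 1');
    }

    $first = intdiv($offset, $chunkBytes);
    $last  = intdiv($offset + $length - 1, $chunkBytes);

    return [$first, $last];
}

// A 4 KB read starting 5 MB into the file falls entirely inside chunk 20.
[$first, $last] = gridfsChunksForRange(5 * 1024 * 1024, 4096);
echo "$first..$last\n"; // 20..20
```

So a 4 KB read costs one 255 KB chunk, not the whole file, but also not just 4 KB.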

Tip: Download a File

// Opens the most recent revision stored under this filename.
$stream = $bucket->openDownloadStreamByName('file.pdf');
fpassthru($stream);

Gotcha: GridFS Metadata Queries

$files = $bucket->find(['metadata.type' => 'pdf']);

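find() runs against fs.files, so you can filter on metadata fields like in any other MongoDB query. The catch: GridFS only indexes filename and uploadDate there, so a filter on metadata.type scans the whole collection unless you add your own index.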
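If you filter on a metadata field often, an index on fs.files helps. A minimal sketch, assuming the default bucket name "fs" and that $files is the MongoDB\Collection for fs.files. The helper name is mine, and the parameter is left loosely typed so the snippet loads without the library installed.

```php
<?php
/**
 * Create an ascending index on metadata.type and return the index name.
 * $files is assumed to be a MongoDB\Collection pointing at "fs.files".
 * Hypothetical helper for illustration only.
 */
function ensureMetadataTypeIndex(object $files): string
{
    // createIndex() does nothing if an identical index already exists.
    return $files->createIndex(['metadata.type' => 1]);
}

// Possible usage inside a Laravel app (needs a live connection):
// $files = DB::connection('mongodb')->getMongoDB()->selectCollection('fs.files');
// ensureMetadataTypeIndex($files);
```

Run it once from a migration or a deploy script, not on every request.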

Tip: Delete a File

$bucket->delete($fileId);

Removes both the file document and all chunks.

Gotcha: GridFS vs S3

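For production file storage at scale, prefer S3 or a similar object store. GridFS suits small deployments, or cases where files should live in the same database as your documents and share its replication, backups, and access control. Don't pick it for transactional guarantees: GridFS does not support multi-document transactions.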

Tip: Stream Directly to Response

$stream = $bucket->openDownloadStream($fileId);
return response()->stream(function () use ($stream) {
    fpassthru($stream);
}, 200, ['Content-Type' => 'application/pdf']);
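Hard-coding application/pdf only works if every file is a PDF. Here is a sketch of building the headers from what is stored with the file. The helper name is hypothetical, $filename and $length correspond to the standard filename and length fields in fs.files, and the MIME type is assumed to come from your own metadata.

```php
<?php
/**
 * Build response headers for a GridFS download.
 * Hypothetical helper for illustration only.
 */
function gridfsDownloadHeaders(
    string $filename,
    int $length,
    string $mimeType = 'application/octet-stream'
): array {
    // Drop any path prefix, then strip quotes, backslashes and control
    // characters so the name is safe inside the quoted header value.
    $safeName = preg_replace('/[\x00-\x1F"\\\\]/', '', basename($filename));

    return [
        'Content-Type'        => $mimeType,
        'Content-Length'      => (string) $length,
        'Content-Disposition' => 'inline; filename="' . $safeName . '"',
    ];
}

// Possible usage with the stream from above (needs a live connection):
// $doc = $bucket->getFileDocumentForStream($stream);
// $headers = gridfsDownloadHeaders($doc->filename, $doc->length, 'application/pdf');
```

Sending Content-Length lets browsers show download progress, which they cannot do for a bare stream.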

Tip: Embed or Reference? The 80/20 Rule

If you always read the data together, embed it. If you read it independently, reference it. The 16 MB document size limit is the hard boundary, but aim to keep most documents well under 1 MB.
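Applied to files, the same rule might look like the sketch below. All field names and values are made up for illustration.

```php
<?php
// Embedded: line items are always read together with the order,
// so they live inside the order document.
$orderEmbedded = [
    '_id'      => 1001,
    'customer' => 'John',
    'items'    => [
        ['sku' => 'A-1', 'qty' => 2],
        ['sku' => 'B-7', 'qty' => 1],
    ],
];

// Referenced: the invoice PDF is fetched on its own, so the order stores
// only the GridFS file id. The bytes stay in fs.files / fs.chunks.
$orderReferenced = [
    '_id'           => 1001,
    'customer'      => 'John',
    'invoiceFileId' => '<id returned by uploadFromStream>', // placeholder
];
```

The order document stays small either way, and the PDF only gets read when somebody asks for it.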

Tip: Index Your Query Patterns, Not All Fields

An index on every field wastes RAM and slows down every write. Use explain() to find collection scans (COLLSCAN) and in-memory sorts (SORT), then index only the fields your actual queries filter and sort on.
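Once you have the explain() output as a PHP array, spotting the bad stages can be automated. A rough sketch: the function names are mine, and the sample plan is hand-written and trimmed, not real server output. The exact nesting of explain output varies by MongoDB version, so treat the shape as an assumption.

```php
<?php
/**
 * Collect stage names from an explain() winning plan by walking the
 * nested inputStage / inputStages entries. Hypothetical helper.
 */
function planStages(array $plan): array
{
    $stages = [];
    if (isset($plan['stage'])) {
        $stages[] = $plan['stage'];
    }
    if (isset($plan['inputStage'])) {
        $stages = array_merge($stages, planStages($plan['inputStage']));
    }
    foreach ($plan['inputStages'] ?? [] as $child) {
        $stages = array_merge($stages, planStages($child));
    }

    return $stages;
}

/** True when the plan scans the whole collection or sorts in memory. */
function planNeedsIndex(array $plan): bool
{
    return (bool) array_intersect(['COLLSCAN', 'SORT'], planStages($plan));
}

// Hand-written example of a queryPlanner.winningPlan (not real output).
$winningPlan = [
    'stage'      => 'SORT',
    'inputStage' => ['stage' => 'COLLSCAN'],
];
var_dump(planNeedsIndex($winningPlan)); // bool(true)
```

A plan that reads FETCH over IXSCAN is already using an index and comes back false.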

Gotcha: No Transaction Rollback for Index Builds

Building an index on a large collection can take hours. If the build fails midway, the partial index is discarded and the work has to start over from scratch. Schedule large index builds for maintenance windows.

Senior Insight

GridFS is useful when you need file storage alongside your document data, but it's slower than dedicated object storage (S3, GCS). I've used GridFS for storing user-generated images in applications where file counts were under 100K and S3 wasn't available. The limitation: GridFS splits files into 255 KB chunks by default, so retrieving a 10 MB file means reading 41 chunks. For any application serving files directly to users, S3 is the better choice.

Source: MongoDB Developer Center (https://www.mongodb.com/developer/), MongoDB Engineering Blog (https://www.mongodb.com/blog/channel/engineering-blog), Studio 3T Blog (https://studio3t.com/blog/)

────────────────────────────────────────────────────────