[March 29, 2026] MongoDB

MongoDB Aggregation Pipeline in Laravel

MongoDB Aggregation Pipeline: Tips & Tricks

────────────────────────────────────────────────────────

Tip: Basic Aggregation

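// Count published posts per category, largest category first.
Post::raw(function ($collection) {
    return $collection->aggregate([
        // Filter first so later stages see fewer documents.
        ['$match' => ['published' => true]],
        // One output document per category_id, with the number of posts in it.
        ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
        // -1 sorts descending.
        ['$sort' => ['count' => -1]],
    ]);
});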

Gotcha: `$match` Should Come First

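Put $match as early as possible in the pipeline so later stages process fewer documents. MongoDB can use an index for $match only when it runs at the start of the pipeline, before any stage has reshaped the documents.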
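A minimal sketch of the ordering, assuming an index on `published` exists (the index, the helper name `categoryCountsPipeline`, and the `$minCount` threshold are all hypothetical). The helper only builds the pipeline array; its result would be passed to `$collection->aggregate()` inside `Post::raw()`.

```php
<?php
// Sketch: the filter on a stored field goes first, where an index can serve it.
// The filter on a computed field has to wait until $group has produced it.
function categoryCountsPipeline(int $minCount = 1): array
{
    return [
        // Stage 1: runs against the collection, so an index on `published` can be used.
        ['$match' => ['published' => true]],
        // Stage 2: reshapes the stream into one document per category.
        ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
        // Stage 3: `count` only exists after $group, so no index applies here.
        ['$match' => ['count' => ['$gte' => $minCount]]],
    ];
}
```

The second `$match` is unavoidable because it filters on a value the pipeline computes; everything that can be filtered on stored fields belongs in the first one.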

Tip: `$unwind` for Array Data

Post::raw(function ($collection) {
    return $collection->aggregate([
        ['$unwind' => '$tags'],
        ['$group' => ['_id' => '$tags', 'count' => ['$sum' => 1]]],
    ]);
});

$unwind emits one document per element of the tags array, so the $group stage counts how often each tag appears across all posts.
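By default `$unwind` drops documents whose array is missing, null, or empty. The sketch below uses the document form of `$unwind` to make that choice explicit and adds a sort and limit for a "top tags" list; the helper name and its parameters are hypothetical.

```php
<?php
// Sketch: top-N tag frequency with explicit handling of untagged posts.
function topTagsPipeline(int $limit = 10, bool $keepUntagged = false): array
{
    return [
        ['$unwind' => [
            'path' => '$tags',
            // false (the default) drops posts with no tags;
            // true keeps them, and they end up grouped under _id: null.
            'preserveNullAndEmptyArrays' => $keepUntagged,
        ]],
        ['$group' => ['_id' => '$tags', 'count' => ['$sum' => 1]]],
        ['$sort' => ['count' => -1]],
        ['$limit' => $limit],
    ];
}
```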

Gotcha: `$lookup` for Joins

['$lookup' => [
    'from' => 'categories',
    'localField' => 'category_id',
    'foreignField' => '_id',
    'as' => 'category',
]]

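MongoDB's version of a LEFT OUTER JOIN: every post is kept, and the matching documents from categories are added as an array in the category field (an empty array when nothing matches).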
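Because the joined field is always an array, a common follow-up is to flatten it. A minimal sketch, assuming each post has at most one category (the helper name is hypothetical):

```php
<?php
// Sketch: join categories, then turn the one-element array into a sub-document.
function postsWithCategoryPipeline(): array
{
    return [
        ['$lookup' => [
            'from' => 'categories',
            'localField' => 'category_id',
            'foreignField' => '_id',
            'as' => 'category',
        ]],
        ['$unwind' => [
            'path' => '$category',
            // Keep posts without a matching category. Without this flag the
            // pipeline would behave like an INNER JOIN and drop them.
            'preserveNullAndEmptyArrays' => true,
        ]],
    ];
}
```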

Tip: `$project` to Shape Output

['$project' => [
    'title' => 1,
    'comment_count' => ['$size' => '$comments'],
    'author_name' => '$author.name',
]]
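One thing to watch in a projection like this: `$size` raises an error when its argument is not an array, which happens for posts that have no `comments` field. A hedged variant that falls back to an empty array via `$ifNull` (the helper name is hypothetical):

```php
<?php
// Sketch: the same projection, tolerant of posts without a `comments` field.
function postSummaryProjection(): array
{
    return ['$project' => [
        'title' => 1,
        // $ifNull substitutes [] when `comments` is missing or null,
        // so $size always receives an array.
        'comment_count' => ['$size' => ['$ifNull' => ['$comments', []]]],
        'author_name' => '$author.name',
    ]];
}
```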

Gotcha: Pipeline Memory Limit

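Each aggregation pipeline stage may use at most 100 MB of RAM. A stage that exceeds the limit fails unless it is allowed to spill to temporary files on disk, so pass allowDiskUse: true for larger datasets. (MongoDB 6.0 and later allow this by default.)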
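In the MongoDB PHP library the flag is not a pipeline stage; it goes in the options array passed as the second argument to `aggregate()`. A minimal sketch, with a hypothetical helper that returns both pieces and a sort on `title` standing in for any memory-hungry stage:

```php
<?php
// Sketch: a pipeline plus the options that let its stages spill to disk.
function sortedPostsAggregation(): array
{
    $pipeline = [
        // A $sort that cannot use an index is a typical stage to hit the limit.
        ['$sort' => ['title' => 1]],
    ];
    $options = ['allowDiskUse' => true];

    return [$pipeline, $options];
}

// Usage inside Post::raw(function ($collection) { ... }):
//   [$pipeline, $options] = sortedPostsAggregation();
//   return $collection->aggregate($pipeline, $options);
```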

Tip: `$facet` for Multiple Aggregations

['$facet' => [
    'totalPosts' => [['$count' => 'count']],
    'categories' => [['$group' => ['_id' => '$category_id']]],
]]

Runs several sub-pipelines over the same input documents within a single stage. The output is one document with one array field per facet.
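The result shape is easy to get wrong when reading it back. The sketch below uses a hypothetical result document (the numbers are invented for illustration) and assumes it has already been converted to plain PHP arrays; both helper names are hypothetical.

```php
<?php
// Sketch: reading values out of a $facet result.
function totalFromFacet(array $facetResult): int
{
    // $count emits no document when its input is empty, so default to 0.
    return $facetResult['totalPosts'][0]['count'] ?? 0;
}

function categoryIdsFromFacet(array $facetResult): array
{
    return array_column($facetResult['categories'] ?? [], '_id');
}

// Hypothetical result: one document, one array per facet name.
$facetResult = [
    'totalPosts' => [['count' => 42]],
    'categories' => [['_id' => 1], ['_id' => 2]],
];

$total = totalFromFacet($facetResult);
$categoryIds = categoryIdsFromFacet($facetResult);
```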

Gotcha: `$out` Writes to a Collection

['$out' => 'post_stats']

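Must be the last stage in the pipeline. It creates the target collection, or replaces it entirely if it already exists, with the pipeline output. This is a destructive operation: the previous contents of the target collection are gone.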
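Where replacing the whole collection is too blunt, MongoDB 4.2 and later also offer `$merge`, which upserts into an existing collection instead. This is a swapped-in alternative rather than something the tip above prescribes; the helper name and its flag are hypothetical.

```php
<?php
// Sketch: choose between replacing the target ($out) and upserting into it ($merge).
function statsWriteStage(string $target, bool $replaceCollection = false): array
{
    if ($replaceCollection) {
        // Drops whatever the target held and writes the new output.
        return ['$out' => $target];
    }

    // Updates documents with a matching _id and inserts the rest;
    // documents not produced by this run are left untouched.
    return ['$merge' => [
        'into' => $target,
        'on' => '_id',
        'whenMatched' => 'replace',
        'whenNotMatched' => 'insert',
    ]];
}
```

Either stage has to be the final element of the pipeline array.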

Tip: Embed or Reference? The 80/20 Rule

If you always access data together, embed it. If you access it independently, reference it. The 16 MB document size limit is the hard boundary; as a rule of thumb, stay under 1 MB for most documents.
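To make the two shapes concrete, here is a sketch with hypothetical documents: comments embedded in a post because they are read together, and a category referenced by id because it is queried on its own. The size helper is a rough check only, since JSON length merely approximates BSON size.

```php
<?php
// Rough size estimate; JSON length is only an approximation of BSON size.
function approxSizeBytes(array $doc): int
{
    return strlen((string) json_encode($doc));
}

// Embedded: comments travel with the post in a single read.
$embeddedPost = [
    '_id' => 'post-1',
    'title' => 'Hello',
    'comments' => [
        ['author' => 'ann', 'body' => 'Nice post'],
    ],
];

// Referenced: the post stores only the category id.
$referencedPost = ['_id' => 'post-2', 'title' => 'World', 'category_id' => 'cat-1'];
$category = ['_id' => 'cat-1', 'name' => 'News'];

// True while the document is comfortably below the 1 MB guideline.
$withinGuideline = approxSizeBytes($embeddedPost) < 1024 * 1024;
```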

Tip: Index Your Query Patterns, Not All Fields

Creating indexes on every field wastes RAM and slows down writes. Use explain() to find in-memory sorts and collection scans, then index only what your actual queries filter and sort on.
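In explain output, a collection scan shows up as a `COLLSCAN` stage and an in-memory sort as a `SORT` stage somewhere in the winning plan. A minimal sketch that searches a plan for a stage name, assuming the explain result has been converted to plain arrays; the helper name is hypothetical and the sample plan is a trimmed, invented fragment.

```php
<?php
// Sketch: recursively look for a stage name anywhere in an explain plan.
function planContainsStage(array $plan, string $stage): bool
{
    if (($plan['stage'] ?? null) === $stage) {
        return true;
    }
    foreach ($plan as $value) {
        if (is_array($value) && planContainsStage($value, $stage)) {
            return true;
        }
    }
    return false;
}

// Hypothetical, trimmed explain fragment: a sort fed by a full collection scan.
$explain = [
    'queryPlanner' => [
        'winningPlan' => [
            'stage' => 'SORT',
            'inputStage' => ['stage' => 'COLLSCAN'],
        ],
    ],
];

$needsIndex = planContainsStage($explain, 'COLLSCAN');
```

Searching recursively rather than at a fixed path keeps the check usable when the plan nests stages at different depths.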

Gotcha: No Transaction Rollback for Index Builds

Building an index on a large collection can take hours. Index builds are not transactional: if one fails midway, the partially built index is discarded and the build has to start again from scratch. Plan large index builds during maintenance windows.

Senior Insight

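I've learned to be explicit about MongoDB write concerns rather than rely on defaults. Before MongoDB 5.0 the default was w: 1, which acknowledges a write once the primary has applied it, so a failover can lose acknowledged writes. Since 5.0 the implicit default is w: majority for most replica set configurations, but connection strings, driver options, and cluster-wide settings can override it. For critical data, I use w: majority so that a write is acknowledged only after a majority of data-bearing voting members have it. The trade-off is latency: waiting for majority acknowledgment adds network round-trips. For a logging system where data loss is acceptable, w: 1 is fine. For financial transactions, w: majority is the minimum.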
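One way to be explicit is to set the write concern in the connection string through the standard `w` and `wtimeoutMS` URI options. A minimal sketch; the helper name, the host names, the database names, and the 5000 ms timeout are all hypothetical.

```php
<?php
// Sketch: build a MongoDB URI with an explicit write concern.
function mongoUri(string $hosts, string $database, bool $critical): string
{
    $options = $critical
        // Wait for a majority, but give up after 5 seconds instead of blocking forever.
        ? ['w' => 'majority', 'wtimeoutMS' => 5000]
        // Primary-only acknowledgment for data that may be lost.
        : ['w' => 1];

    // Pin the separator so the output does not depend on php.ini settings.
    return sprintf('mongodb://%s/%s?%s', $hosts, $database, http_build_query($options, '', '&'));
}

$ledgerUri = mongoUri('db1.example.test:27017', 'ledger', true);
$logsUri = mongoUri('localhost:27017', 'logs', false);
```

Keeping the choice in one helper makes the per-workload decision visible in code review rather than buried in a default.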

Source: MongoDB Developer Center (https://www.mongodb.com/developer/), MongoDB Engineering Blog (https://www.mongodb.com/blog/channel/engineering-blog), Studio 3T Blog (https://studio3t.com/blog/)
