MongoDB Map-Reduce: Legacy Aggregation
Tip: Map and Reduce Functions
$map = 'function() { emit(this.category_id, 1); }';
$reduce = 'function(key, values) { return Array.sum(values); }';
Gotcha: Deprecated
Map-reduce has been deprecated since MongoDB 5.0. Use the aggregation pipeline instead.
Tip: Aggregation Replacement
['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]]
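The $group stage above only does something once it is wrapped in a pipeline and passed to aggregate(). A minimal sketch, assuming the official mongodb/mongodb library; the helper names categoryCountPipeline and countByCategory are hypothetical, and countByCategory is defined but never called here because it needs a live server:

```php
<?php
// Build the pipeline that replaces the map-reduce job: one result document
// per category_id, with "count" holding the number of input documents.
function categoryCountPipeline(): array
{
    return [
        ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
    ];
}

// Assumption: $collection is a MongoDB\Collection from mongodb/mongodb.
// Returns an array keyed by category id (cast to string) with its count.
function countByCategory(\MongoDB\Collection $collection): array
{
    $counts = [];
    foreach ($collection->aggregate(categoryCountPipeline()) as $doc) {
        $counts[(string) $doc['_id']] = $doc['count'];
    }
    return $counts;
}
```

Keeping the pipeline in its own function makes it easy to inspect or unit-test without a database connection.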
Gotcha: Map-Reduce Performance
The aggregation pipeline is generally faster and more flexible than map-reduce.
Tip: Output Collection
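// In the official mongodb/mongodb library, map and reduce must be Javascript objects and the output collection is the third argument.
$collection->mapReduce(new MongoDB\BSON\Javascript($map), new MongoDB\BSON\Javascript($reduce), 'category_counts');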
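The aggregation counterpart of writing map-reduce results to a collection is a terminal $out stage. A minimal sketch; the helper name categoryCountsInto is hypothetical, and the target name is taken from the example above:

```php
<?php
// Build a pipeline that counts documents per category_id and writes the
// result set to $target. $out must be the last stage of the pipeline.
function categoryCountsInto(string $target): array
{
    return [
        ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
        ['$out' => $target],
    ];
}

$pipeline = categoryCountsInto('category_counts');
// With a MongoDB\Collection in $collection (assumed, not created here):
// $collection->aggregate($pipeline);
```

Note that $out replaces the contents of the target collection, so point it at a collection reserved for these results.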
Gotcha: JavaScript Execution
Map-reduce executes your map and reduce functions in the server's JavaScript engine. Aggregation pipeline stages run as native server code.
Tip: Embed or Reference? The 80/20 Rule
If you always access data together, embed it. If you access it independently, reference it. The 16MB document size limit is the hard boundary; as a rule of thumb, keep most documents under 1MB.
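The two shapes might look like this as PHP arrays. This is an illustrative sketch only: every field name and value is hypothetical, and approxJsonBytes measures JSON length, which is only a rough stand-in for the real BSON size:

```php
<?php
// Embedded: comments are always read together with the post, so they live
// inside the post document.
$postWithComments = [
    '_id' => 'post-1',
    'title' => 'Hello',
    'comments' => [
        ['author' => 'ana', 'text' => 'Nice post'],
        ['author' => 'ben', 'text' => 'Agreed'],
    ],
];

// Referenced: the author is read and updated on its own, so the post only
// stores the author's _id.
$author = ['_id' => 'user-7', 'name' => 'Ana'];
$postWithAuthorRef = ['_id' => 'post-2', 'title' => 'Hi', 'author_id' => $author['_id']];

// Rough size estimate (JSON bytes, not BSON bytes) to compare against the
// 1MB rule of thumb.
function approxJsonBytes(array $doc): int
{
    return strlen(json_encode($doc));
}

$underGuideline = approxJsonBytes($postWithComments) < 1024 * 1024;
```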
Tip: Index Your Query Patterns, Not All Fields
Creating indexes on every field wastes RAM. Use explain() to find in-memory sorts and collection scans. Index only what your actual queries filter on.
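One way to act on explain() output is to walk the winning plan and flag the stages that signal trouble. A minimal sketch, assuming the classic plan shape where each node has a "stage" name and an optional "inputStage" or "inputStages"; newer servers may nest the plan differently. The function names are hypothetical and the sample plans are hand-written, not real server output:

```php
<?php
// Recursively collect the stage names found in an explain() plan node.
function planStages(array $node): array
{
    $stages = isset($node['stage']) ? [$node['stage']] : [];
    if (isset($node['inputStage']) && is_array($node['inputStage'])) {
        $stages = array_merge($stages, planStages($node['inputStage']));
    }
    foreach ($node['inputStages'] ?? [] as $child) {
        $stages = array_merge($stages, planStages($child));
    }
    return $stages;
}

// Report the two problems the tip mentions: COLLSCAN means no index was
// used to find documents, SORT means the sort was not served by an index.
function planWarnings(array $winningPlan): array
{
    $stages = planStages($winningPlan);
    $warnings = [];
    if (in_array('COLLSCAN', $stages, true)) {
        $warnings[] = 'collection scan';
    }
    if (in_array('SORT', $stages, true)) {
        $warnings[] = 'in-memory sort';
    }
    return $warnings;
}

// Hand-written samples shaped like queryPlanner.winningPlan.
$unindexedPlan = [
    'stage' => 'SORT',
    'inputStage' => ['stage' => 'COLLSCAN', 'direction' => 'forward'],
];
$indexedPlan = [
    'stage' => 'FETCH',
    'inputStage' => ['stage' => 'IXSCAN', 'indexName' => 'category_id_1'],
];
```

A query whose plan produces no warnings is already covered by an index and needs no new one.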
Gotcha: No Transaction Rollback for Index Builds
Building an index on a large collection can take hours. If it fails midway, the partial index is silently discarded. Plan index builds during maintenance windows.
Senior Insight
The aggregation pipeline is MongoDB's equivalent of SQL's complex queries, and it's far more powerful. My most important lesson: always put $match as the first stage, where it can also use an index. An aggregation that filters 10 million documents down to 1,000 should pass only those 1,000 through the remaining stages. I've optimized pipelines by moving $match from position 5 to position 1 and reducing execution time from 30 seconds to 200ms.
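The $match-first rule can be checked mechanically before a pipeline ever reaches the server. A minimal sketch; startsWithMatch is a hypothetical helper, and the status and created_at fields are made up for illustration:

```php
<?php
// Return true when the pipeline's first stage is $match.
function startsWithMatch(array $pipeline): bool
{
    return isset($pipeline[0])
        && is_array($pipeline[0])
        && array_key_first($pipeline[0]) === '$match';
}

// Filter first: only matching documents reach $group and $sort.
$filterFirst = [
    ['$match' => ['status' => 'active']],
    ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
    ['$sort' => ['count' => -1]],
];

// Filter late: the same filter, but written after another stage.
$filterLate = [
    ['$sort' => ['created_at' => -1]],
    ['$match' => ['status' => 'active']],
    ['$group' => ['_id' => '$category_id', 'count' => ['$sum' => 1]]],
];

$firstOk = startsWithMatch($filterFirst); // true
$lateOk = startsWithMatch($filterLate);   // false
```

The server's optimizer can reorder some stage combinations on its own, but writing $match first keeps the intent explicit and does not depend on that.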
Source: MongoDB Developer Center (https://www.mongodb.com/developer/), MongoDB Engineering Blog (https://www.mongodb.com/blog/channel/engineering-blog), Studio 3T Blog (https://studio3t.com/blog/)