MongoDB Data Modeling: Anti-Patterns
Tip: Unbounded Arrays
// Bad: the comments array grows without bound as comments are added
$post->comments = [/* ...10000 items... */];
When an array can grow without bound, store its items as separate documents that reference the parent.
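A minimal sketch of the referenced layout, using plain Python dictionaries in place of real documents. The collection names (posts, comments) and the post_id field are assumptions for illustration, not names from the original.

```python
# Hypothetical document shapes; field and collection names are illustrative.

def make_post(post_id, title):
    """Post document without an embedded comments array."""
    return {"_id": post_id, "title": title}


def make_comment(comment_id, post_id, text):
    """Comment document that points back to its post via post_id."""
    return {"_id": comment_id, "post_id": post_id, "text": text}


def comments_for(comments, post_id, limit=20):
    """In-memory stand-in for a query such as
    comments.find({"post_id": post_id}).limit(limit)."""
    return [c for c in comments if c["post_id"] == post_id][:limit]


post = make_post(1, "Anti-patterns")
comments = [make_comment(i, 1, f"comment {i}") for i in range(10_000)]
page = comments_for(comments, post_id=1)
```

The post document stays the same size no matter how many comments exist, and comments can be read one page at a time. In a real deployment the post_id field would need its own index.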
Gotcha: Giant Documents
A document approaching the 16MB limit usually means the model is wrong. Split the data across multiple documents or collections.
Tip: Over-Normalization
MongoDB is not relational. Too many references defeat the purpose of document storage, because every read then needs joins or extra round trips.
Gotcha: Missing Indexes
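Every frequent query should be backed by an index. Use explain() to confirm that the winning plan uses an index scan (IXSCAN) rather than a collection scan (COLLSCAN).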
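One way to act on explain() output is to walk the winning plan and look at its stage names. The sketch below assumes the classic plan shape, where each stage has a "stage" name and its children sit under "inputStage" or "inputStages". The exact output varies by server version, and the sample plans are hand-written, not captured from a real server.

```python
def plan_stages(plan):
    """Collect stage names from a winning plan, outermost first."""
    stages = []
    if "stage" in plan:
        stages.append(plan["stage"])
    # Child plans appear under inputStage (single) or inputStages (list).
    children = []
    if "inputStage" in plan:
        children.append(plan["inputStage"])
    children.extend(plan.get("inputStages", []))
    for child in children:
        stages.extend(plan_stages(child))
    return stages


def needs_index(plan):
    """True when the plan falls back to scanning the whole collection."""
    return "COLLSCAN" in plan_stages(plan)


# Hand-written samples shaped like explain()["queryPlanner"]["winningPlan"].
indexed_plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "author_id_1"},
}
unindexed_plan = {
    "stage": "SORT",
    "inputStage": {"stage": "COLLSCAN"},
}
```

With PyMongo, a plan of this kind could come from collection.find(query).explain(). A SORT stage in the list indicates a sort performed in memory rather than read from an index.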
Tip: Schema Drift
Without validation, documents in the same collection can drift into different structures. Add schema validation with a $jsonSchema validator on the collection.
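As a rough illustration, a $jsonSchema validator might look like the following. The "posts" collection and all field names are hypothetical, and the missing_required helper is only a local convenience check, not part of MongoDB.

```python
# Hypothetical schema for a "posts" collection; field names are illustrative.
post_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["title", "author_id", "created_at"],
        "properties": {
            "title": {"bsonType": "string", "minLength": 1},
            "author_id": {"bsonType": "objectId"},
            "created_at": {"bsonType": "date"},
            "tags": {"bsonType": "array", "items": {"bsonType": "string"}},
        },
    }
}


def missing_required(document, validator):
    """Local pre-check: list the required fields the document lacks."""
    required = validator["$jsonSchema"]["required"]
    return [field for field in required if field not in document]
```

With PyMongo, the validator could be attached when the collection is created, for example db.create_collection("posts", validator=post_validator). The server then rejects writes that do not match.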
Gotcha: Storing Computed Data
Don't store values that can be derived from fields already in the document. Compute them at read time with an aggregation pipeline instead.
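A small sketch of computing a value at read time. The price and quantity fields are assumptions for illustration, and with_total is a pure-Python equivalent included only to show what the stage produces.

```python
# Hypothetical order line items with price and quantity fields.
line_total_pipeline = [
    # Compute the total at read time instead of storing it on the document.
    {"$addFields": {"total": {"$multiply": ["$price", "$quantity"]}}},
]


def with_total(doc):
    """Pure-Python equivalent of the $addFields stage above."""
    return {**doc, "total": doc["price"] * doc["quantity"]}


item = {"sku": "A1", "price": 5, "quantity": 3}
computed = with_total(item)
```

The stored document never carries a total field, so it cannot go stale when price or quantity changes. With PyMongo, the pipeline would be passed to collection.aggregate(line_total_pipeline).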
Tip: Embed or Reference? The 80/20 Rule
If you always access data together, embed it. If you access it independently, reference it. The 16MB document size limit is the hard boundary; aim to keep most documents under 1MB.
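The rule can be illustrated with two hypothetical shapes: an address that is always read with its user, and orders that are queried on their own. All names here are assumptions for the sake of the example.

```python
# Hypothetical shapes; all field names are illustrative.

# The address is always read together with the user, so it is embedded.
user = {
    "_id": "u1",
    "name": "Ada",
    "address": {"city": "London", "postcode": "N1"},
}

# Orders are queried independently, so each one references the user.
orders = [
    {"_id": "o1", "user_id": "u1", "amount": 40},
    {"_id": "o2", "user_id": "u1", "amount": 15},
    {"_id": "o3", "user_id": "u2", "amount": 99},
]


def orders_for(all_orders, user_id):
    """In-memory stand-in for orders.find({"user_id": user_id})."""
    return [o for o in all_orders if o["user_id"] == user_id]


ada_orders = orders_for(orders, "u1")
```

Reading the user returns the address in the same fetch, while order history is loaded only when a screen actually needs it.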
Tip: Index Your Query Patterns, Not All Fields
Creating indexes on every field wastes RAM and slows writes, because every index must be updated on each insert or update. Use explain() to find in-memory sorts and collection scans, then index only what your actual queries filter and sort on.
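One common source of wasted indexes is a single-field index whose key is already the leading prefix of a compound index. The helper below is a simplified sketch for spotting those. It is a hypothetical function, not a MongoDB API, and it deliberately ignores index options such as unique, sparse, or partial filters, which can make a prefix index worth keeping.

```python
def redundant_prefix_indexes(indexes):
    """Return names of indexes whose keys are a strict prefix of another.

    `indexes` maps an index name to its ordered list of (field, direction).
    """
    redundant = []
    for name, keys in indexes.items():
        for other_name, other_keys in indexes.items():
            if name == other_name or len(keys) >= len(other_keys):
                continue
            if other_keys[:len(keys)] == keys:
                redundant.append(name)
                break
    return redundant


# Hypothetical index definitions for illustration.
indexes = {
    "author_id_1": [("author_id", 1)],
    "author_id_1_created_at_-1": [("author_id", 1), ("created_at", -1)],
    "tags_1": [("tags", 1)],
}
candidates = redundant_prefix_indexes(indexes)
```

Here author_id_1 is flagged because the compound index can already serve queries that filter on author_id alone. Treat the result as a list of candidates to review, not as indexes that are safe to drop.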
Gotcha: No Transaction Rollback for Index Builds
Building an index on a large collection can take hours. If the build fails midway, the partial index is discarded and the time already spent is lost. Plan large index builds for maintenance windows.
Senior Insight
The MongoDB aggregation pipeline is one of the most powerful query engines I've worked with. It takes time to learn, but the expressiveness is remarkable. I teach my team to think of the pipeline as a data processing stream: filter early, transform in the middle, and aggregate at the end. Each stage should do exactly one thing. A well-structured pipeline of 5-7 stages is easier to debug than a single complex query with nested conditions.
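A short sketch of the filter, transform, aggregate ordering described above. The "orders" fields (status, created_at, customer_id, price, quantity) are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Hypothetical cutoff date for the report.
since = datetime(2024, 1, 1, tzinfo=timezone.utc)

pipeline = [
    # Filter early so later stages see fewer documents.
    {"$match": {"status": "paid", "created_at": {"$gte": since}}},
    # Transform in the middle: derive one value per document.
    {"$project": {
        "customer_id": 1,
        "line_total": {"$multiply": ["$price", "$quantity"]},
    }},
    # Aggregate at the end.
    {"$group": {"_id": "$customer_id", "revenue": {"$sum": "$line_total"}}},
    {"$sort": {"revenue": -1}},
]

# Each stage is a single-key dict, so the stage order is easy to inspect.
stage_names = [next(iter(stage)) for stage in pipeline]
```

Because each stage does one thing, a stage can be removed or run in isolation while debugging. With PyMongo, the list would be passed to collection.aggregate(pipeline).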
Source: MongoDB Developer Center (https://www.mongodb.com/developer/), MongoDB Engineering Blog (https://www.mongodb.com/blog/channel/engineering-blog), Studio 3T Blog (https://studio3t.com/blog/)