Learning mongo, part 3: Test Results
If you haven't read part 1 and part 2, suffice it to say that I got Mongo going, loaded some data into it, and was surprised to find that map reduce ran at literally the same speed as doing the work in Python. Then I figured out a few tricks to make things a bit faster.
The Apparatus
My client machine is a new iMac (Core i5), running Python 3.4.
My Mongo server machine is an abused Core 2 T9600 @ 2.8GHz running Mint Linux 17 and mongod 2.4. It features a startling 4GB of RAM and a 500GB disk that I found in a pile. The T9600 gets a score of 1956 from Passmark, not exactly a stellar score for 2015, but plenty good for what I use it for (the i5 scores 6763).
Both systems have the kind of disks with moving parts, and are connected to each other with ancient catX ethernet cables (where X at one time was probably 5, I'm not so sure anymore) through an old consumer-grade gigabit switch, which I often keep under a pile of laundry so it stays warm.
The test methodology
I ran the tests on a database of 1.3 million documents, with an average document size of about 1,400 bytes and 40 fields apiece.
I ran each test ten times, and then I averaged the results.
Results
- **Map Reduce:** the functions from part 1 are loaded and run, once normally and once again with jsMode turned on (invocation sketched below).
- **Python:** a simple list-based stats program; the selected value is pulled from the database, stored in a list, and the computations are run on the list.
- **Python, Accumulators:** stats are computed using accumulator variables, to avoid the list overhead (both Python variants are sketched below).
- **Map Reduce 2:** the same functions as the other map reduce runs, except that they are stored in the database first.
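For concreteness, here's a minimal sketch of the two Python approaches. The host name, database, collection, and the `value` field are placeholders for illustration; the real test harness computes more statistics and times each run.

```python
from pymongo import MongoClient

# Placeholder connection details -- substitute your own server/collection.
coll = MongoClient("mongodb://mongoserver")["testdb"]["docs"]

def list_based(limit):
    # Pull the selected value into a list, then run the computations on the list.
    values = [doc["value"] for doc in coll.find({}, {"value": 1, "_id": 0}).limit(limit)]
    return len(values), sum(values) / len(values)

def accumulator_based(limit):
    # Keep running totals instead of materializing a list.
    count, total = 0, 0.0
    for doc in coll.find({}, {"value": 1, "_id": 0}).limit(limit):
        count += 1
        total += doc["value"]
    return count, total / count
```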
The count shows the number of documents that were processed (via limit).
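The map reduce runs are issued through the mapReduce database command; a minimal pymongo sketch (with toy map/reduce functions standing in for the real ones from part 1) shows where limit and jsMode plug in:

```python
from bson.code import Code
from bson.son import SON
from pymongo import MongoClient

db = MongoClient("mongodb://mongoserver")["testdb"]

# Toy functions for illustration -- the real map/reduce pair is in part 1.
map_f = Code("function () { emit('stats', this.value); }")
reduce_f = Code("function (key, values) { return Array.sum(values); }")

result = db.command(SON([
    ("mapReduce", "docs"),    # collection to run over
    ("map", map_f),
    ("reduce", reduce_f),
    ("out", {"inline": 1}),   # return the results instead of writing a collection
    ("limit", 100000),        # the 'count' column below
    ("jsMode", True),         # keep intermediate results as JS objects
]))
print(result["results"], result.get("timeMillis"))
```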
The numbers in the columns are the average number of documents processed per millisecond (higher numbers are better), followed by the standard deviation, σ (lower is better). The winners are shown in boldface.
| Count | Map Reduce | Map Reduce, jsMode | Python | Python, Accumulators | Map Reduce 2 | Map Reduce 2, jsMode |
|---|---|---|---|---|---|---|
| 10⁰ | 0.08 σ=0.09 | 0.3 σ=0.03 | 1.2 σ=0.07 | **1.5 σ=0.02** | 0.2 σ=0.07 | 0.2 σ=0.08 |
| 10¹ | 0.7 σ=0.8 | 1.8 σ=0.7 | 8.3 σ=0.2 | **13.9 σ=0.8** | 1.9 σ=0.7 | 1.5 σ=0.8 |
| 10² | 6.8 σ=4.2 | 14.0 σ=4.5 | 31.5 σ=0.6 | **42.5 σ=1.9** | 16.9 σ=6.2 | 18.1 σ=6.6 |
| 10³ | 20.8 σ=3.5 | 52.3 σ=10.2 | 46.7 σ=2.5 | 62.7 σ=0.5 | **65.9 σ=29.5** | 61.7 σ=30.8 |
| 10⁴ | 24.74 σ=0.7 | 64.2 σ=6.6 | 50.3 σ=0.6 | 64.7 σ=0.1 | 119.6 σ=10.5 | **128.4 σ=23.3** |
| 10⁵ | 24.72 σ=0.7 | 69.2 σ=1.5 | 50.9 σ=0.7 | 66.0 σ=0.5 | 121.5 σ=5.0 | **123.7 σ=6.2** |
| 10⁶ | 22.5 σ=0.2 | 49.7 σ=4.3 | 37.2 σ=0.8 | 42.6 σ=1.5 | 69.2 σ=1.8 | **71.0 σ=2.2** |
Observations
For some reason, I measured a fairly high standard deviation when running the map reduce functions inside the database. Without knowing what's going on in there, I can only speculate that the variability is caused by something like garbage collection, or some other external factor, such as the server dipping into virtual memory.
As with many things in computing, there is some fixed overhead in doing the calculations (setting up a network socket, seeking to the right place on disk, etc.), so there are efficiencies of scale. Presumably, nobody would bother running Map Reduce on a single document; the data shows this pretty clearly.
The table seems to show that maximum performance is achieved somewhere between 10⁴ and 10⁵ documents. As the number approaches 10⁶ documents, the performance goes down, quite dramatically in some cases. If these numbers are indicative of how Mongo works in general, then aggregations should be limited to smallish batches.
With map reduce (stored functions, jsMode on), processing a million documents in one batch takes about 14 seconds. However, ten batches of one hundred thousand documents should take about 8.08 seconds, nearly twice as fast as a single batch, and a hundred batches of ten thousand documents should take about 7.8 seconds, which may be slightly faster still.
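That's just the counts divided by the measured throughputs (documents per millisecond) in the Map Reduce 2, jsMode column:

```python
# Projected wall-clock time for 10^6 documents at each batch size, using the
# measured throughput (docs/ms) for that batch size from the table above.
rates = [(10**4, 128.4), (10**5, 123.7), (10**6, 71.0)]
total_docs = 10**6

for batch_size, rate in rates:
    batches = total_docs // batch_size
    seconds = batches * (batch_size / rate) / 1000.0
    print(batches, "batches of", batch_size, "-> about", round(seconds, 1), "s")
# 100 batches of 10000 -> about 7.8 s
# 10 batches of 100000 -> about 8.1 s
# 1 batches of 1000000 -> about 14.1 s
```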
Where to do aggregation
If the data is small (fewer than about 10,000 documents), it's probably better to stay in Python. In reality, it's probably better to stay in Python anyway, since logic really doesn't belong in the data store.
If the data you're aggregating is bigger than that, but smaller than 1,000,000 documents, still do it in Python, but break the work into chunks. I ran some quick multithreaded Python tests and was able to churn through 1,000,000 documents in about 8.2 seconds (σ=2.4), faster than any of the other tests. I imagine you'd get similar results by running multiple Map Reduce jobs in parallel, but I haven't measured that.
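For what it's worth, the multithreaded version looked roughly like this (a sketch under the same placeholder names as before; chunking on an indexed _id range would avoid the cost of large skips, but this mirrors the simple skip/limit version):

```python
from concurrent.futures import ThreadPoolExecutor
from pymongo import MongoClient

CHUNK = 100000
CHUNKS = 10   # 10 x 100,000 = 1,000,000 documents

client = MongoClient("mongodb://mongoserver")  # MongoClient is thread-safe
coll = client["testdb"]["docs"]

def chunk_stats(start):
    # Accumulate over one skip/limit window of the collection.
    count, total = 0, 0.0
    for doc in coll.find({}, {"value": 1, "_id": 0}).skip(start).limit(CHUNK):
        count += 1
        total += doc["value"]
    return count, total

with ThreadPoolExecutor(max_workers=CHUNKS) as pool:
    partials = list(pool.map(chunk_stats, range(0, CHUNK * CHUNKS, CHUNK)))

count = sum(c for c, _ in partials)
total = sum(t for _, t in partials)
print("documents:", count, "mean:", total / count)
```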
If you have more than a million documents, I have no advice. Mongo is noticeably slower with 10⁶ documents in the datastore than it was with only 10⁵. With more than a million, I'd expect that it would continue to get slower. Check back in a few months once I've collected more data.
Obvious performance improvements
It's the CPU, stupid
I ran the in-database Map Reduce function in jsMode on my iMac. As mentioned above, the iMac's CPU is roughly 6763/1956 ≈ 3.5 times faster than the Core 2. And, surprisingly, all of my tests ran a bit over three times faster. This tells me that Mongo's performance is very CPU dependent (I had expected to lose some performance to I/O). However, even though the tests ran faster, the map reduce numbers still showed the same dip in performance at the largest document counts.
| Count | Map Reduce 2, jsMode | Python, Accumulators |
|---|---|---|
| 10⁰ | 0.3 σ=0.12 | 3.9 σ=0.2 |
| 10¹ | 1.6 σ=1.7 | 23.6 σ=2.0 |
| 10² | 39.1 σ=2.9 | 119.7 σ=3.7 |
| 10³ | 126.1 σ=45.2 | 168.8 σ=11.7 |
| 10⁴ | 211.9 σ=45.3 | 177.9 σ=12.0 |
| 10⁵ | 229.8 σ=21.7 | 187.6 σ=5.9 |
| 10⁶ | 220.0 σ=10.6 | 190.8 σ=2.0 |
The relatively high standard deviation is likely due to running the client and server on the same system, and/or too many cores in use interfering with my CPU's ability to keep its turbo spun up.
Use a SQL database
Yeah, that may be kind of silly, since this post is about Mongo. However, Postgres (the not-too-fast snail of databases) performs this aggregation over 1,000,000 rows in about 950 ms (on the T9600). That's a bit over 1,000 documents/ms: roughly ten times faster than Mongo on the same server, and about 4.5 times faster than Mongo on the Core i5.
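The Postgres side is just an ordinary aggregate query. A minimal sketch with psycopg2 (table and column names are placeholders for my actual schema):

```python
import psycopg2

# Placeholder connection string -- this points at the same T9600 box.
conn = psycopg2.connect("dbname=testdb host=mongoserver")
with conn:
    with conn.cursor() as cur:
        cur.execute("SELECT count(*), avg(value), stddev(value) FROM docs")
        print(cur.fetchone())
```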
For non-relational data, an SQL database is definitely a bit awkward. But SQL database developers have been learning how to make their products fast and practical for the last forty years, so it's not as though they are useless when it comes to storing data and crunching numbers.
I'm also not the only one who's noticed that Postgres is faster than Mongo. That, and Postgres has that new JSON column.
Maybe my comment above should have said that postgres is the racing snail of databases.

A point for Mongo, though: it inserts much faster than Postgres. As you might imagine, it takes a while to get a million rows of data across the wire, but Postgres took roughly 30% longer than Mongo.
Conclusion
As they say, about 97% of the time, premature optimization is the root of all evil. However, this was an exercise to get a feel for how my database performs under some load. I've only scratched the surface of performance tuning, and I'm still very much getting used to Mongo.
My simple example probably doesn't do Map Reduce justice. I suspect that where it really shines is when you have many millions of documents, lots of collections, several sharded databases, and a big aggregation pipeline.
My needs are pretty modest by comparison, so it may simply be that my data isn't "big data" enough for it to matter.
But in terms of simplicity and utility, it's hard to beat db.collection.insert({whatever: you_want}). Mongo's documentation makes it seem like setting up replication isn't much harder than that either. So while Postgres may be faster in (cheap) computation time, mongo is way faster in (expensive) developer time.
Most importantly, I'm having a bit of fun learning about Mongo. I look forward to stowing it in my programming toolkit.