Let me set the background by saying that I currently (until the end of the week anyway) work for a large tech company. We recently launched a reader app for iPad. On the backend we have a thin layer of PHP, and behind that a lot of processing via C# with Mono. I, along with my brother Jeff, wrote most of the backend (PHP and C#). The C# side is mainly a queuing system driven off of MongoDB.

Our queuing system is different from others in that it supports dependencies. For instance, before one job can complete, its four children have to complete first. This allows us to create jobs that are actually trees of items, all processing in parallel.
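
To make the structure concrete, here's a rough sketch of what a dependency-aware queue item can look like with the MongoDB C# driver. The field and job-type names (`parent_id`, `remaining_children`, `process-chapter`, and so on) are invented for illustration; this isn't our production schema.

```csharp
// Hypothetical sketch of a dependency-aware queue item (not our actual
// schema). Each child points at its parent, and the parent tracks how many
// children still have to finish before it becomes runnable.
using MongoDB.Bson;
using MongoDB.Driver;

var client = new MongoClient("mongodb://localhost:27017");
var queue  = client.GetDatabase("app").GetCollection<BsonDocument>("queue");

var parentId = ObjectId.GenerateNewId();
queue.InsertOne(new BsonDocument
{
    { "_id", parentId },
    { "type", "assemble-book" },          // made-up job type
    { "state", "pending" },
    { "remaining_children", 4 }           // not runnable until this hits 0
});

for (var i = 0; i < 4; i++)
{
    queue.InsertOne(new BsonDocument
    {
        { "type", "process-chapter" },    // made-up job type
        { "state", "pending" },
        { "remaining_children", 0 },      // leaves are runnable immediately
        { "parent_id", parentId }
    });
}
// When a child finishes, its worker decrements the parent's
// remaining_children; once that reaches 0, the parent gets picked up
// like any other pending job.
```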

On a small scale, things went fairly well. We built the entire system out, then tested and iterated on it over a period of a few months. Then came time for production testing. The nice thing about this app was that most of it could be tested with fake users and batch processing. We loaded up a few hundred thousand fake users and went to town. What did we find?

Without a doubt, MongoDB was the biggest bottleneck. What we really needed was a ton of write throughput. What did we do? Shard, of course. The problem was that we needed even distribution on insert, which would give us near-perfect balance for insert/update throughput. From what we found, there’s only one way to do this: give each queue item a randomly assigned “bucket” and shard based on that bucket value. In other words, do your own sharding manually, for the most part.
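
Here's roughly what the bucket trick looks like with the C# driver. It's a sketch with invented names: the collection is assumed to already be sharded on `{ bucket: 1 }` (via the `shardCollection` admin command), and every insert picks a random bucket so writes land evenly across shards.

```csharp
// Sketch of the "random bucket" shard key (names are illustrative).
// The queue collection is pre-sharded on { bucket: 1 }, so spreading the
// bucket values evenly spreads the insert/update load evenly too.
using System;
using MongoDB.Bson;
using MongoDB.Driver;

var client = new MongoClient("mongodb://mongos-router:27017"); // connect through mongos
var queue  = client.GetDatabase("app").GetCollection<BsonDocument>("queue");
var rng    = new Random();

queue.InsertOne(new BsonDocument
{
    { "bucket", rng.Next(0, 1024) },   // randomly assigned "bucket" = manual sharding
    { "type", "process-chapter" },
    { "state", "pending" }
});
```

The catch, and part of why it felt like doing the sharding ourselves, is that any query that doesn't include a bucket value has to be scatter-gathered to every shard.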

This was pretty disappointing. One of the main reasons for going with Mongo is that it’s fast and scales easily. It really wasn’t as painless as everyone led us to believe. If I could do it all over again, I’d say screw dependencies and put everything into Redis; it was the dependencies that required more advanced queries than any key-value system could handle. I’m also convinced a single MySQL instance could have easily handled what four MongoDB shards could barely keep up with…but at this point, that’s just speculation.

So there’s my advice: don’t use MongoDB for evenly-distributed, high-write applications. One of the biggest problems is that there is a global write lock on the database. Yes, the database…not the record, not the collection. You cannot write to MongoDB while another write is happening anywhere. Bad news bears.
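
If you want to watch the contention happen, the server will show it to you: `serverStatus` has a `globalLock` section. A quick sketch with the C# driver (the exact fields in that section vary by MongoDB version):

```csharp
// Peek at the global lock stats via the serverStatus command.
using System;
using MongoDB.Bson;
using MongoDB.Driver;

var admin  = new MongoClient("mongodb://localhost:27017").GetDatabase("admin");
var status = admin.RunCommand<BsonDocument>(new BsonDocument("serverStatus", 1));

// Queued readers/writers, active clients, and other lock stats.
Console.WriteLine(status["globalLock"]);
```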

On a more positive note, for everything BUT the queuing system (which, by the way, we did get working GREAT after throwing enough servers at it), MongoDB has worked flawlessly. The schemaless design has cut development time in half AT LEAST, and replica sets really do work insanely well. After all’s said and done, I would use MongoDB again, but for read-mostly data. Anything that’s high-write, I’d go Redis (with client key-hash sharding, like most memcached clients do) or Riak (which I have zero experience with, but which sounds very promising).
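
For what it's worth, here's roughly what I mean by client key-hash sharding, as a sketch only (it assumes the StackExchange.Redis client, and the hostnames, key names, and hash choice are all made up):

```csharp
// Sketch of client-side key-hash sharding across a few Redis instances,
// in the spirit of how memcached clients do it.
using StackExchange.Redis;

var shards = new[]
{
    ConnectionMultiplexer.Connect("redis-1:6379"),
    ConnectionMultiplexer.Connect("redis-2:6379"),
    ConnectionMultiplexer.Connect("redis-3:6379"),
};

void Enqueue(string jobId)
{
    // A hash of the job id picks the shard; every client must use the same
    // hash (string.GetHashCode is NOT stable across processes, so swap in
    // something like CRC32 for real use).
    var shard = (uint)jobId.GetHashCode() % (uint)shards.Length;
    shards[shard].GetDatabase().ListRightPush($"queue:pending:{shard}", jobId);
}

Enqueue("job:12345");
// Workers pop with ListLeftPop (or a blocking pop) from their shard's list.
```

The important part is that every producer and worker agrees on the hash, the same way memcached clients agree on their key hashing.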

TL;DR: MongoDB is awesome. I recommend it for most usages. We happened to pick one of the few things it’s not good at and ended up wasting a lot of time trying to patch it together. This could have been avoided if we had picked something built for high write throughput, or dropped our application’s “queue dependency” requirement early on. I would like it if MongoDB advertised the global write lock a bit more prominently, because I felt shortchanged when one of their devs mentioned it in passing months after we’d started. I do have a few other projects in the pipeline and plan on using MongoDB for them.

8 comments

  1. Nice post! The only thing I would add is that it is possible to create dependencies while background-processing jobs. Taking your example of having a parent wait until its children complete: you could have each of the child jobs record the fact that it has completed in a Redis list, and then have the parent poll that list to determine whether it can complete. This is really rough around the edges, but something along those lines would work for your use case.
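
    Something like this is what I have in mind (a very rough sketch; the client library, key names, and child count are all placeholders):

    ```csharp
    // Each child pushes its id onto a per-parent "done" list when it
    // finishes; the parent just polls that list until everyone has
    // reported in.
    using StackExchange.Redis;

    var redis = ConnectionMultiplexer.Connect("localhost:6379").GetDatabase();

    // In a child job, on completion:
    redis.ListRightPush("done:parent:42", "child:3");

    // In the parent job, while waiting:
    bool childrenDone = redis.ListLength("done:parent:42") >= 4;   // 4 expected children
    ```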

  2. Santosh, thanks for the response. Thinking about it further, there may be a way to deal with dependencies entirely within Redis. I’d think it would be pretty prone to failure, because a lot of what you’d use queries for in Mongo would translate into key-name hacks and layers of complexity. Either that, or the jobs would have some sort of self-awareness such that after each one completes, it checks whether it is the last one and, if so, queues the parent job to run as a normal item (sketched below). This would work well with Redis lists. So instead of starting off with a tree, the tree builds itself from the bottom up. Could work fairly well.

    This is something I might experiment with in my free time…thanks for the idea!
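
    Roughly what I’m picturing, as an untested sketch (StackExchange.Redis, with made-up key names):

    ```csharp
    // Bottom-up version: the parent's remaining-child count lives in Redis,
    // each child decrements it atomically when it finishes, and whichever
    // child hits zero pushes the parent onto the normal work queue.
    using StackExchange.Redis;

    var redis = ConnectionMultiplexer.Connect("localhost:6379").GetDatabase();

    // When the tree is first enqueued:
    redis.StringSet("remaining:parent:42", 4);            // parent 42 has 4 children

    // In each child, after its work is done:
    if (redis.StringDecrement("remaining:parent:42") == 0)
    {
        // The last child to finish queues the parent as a regular job.
        redis.ListRightPush("queue:pending", "parent:42");
    }
    ```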

  3. Hi Andrew,

    I wonder, why didn’t you go with standard queuing software like ActiveMQ, ZeroMQ, or even gearman? Implementing your own queuing system sounds strange; was there any reason to do so?

  4. sudhir, glad you asked. Most (if not all) stock queuing systems don’t support dependencies, and most are terrible at HA and/or scaling (at least at the time the decision was made). MongoDB promised to solve all of these problems. The idea was that we’d use the same language for the queue that we already had in our backend, and the same datastore we were already going to be scaling for the rest of the app as our queue storage. Also, the actual code that runs the queue is relatively simple…a handful of queries (the core one is sketched at the end of this comment) and some thread pooling. It’s the workers that were complicated ;).

    Given the chance, I would build a custom queuing system again. It’s not particularly complicated…you just have to choose the correct datastore. MongoDB *is not it*. I will, however, take a closer look at Active/Zero/RabbitMQ again and re-evaluate my position on this. A lot can happen in a year.

    Thanks for your comment!
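
    For the curious, the heart of those queries looks something like this (a sketch with invented field names, not our production code):

    ```csharp
    // Atomically claim the next pending job whose dependencies are
    // satisfied. FindOneAndUpdate is atomic, so two workers can't grab
    // the same job.
    using System;
    using MongoDB.Bson;
    using MongoDB.Driver;

    var queue = new MongoClient("mongodb://localhost:27017")
        .GetDatabase("app").GetCollection<BsonDocument>("queue");

    var filter = Builders<BsonDocument>.Filter.Eq("state", "pending") &
                 Builders<BsonDocument>.Filter.Eq("remaining_children", 0);
    var update = Builders<BsonDocument>.Update
        .Set("state", "running")
        .Set("claimed_at", DateTime.UtcNow);

    var job = queue.FindOneAndUpdate(filter, update,
        new FindOneAndUpdateOptions<BsonDocument> { ReturnDocument = ReturnDocument.After });
    ```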

  5. It is good to know about your experience with MongoDB. I’m evaluating the different options available for a simple but very fast queue/message bus, which should also have an option for persistence.
    I read some good articles that even provide source code and highly recommend MongoDB, but your experience is totally different. Given the drawback of a total database lock for a single update, it should be slow; I don’t know why people say it is very fast, which you also acknowledge (just not in the case of a queue).
    By the way, ZeroMQ is also known to be very fast, but I’m not sure whether it has a persistence option.

    Cheers
    Reena

  6. Andrew Lyon said

    Reena, thanks for your comment. MongoDB is a very fast system. It achieves this speed by returning success for an insert/update before that update is actually written into the database (there’s a sketch of the write-concern knob at the end of this comment). So while things appear fast from the app side, it’s internally queuing all of these writes. If you get enough writes at once, the global write lock kicks in and things start slowing down. You can shard to work around this, but sharding on constantly-changing data (i.e. a message queue) just isn’t very feasible in Mongo. It’s a great database, but not for high-write scenarios.

    ZeroMQ is a great library, but it’s not really a queuing system. It’s perfect for building communication channels between threads and between machines. You could easily build a persistent queuing system on top of it, but you could also use one of the systems people have already built.

    I would look at beanstalkd. It’s written in C (fast, efficient) and writes a binary log (persistent). It does not support sharding/replication, but you can shard your queue items from your application/worker side and spread your items out over several instances if needed. It really just works, and works well.

    Other than that, I’ve heard really good things about gearman, although I haven’t personally used it.

    Good luck!
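
    To show the acknowledgement point from earlier in this comment, here’s the write-concern knob in the C# driver (a sketch; the database and collection names are invented):

    ```csharp
    // Write concern controls how much confirmation the client waits for.
    // Unacknowledged writes are what make Mongo feel so fast from the app
    // side; acknowledged writes wait for the server to accept the write.
    using MongoDB.Bson;
    using MongoDB.Driver;

    var db = new MongoClient("mongodb://localhost:27017").GetDatabase("app");

    // Fire-and-forget: returns as soon as the write hits the wire.
    var fast = db.GetCollection<BsonDocument>("queue")
                 .WithWriteConcern(WriteConcern.Unacknowledged);

    // Waits for the server to acknowledge the write (journaling can be
    // required on top of this).
    var safe = db.GetCollection<BsonDocument>("queue")
                 .WithWriteConcern(WriteConcern.Acknowledged);

    fast.InsertOne(new BsonDocument { { "state", "pending" } });
    safe.InsertOne(new BsonDocument { { "state", "pending" } });
    ```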

  7. Chris Kerlin said

    Any improvement with mongo 2.2?

  8. Andrew Lyon said

    Chris: doubtful. Mongo has gone through some significant changes to mitigate the global write lock (in particular, it now yields the lock while waiting on disk) and has made real write-performance gains, but it still can’t handle writes fast enough for any high-write scenario. The global write lock *still exists* in 2.2 and probably won’t be gone for a while. Don’t use MongoDB for high-write applications…