Sunday, March 30, 2014

Mysql to MongoDb, chapter 2: Diving in

[This is the second part in an ongoing series; part 1 is here]

So, after making the call to go mongo and doing a mongo 101 crash course, we started working on two major fronts (with only two developers):
1. DB Layer rewrite - This was pretty much straightforward. We had about 100 functions to rewrite, but a lot of them were simple CRUD functions. We decided to use the native mongo node driver, as I don't like to use frameworks in my code. (There's mongoose, which is a nice ODM layer, but, as I've said, I'd rather use native stuff, unless there's a performance advantage there.)
Major points you need to consider when transforming your code:

- Mysql has auto-increment columns that hand out a unique id on every insert into a table. Mongo has a unique id (_id) per object in the database. If you're not using one, single unique id for each record across your whole mysql database (and you're not), you need some sort of applicative counter solution for inserts in mongo. This is also very useful for returning the insert id of new entities.

- Type checks and conversion: Mongo won't enforce column types for you the way mysql does. Rather than using a framework, I've decided to implement a simple hash table mapping field names to types.

- Logging: Just as you'd print mysql statements to the log, write a function which logs your query objects in mongo native format (like db.users.find({name:"yuval"})). It makes debugging much easier.
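
The three points above can be sketched together for the native Node driver. This is a minimal sketch under assumptions, not our production code: the helper names `nextId`, `coerce` and `formatQuery`, the `FIELD_TYPES` table and the idea of a dedicated counters collection are all made up for illustration. `nextId` only assumes it is handed a collection object exposing the driver's `findOneAndUpdate`; the shape of that call's result differs between driver versions, so both shapes are handled.

```javascript
// Hypothetical helpers: names and collection layout are assumptions.

// 1. Applicative counter: atomically bump a per-table sequence document
//    and use the new value as the insert id.
async function nextId(counters, name) {
  const res = await counters.findOneAndUpdate(
    { _id: name },
    { $inc: { seq: 1 } },
    // Option name used by driver 4+; older drivers used returnOriginal: false.
    { upsert: true, returnDocument: 'after' }
  );
  // Some driver versions wrap the document as { value: doc }.
  const doc = res && res.value !== undefined ? res.value : res;
  return doc.seq;
}

// 2. Type table: field name -> converter, applied before every write.
const FIELD_TYPES = {
  id: Number,
  name: String,
  created: (v) => new Date(v),
};

function coerce(record) {
  const out = {};
  for (const [field, value] of Object.entries(record)) {
    const convert = Object.prototype.hasOwnProperty.call(FIELD_TYPES, field)
      ? FIELD_TYPES[field]
      : null;
    if (!convert) throw new Error('Unknown field: ' + field);
    out[field] = convert(value);
  }
  return out;
}

// 3. Logging: render a query roughly the way you would type it in the shell.
function formatQuery(collection, op, query) {
  return 'db.' + collection + '.' + op + '(' + JSON.stringify(query) + ')';
}
```

A call such as `formatQuery('users', 'find', { name: 'yuval' })` gives a line you can paste into the mongo shell. Values like dates and object ids will not come out in shell syntax through `JSON.stringify`, so a real logger would need to special-case them.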

THIS ONE IS REALLY IMPORTANT:

Do not break mysql support in order to support mongo! Make your fixes in the mysql and mongo db layers, not in your application layer! Make it work, and don't do irreversible things that would break mysql support. Support switching from mysql to mongo with a single configuration flag, so you can compare performance.
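
One way to picture the single configuration flag, as a rough sketch: both db layers expose the same function names, and the application picks one at startup. The `dbBackend` key, the `createDbLayer` helper and the stubbed `getUser` functions are hypothetical; in real code each entry would be the module wrapping the actual mysql or mongo calls.

```javascript
// Hypothetical layout: both backends implement the same interface.
const backends = {
  mysql: { getUser: async (id) => ({ id, source: 'mysql' }) }, // stub
  mongo: { getUser: async (id) => ({ id, source: 'mongo' }) }, // stub
};

// Pick the db layer from one config value; fail loudly on a typo.
function createDbLayer(config) {
  const name = config.dbBackend;
  if (!Object.prototype.hasOwnProperty.call(backends, name)) {
    throw new Error('Unsupported dbBackend: ' + name);
  }
  return backends[name];
}
```

The application layer then only ever calls something like `createDbLayer(config).getUser(42)` and never knows which database answered, which is what makes side-by-side performance comparison cheap.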

2. Data conversion - We used mongify, a neat ruby tool which translates sql databases into mongo. Performance was a bit dodgy for huge tables, so we contributed some code, which also upped the performance by ~20 times.

Important note for people using open source software - Don't just report bugs. You can fix stuff and contribute to the community. 
[Especially if you're using it for commercial purposes]

Some things we've encountered during our conversion process:
- Dry run your conversion process. Dump your mysql, reload into a vanilla server with mongo installed, and do the dry runs from that server. The operational conversion is something you only need to do once, so it's ok to leave things for manual tinkering later!

- You'll see your application is working slower. Don't worry about it. There's a lot of tuning to do.

- Indexing: You need to take good care of this. Use explain({verbose:1}) on your big queries in order to find out why they are slow. Indexing in mongo will solve a lot of your performance problems.

- Large sorts won't work, even with indexes. In our case, it was a sort over a set of more than 130,000 records. Instead of implementing paging, we moved the sorting to the application (works really fast, thank you). We will need to implement paging eventually, because we've just postponed the inevitable...

- Uniqueness: Like mysql, mongo supports unique indexes: pass the unique option when creating the index (ensureIndex with {unique: true}, or createIndex in newer versions). We decided to add indexes manually and not automatically.
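
The uniqueness and sorting points above can be sketched like this. It is an illustration under assumptions rather than what we actually ran: `addUniqueIndex` and `sortByField` are hypothetical helper names, and the index helper just forwards to the driver's `createIndex` on whatever collection object it is given.

```javascript
// Hypothetical helpers for the index and sort points.

// Create a unique ascending index on one field, by hand.
async function addUniqueIndex(collection, field) {
  return collection.createIndex({ [field]: 1 }, { unique: true });
}

// Application-side sort: fetch the rows unsorted, then order them in node.
// Returns a new array and leaves the input untouched.
function sortByField(rows, field) {
  return rows.slice().sort((a, b) => {
    if (a[field] < b[field]) return -1;
    if (a[field] > b[field]) return 1;
    return 0;
  });
}

// Usage idea (not executed here):
//   const rows = await collection.find(query).toArray();
//   const sorted = sortByField(rows, 'name');
```

Sorting in the application trades database work for memory in the node process, so it only postpones paging; it does not replace it.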


The next chapter will deal with more sophisticated tuning methods post conversion. Stay tuned :)


