eHarmony finds MongoDB the perfect match for data store

Online dating site eHarmony has used open source NoSQL database MongoDB for its data store, to speed up delivery of matches between users

Online dating site eHarmony found that open source NoSQL database MongoDB was the perfect match for its data store needs.

The service had around one million registered members in 2001 but now has 44 million, and its machine-learning compatibility matching engine has gained in sophistication. As a result, its PostgreSQL relational data store was no longer the best solution.

Thod Nguyen, chief technology officer at eHarmony, says: “Our compatibility matching model is becoming more and more complex. And, remember, it’s bi-directional. It is a different model to, say, Netflix. You can like a movie but it doesn’t have to like you back.”
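The difference Nguyen draws between one-way and bi-directional matching can be sketched in a few lines of Python. This is only an illustration, not eHarmony's proprietary model: the `likes` data and the `one_way` and `mutual` helpers are hypothetical names invented for the example.

```python
# Hypothetical preference data: the set of people each user would accept.
likes = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"dave"},
    "dave": {"carol"},
}


def one_way(user, candidate):
    """Recommendation-style check: only the user's own preference counts."""
    return candidate in likes.get(user, set())


def mutual(user, candidate):
    """Bi-directional check: each side must accept the other."""
    return one_way(user, candidate) and one_way(candidate, user)


print(one_way("alice", "carol"))  # alice accepts carol
print(mutual("alice", "carol"))   # carol does not accept alice back
print(mutual("alice", "bob"))     # both accept each other
```

A film recommendation only needs the first check. A dating match needs the second, which is part of why the matching workload grows with both the number of users and the number of profile attributes.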

He says that 5% of all US marriages since 2005 began at the eHarmony website, which processes a billion matches each day. The machine-learning technology that has been processing user profiles for a decade is proprietary.

Using MongoDB for the data store means processing the entire user pool can take place within 12 hours, a task that previously took 15 days.

“But matching is just one aspect of the site,” says Nguyen. “There are user engagement activities, too,” which have become richer with a new website, he says.

Nguyen joined the Santa Monica-based company months ago, with a background that includes time at MyLife and digital marketing platform provider Zurock, and experience of putting NoSQL technologies into production.

He and his 60-strong team have been confronting a “dramatic increase in traffic”, along with the increasing complexity of the user profile matching model.

“In this particular case MongoDB is the best NoSQL solution for the problem we were trying to address, in terms of scalability and performance,” he says.

“The data store of the user pool used to be based on PostgreSQL – centralised and not distributed. It was hard to scale as the data grew and as the number of attributes in the profiles increased.

“You have to deliver your matches near real-time. If you processed our entire user pool it took months to generate matches, especially those high-quality matches. So, in 2012 we started to rethink how we architected the system, with the data store as a key component of that.”

eHarmony examined HDFS [Hadoop Distributed File System], Oracle’s MySQL, the Voldemort data store, and Cassandra.

“MongoDB was good at scalability and it has great built-in sharding and replication, making it good at running complex queries,” says Nguyen.

“It also has a flexible and dynamic schema. With the SQL system you had to do a full data migration if you wanted to add an attribute to a profile. With tens of terabytes of data in production that is very difficult. With the new system we just add more nodes to the cluster.

“It’s the most optimal solution for this type of complex problem [the data store element of the architecture].”
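The point about a flexible schema can be illustrated with a small Python sketch. It uses plain dictionaries as a stand-in for documents and makes no calls to MongoDB itself. The profile fields and the `languages_of` helper are hypothetical, chosen only to show that a new attribute can appear on some records without rewriting the rest.

```python
# Plain dicts stand in for stored documents; this is not the MongoDB API.
profiles = [
    {"_id": 1, "name": "alice", "age": 34},
    {"_id": 2, "name": "bob", "age": 36},
]

# A new attribute ("languages") arrives with a new profile.
# Existing profiles are left untouched: no migration step is needed.
profiles.append({"_id": 3, "name": "carol", "age": 31, "languages": ["en", "fr"]})


def languages_of(profile):
    """Read the optional attribute, treating older profiles as having none."""
    return profile.get("languages", [])


for profile in profiles:
    print(profile["name"], languages_of(profile))
```

In a fixed relational schema the same change would normally mean altering the table that holds every profile, which is the full data migration Nguyen refers to. Here the reader of the data simply tolerates the attribute being absent.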

He advises others to follow the approach of starting from “the problem to be solved, not the technology as such”.

“Go through many different solutions, SQL and NoSQL,” he says. “Look at open source. Be open-minded about it. There is a lot of open source that is addressing similar problems, but you need to find the right one for you and your problem set.”

He describes himself as a “great proponent of open source”, but counsels that, “Community support is important. There is a real difference between proof of concept and an enterprise production environment. Often you don’t see problems in the development and test stage, you see them more in production. And for that you need a lot of professional support.

“MongoDB is good in that respect – there is good community support, but also professional support through 10gen.

“And it is also important to give back to the community. We have done that – with the Seeking query library contributed to GitHub.”