Video: eHarmony Lead Engineer Offers An Inside Look At Big Data & Dating

Most folks are probably content to hit Like buttons and swipe right without knowing how any of it actually works. But if you're like me, you can't help being curious about what goes on behind the scenes.

For those of us in the second category, there's this recent video from David Gevorkyan, principal software engineer at eHarmony. In the hour-long talk, Gevorkyan describes how eHarmony creates the highly compatible matches it's known for, and how the company uses Big Data technologies to do it.

In his words:

I discuss how we take a billion+ potential matches that we find through MongoDB, store them in a Voldemort NoSQL datastore, and then run multiple Hadoop jobs to come up with a filtered list based on Machine Learned models. Our Hadoop clusters are in-house, high density, low power Seamicro installations, and we use Spring Batch and Spring Data Hadoop to orchestrate the Hadoop jobs.

Yeah, I didn't get any of that either.

Thankfully, there's an accompanying SlideShare presentation that makes all that fancy-pants technical jargon easier to digest.

What sets eHarmony apart from other dating services, Gevorkyan explains, is its Compatibility Matching System, which has three components:

  • One that matches using a member's personality and psychological profile
  • One that matches using historical data gathered over eHarmony's 15 years in business
  • And one that matches by ensuring the right match is delivered at the right time to as many people as possible

These are called Compatibility Matching, Affinity Matching, and Match Distribution, respectively. Currently, eHarmony asks about 150 questions to build a user's profile, which gives the site a detailed view of that user's personality, values, attributes, and beliefs. Once the profile is built, it is compared against the profiles of potential date candidates and assessed for compatibility.
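If it helps to see the three stages as code, here's a toy Python sketch. To be clear, this is not eHarmony's actual system: the `Member` class, the 1-5 answer scale, the distance-based affinity formula, the 0.7 threshold, and the per-day cap are all made up for illustration. It only shows the general shape of a pipeline that filters on profile similarity, scores what survives, and limits how many matches get delivered.

```python
from dataclasses import dataclass
from math import exp


@dataclass
class Member:
    name: str
    answers: list[int]   # questionnaire answers on a 1-5 scale (hypothetical encoding)
    distance_km: float   # distance from the person being matched (hypothetical feature)


def compatibility(a: list[int], b: list[int]) -> float:
    """Stage 1: profile similarity in [0, 1]; 1.0 means identical answers."""
    max_gap = 4 * len(a)  # largest possible total gap on a 1-5 scale
    gap = sum(abs(x - y) for x, y in zip(a, b))
    return 1 - gap / max_gap


def affinity(distance_km: float) -> float:
    """Stage 2: toy stand-in for a learned model; closer members score higher."""
    return 1 / (1 + exp((distance_km - 50) / 20))


def distribute(scored: list[tuple[str, float]], per_day: int) -> list[str]:
    """Stage 3: crude stand-in for match distribution; keep only the top few."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:per_day]]


def match(me: list[int], candidates: list[Member],
          min_compat: float = 0.7, per_day: int = 2) -> list[str]:
    scored = []
    for c in candidates:
        compat = compatibility(me, c.answers)
        if compat < min_compat:
            continue  # incompatible profiles never reach the later stages
        scored.append((c.name, compat * affinity(c.distance_km)))
    return distribute(scored, per_day)


candidates = [
    Member("Alex", [5, 4, 3, 2], 10),
    Member("Blake", [1, 1, 5, 5], 5),
    Member("Casey", [5, 4, 4, 2], 200),
    Member("Drew", [4, 4, 3, 3], 30),
]
print(match([5, 4, 3, 2], candidates))  # ['Alex', 'Drew']
```

Blake lives close by but gets dropped in the first stage for answering too differently, and Casey is a near-perfect profile match who scores low because of the distance. The real system presumably weighs far more signals than this, at a vastly larger scale.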

As you probably suspect, eHarmony has learned quite a few interesting things over the years, including how distance, height difference, and the use of certain words in profiles affect the probability of communication. They're clearly making good use of that data, because eHarmony has been responsible for more than 600,000 marriages in its lifetime. That's 438 per day, which accounts for 5% of all US marriages.

For a breakdown of how it all works, check out the video above. There's a significant amount of techie talk, but it's interesting even if you're not a data nerd or a computer science major.