7 Things Data Analytics Can Learn from Online Dating
What’s the secret to their success?
Dating based on big data is behind many long-lasting relationships of the 21st century. Online dating companies apply big data analytics to everything they collect about users and what they are looking for in a relationship, drawing on in-depth questionnaires as well as other data such as website habits and social media activity.
What Can We Learn from Online Dating Sites?
Unlike product and content companies, online dating sites face a bigger challenge: a connection involves two parties instead of one. Recommending a product only requires predicting that the user will like it; matching people requires predicting mutual attraction, which makes the analytics significantly more complicated. The data scientists at dating sites work hard to find techniques and algorithms that predict a mutual match. That is, Person A is a potential match for Person B only if there is also a high probability that Person B is interested in Person A.
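The idea of a mutual match can be illustrated with a minimal sketch. It assumes a site already has one-directional estimates of interest (the probability that one person likes another) and treats the two directions as independent, so the mutual score is simply their product. The names `interest`, `mutual_score`, and `rank_mutual_matches`, and all the numbers, are hypothetical and are not taken from any dating site's actual system.

```python
from itertools import combinations


def mutual_score(p_a_likes_b: float, p_b_likes_a: float) -> float:
    """Probability that both sides are interested, assuming independence."""
    return p_a_likes_b * p_b_likes_a


# Hypothetical one-directional estimates: interest[x][y] = P(x likes y).
interest = {
    "A": {"B": 0.9, "C": 0.6},
    "B": {"A": 0.2, "C": 0.7},
    "C": {"A": 0.5, "B": 0.8},
}


def rank_mutual_matches(interest):
    """Score every unordered pair and sort by mutual score, best first."""
    pairs = []
    for x, y in combinations(sorted(interest), 2):
        pairs.append(((x, y), mutual_score(interest[x][y], interest[y][x])))
    return sorted(pairs, key=lambda item: item[1], reverse=True)


ranked = rank_mutual_matches(interest)
for pair, score in ranked:
    print(pair, round(score, 2))
```

Note that A is very interested in B (0.9), yet the A and B pair ranks last because B's interest in A is low. A one-sided recommender would miss that.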
To conquer this challenge, dating sites employ a multitude of strategies around data. Below are the 7 key takeaways we can learn from them.
1. Use the Right Tool for the Job
The compatibility matching system of eHarmony was originally built on an RDBMS, but the matching algorithm took more than 2 weeks to execute. eHarmony now employs a more modern suite of data tools. By switching to MongoDB, they reduced the run time of the compatibility matching algorithm by 95%, to less than 12 hours. Big data and machine learning processes analyze a billion prospective matches a day. Tools like IBM’s PureData System allow eHarmony to analyze patterns in petabytes of data and help them to complete approximately 3.5 million matches every day.
Many dating sites have learned how to manage large data sets from Google, and deliver quick results using indexing and distributed processing. Google Search works quickly, but hardly anyone considers the number of Google bots crawling through the web to generate dynamic results in real time. Google Search results are generated in milliseconds, and are the outcome of the distributed processing of big data. Google Search keeps an index of words instead of searching through webpages directly, as it is faster to look a word up in the index than to scan every page. Google also uses the MapReduce model, which the open-source Hadoop framework implements, to spread the work across huge numbers of servers and integrate the results into an index.
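The index-of-words idea can be sketched in a few lines. The example below is a toy inverted index over made-up profile snippets, not a description of how Google or any dating site actually implements search; `build_index`, `search`, and the sample `docs` are hypothetical names and data.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical profile snippets standing in for web pages.
docs = {
    1: "loves hiking and classical music",
    2: "enjoys classical music and cooking",
    3: "hiking and cooking on weekends",
}

index = build_index(docs)
hits = search(index, "classical music")
print(sorted(hits))
```

Each query touches only the index entries for its words, so the cost depends on how many documents contain those words rather than on the total amount of text stored.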
Match.com is powered by the Synapse algorithm. Synapse learns about its users in ways similar to how sites like Amazon, Netflix, and Pandora recommend new products, movies, or songs based on a user’s preferences. The Synapse algorithm is reportedly based on the stable marriage problem, which the Gale–Shapley algorithm solves. The same family of matching algorithms is used every day in other industries for things like content recommendations, high-volume financial trading, ad placements, and web rankings on sites like Twitter, Reddit, and Google.
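For readers unfamiliar with the stable marriage problem, here is a minimal sketch of the textbook Gale–Shapley algorithm. It is not Match.com's proprietary Synapse code. It assumes two groups of equal size in which everyone ranks every member of the other group, and the names `gale_shapley`, `proposers`, and `receivers` are illustrative only.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as a dict of proposer -> receiver.

    Assumes both sides have the same size and complete preference lists.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better).
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver to propose to
    engaged_to = {}                               # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p        # receiver was free: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p        # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)           # rejected: will try the next choice

    return {p: r for r, p in engaged_to.items()}


# Hypothetical preference lists, most preferred first.
proposers = {"A": ["X", "Y", "Z"], "B": ["Y", "X", "Z"], "C": ["X", "Z", "Y"]}
receivers = {"X": ["B", "A", "C"], "Y": ["A", "B", "C"], "Z": ["A", "B", "C"]}

matching = gale_shapley(proposers, receivers)
print(matching)
```

The result is stable in the sense that no two people would both prefer each other over the partners they were assigned. In this example C prefers X to Z, but X prefers its assigned partner A to C, so the pair has no reason to break away.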
2. Employ Different Strategies to Gather Data
To gather data about their users, online dating companies provide questionnaires of up to 400 questions. To improve their chances of a successful match, users answer questions on topics ranging from hypothetical situations and political views to taste preferences.
Match.com and eHarmony both use their own proprietary questionnaires that aim to dig deep into who you are and what you may like in a partner. At more than a hundred questions each, and taking hours to complete, they demand a lot of effort, but the answers become the data that lets the site build up as much information on you as it can before plugging you into its matching algorithms.
In addition to user surveys, dating sites also analyze the behavior of users on their websites, usually based on the kinds of profiles they visit. Online dating data is also collected from social media platforms, credit rating agencies, online shopping history, and various online behaviors like media consumption.
Matchmaking algorithms themselves change, too, resulting in different pools of potential matches based on whether people arrived on the site via a mobile device, online, or after watching a television ad.
3. Account for Accuracy of Data
The challenge in predictive modeling for dating sites lies in deciding which self-reported data is reliable enough to use in the prediction models.
People tend to lie (or exaggerate) about age, body type, height, education, interests, and so on. For example, females tend to lie about their weight, age, and build, while males tend to lie about their height, income, and age. Sites often respond by excluding certain variables or by taking a multi-dimensional scoring approach that assigns different weights to different variables. Inaccurate data also arises when people believe they are more appealing if they list a love of classical music, a claim whose accuracy can be better checked by analyzing their Spotify playlists or iTunes history.
Analytics of Facebook profiles or online shopping pages are also much more helpful in predicting human behavior, because they reflect what users actually do rather than what they fill out in a questionnaire.
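The weighted scoring approach can be sketched as follows. This is an illustration under stated assumptions, not any site's real model: the field names, the weights in `FIELD_WEIGHTS`, and the per-field similarity values are all hypothetical. A weight of zero excludes a variable entirely, while fields assumed to come from observed behavior get more weight than self-reported ones.

```python
# Hypothetical trust weights per profile field (0 means the field is excluded).
FIELD_WEIGHTS = {
    "age": 0.5,          # self-reported, so down-weighted
    "height": 0.5,       # self-reported, so down-weighted
    "income": 0.0,       # excluded from scoring
    "music_taste": 1.0,  # assumed to come from observed listening behavior
}


def weighted_score(similarities, weights):
    """Weighted average of per-field similarity values in [0, 1]."""
    used = {f: w for f, w in weights.items() if w > 0 and f in similarities}
    total = sum(used.values())
    if total == 0:
        return 0.0
    return sum(w * similarities[f] for f, w in used.items()) / total


# Hypothetical per-field similarity between two users (1.0 = identical).
similarities = {"age": 0.8, "height": 0.6, "income": 0.1, "music_taste": 0.9}

score = weighted_score(similarities, FIELD_WEIGHTS)
print(round(score, 2))
```

Because income carries a weight of zero, the poor income similarity of 0.1 has no effect on the result. Changing a weight is all it takes to trust a field more or less.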
4. Design Thinking to Augment the Data
Design thinking involves understanding your customers as people who need your help and creating empathy around their needs, hopes, fears—and the root cause of the challenges they face when dealing with a particular problem. Jason Chunk, Vice President of eHarmony, has been quoted as saying: “From the data, you can tell who is more introverted, who is likely to be an initiator, and we can also see if we give people matches at certain times of the day, they would be more likely to make communication with their matches.”
But there are societal and cultural considerations beyond what the data is telling them. For instance, eHarmony’s international service knows that its members outside the U.S. are more comfortable than its U.S. members with being matched with someone who smokes cigarettes or drinks alcohol, so it broadened its matching algorithms for overseas users.
5. Leverage Analytics to Make Smart Business Decisions
eHarmony spends around $80 million a year on marketing, which is down from the $100 million it spent before it invested in its own custom attribution measurement system. This system uses data to help the company spend more efficiently. eHarmony built its own attribution system in-house, evaluating 125 terabytes of data to optimize things like media buys, how it communicates with people when the ads convert, and even what types of matches they see.
6. Know Your Customer
Despite having petabytes of data about their customers, most dating sites also ensure they really know their customers, and make an effort to ask them what types of activities they're doing, how they're dating, who they're dating, where they're going, and what they're doing while out on dates. Leaders from dating sites always ensure they are “in touch with the customer.”
7. Know Your Competition
Know your competition and make sure you're always using your competitors' products. Most industry leaders don’t do enough of this, assuming they're the trendsetters doing the innovating. But Sam Yagan, CEO of The Match Group, is quoted as saying, “I use our competitor's product as much as we use our own. I have all of our competitors' apps on my phone.”
As vital as it is to know your competition, it is equally important to know the technical landscape. Stay aware of the applications people are using in other verticals and industries. Examples include the use of location-based services and video.
Data is a key resource in this century. It has a number of valuable uses that promise to dramatically impact the business landscape, but to be employed effectively it needs to be carefully leveraged. We can learn from industries like online dating that have put data at the center of their business processes to create an innovative and effective product.