Tuesday, April 10, 2012

What's Missing in Data Science Talks

On January 28th, 2008, the $169M sale of the Israeli company FraudSciences to eBay's payments division, PayPal, was publicly announced. I was part of the 65-person crew and head of the analytics group at the time. FraudSciences became PayPal's Israeli R&D center and is still a thriving team, spanning more than 100 people and providing great value to the company. Our story has even been mentioned in Start-Up Nation, in an inspired-by-a-true-story style dramatization of events.

The sale and its ramifications are not what I want to talk about, though; what I do want to talk about is the series of events that led to that sale, and more specifically the test that PayPal ran us through. You see, PayPal had to see whether our preposterous claims about how good our algorithms were held up, so they threw a good chunk of transactions at us to analyze and send back with our suggested decisions. Long story short, our results showed an upside of up to 17% over PayPal's own algorithms at the time, and the rest is history.

How did we do that, then? We must have had a ton of data. We must have used algorithm X or technique Y. We must have been masters of Hadoop. Wait - no. This was 2007. Nothing of the sort. Everything took forever. To get to these results we didn't even use the two famous patents FraudSciences viewed as huge assets, since they required some sort of real-time interaction with the buyer. What we did have were roughly 40,000 (indeed) well-tagged purchases, good segmentation, and great engineered features, all geared at very well-defined user behaviors. What we had, plain and simple, was strong domain expertise.
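To make that concrete, here is a minimal sketch of what "good segmentation plus hand-engineered behavioral features" can look like in code. Everything below - the feature names, the thresholds, the synthetic data - is illustrative on my part, not FraudSciences' actual pipeline:

```python
# A minimal sketch: hand-engineered behavioral features plus per-segment models.
# Feature names, thresholds, and data are all made up for illustration.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 40_000  # roughly the size of the tagged set mentioned above

# Synthetic stand-in for a well-tagged purchase history.
df = pd.DataFrame({
    "amount": rng.lognormal(4.0, 1.0, n),
    "account_age_days": rng.integers(1, 2000, n),
    "hour_of_day": rng.integers(0, 24, n),
    "ship_country": rng.choice(["US", "DE", "GB"], n),
    "bill_country": rng.choice(["US", "DE", "GB"], n),
    "segment": rng.choice(["digital_goods", "physical_goods"], n),
    "is_fraud": rng.random(n) < 0.02,  # the tagged outcome
})

# Features that encode specific, well-understood behaviors,
# rather than raw columns thrown at an algorithm.
df["amount_per_account_day"] = df["amount"] / (df["account_age_days"] + 1)
df["country_mismatch"] = (df["ship_country"] != df["bill_country"]).astype(int)
df["night_purchase"] = df["hour_of_day"].isin(range(1, 6)).astype(int)

features = ["amount_per_account_day", "country_mismatch", "night_purchase"]

# Segmentation: one simple model per well-defined slice of the business.
models = {}
for seg, part in df.groupby("segment"):
    model = LogisticRegression(class_weight="balanced")
    model.fit(part[features], part["is_fraud"])
    models[seg] = model
```

The point isn't the classifier - it's that every column fed into it encodes a specific behavior someone understood first.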

Domain expertise, or the lack thereof, is exactly my issue with the talk about Data Science today. Here's an example: a friend of mine, a strong domain expert, was recently rejected by a pretty nascent startup filled with very smart engineers because they didn't really know where to place his non-developer profile on their team. Were they wrong not to hire him? Maybe, maybe not. I can't judge. Were they wrong to make the decision based on coding skills? Most definitely. It's a very common passion for data and ML geeks such as ourselves to embark on the (in my opinion) hubris-driven task of building an artificial intelligence that will solve all problems, the Generic SkyNet. We neglect to admit the need for specific knowledge. That is when discussions of the volume and structure of data sets replace a keen understanding of what people are trying to achieve - when complex tools replace user research. Unsurprisingly, these attempts either fail or scale down to tackling one domain at a time. They can still take over the world - just with a different strategy.

When I read people on Kaggle, in itself an amazing website and community, list the tools they threw at a dataset instead of how they led with a pure analysis of patterns and indicators, I cringe a little. This is a craft fueled by excess - in space, in memory, in computing power, even in data. While often highly useful, that excess almost as often makes us miss the heuristic sitting right in front of our eyes. I think that analysis and Data Science need to incorporate this realization as well in order to become a real expertise.
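Here is a toy sketch of what I mean by checking the heuristic first (entirely made-up data and rule, not a real Kaggle example): before escalating to heavier machinery, measure what one obvious rule already catches.

```python
# Toy illustration: how much does a single obvious heuristic already buy you?
# Data and the mismatch rule are fabricated for the example.
import numpy as np
from sklearn.metrics import precision_score, recall_score

rng = np.random.default_rng(1)
n = 10_000
country_mismatch = rng.random(n) < 0.05
# Fraud is rare, but far more likely when the mismatch flag fires.
is_fraud = rng.random(n) < np.where(country_mismatch, 0.30, 0.01)

flag = country_mismatch.astype(int)
truth = is_fraud.astype(int)
print("precision:", precision_score(truth, flag))
print("recall:   ", recall_score(truth, flag))
```

If the one-liner gets you most of the way there, that tells you something about where the remaining effort should go.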

Fraud detection and prevention and credit issuance, the stuff we deal with on a daily basis at Klarna, are areas where this is an obvious issue. High fragmentation across geographies, payment instruments, and products creates smaller training and validation sets than you'd ideally want. The need to wait for a default or a chargeback limits the time between iterations. Bad examples are scarce compared to other types of classification problems. Operational issues and fraudsters' strong incentives to hide (as well as abuse, or "friendly" fraud) produce "dirty" performance flags. And still we run a shop that makes accurate decisions with a number of instances per segment that most Data Science teams would frown upon. How is that? The same way FraudSciences gave PayPal's algorithms a run for their money - we use domain expertise to distill features that capture interactions in a way that automated feature engineering methods will find hard to imitate. We use bottom-up analysis of behavioral patterns. We add a sprinkle of behavioral economics (but building a purchase flow is a completely different story).
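As a toy illustration (my own sketch, not Klarna's actual method) of one way to keep per-segment decisions sane when some segments only have a handful of observations: shrink each segment's observed bad rate toward a global prior, so thin segments don't swing wildly on two or three chargebacks.

```python
# Toy example: empirical-Bayes-style smoothing of per-segment bad rates,
# so segments with very few observations lean on the overall prior.
# Segment names and counts are invented for illustration.
import pandas as pd

observed = pd.DataFrame({
    "segment": ["DE_invoice", "SE_card", "NO_invoice"],
    "orders": [120, 45_000, 30],
    "bad": [3, 400, 2],  # chargebacks / defaults observed so far
})

prior_rate = observed["bad"].sum() / observed["orders"].sum()
prior_strength = 200  # pseudo-observations; a domain-informed choice

observed["smoothed_bad_rate"] = (
    (observed["bad"] + prior_strength * prior_rate)
    / (observed["orders"] + prior_strength)
)
print(observed)
```

The prior strength here is exactly the kind of number a domain expert, rather than a grid search, tends to set.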

This aspect of what we do is available to any Data Scientist out there - I've written extensively about finding domain experts. They're around you. Use them - and don't get hooked on the big guns just because they're there*.

*Well, only if you want to get better results quicker and are acting under market and product constraints. If you're a contributor to an open source project - carry on with your great work!