This post is the first in a series I will be exchanging with Allison Miller, one of my esteemed colleagues in PayPal's Risk organization, in her reinstated blog.
“Man may be defined as the animal that can say ‘I,’ that can be aware of himself as a separate entity.” (Erich Fromm)
Over the last decade, the New Age movement has called on each of us to find our own “true identity” through introspection. Supported by modern psychology, this journey of identity constantly drives us to define, consolidate and present our personalities through titles that illuminate various aspects of our day-to-day behavior, as parts of the healthy, consistent and coherent identity that is who we are.
The Web is no different. With the rising popularity of social networks, real-time communications, blogging and free email, our daily communication and personal information can now be found on the web. The path to definition, presentation and consolidation of identities was short: Google lets you create your own Google profile and ReputationDefender helps you defend it; social network aggregators like 8hands help you consolidate all your social activities and OpenID helps you consolidate your authentication; and a plethora of “web 2.0” marketing firms will help you brand yourself in a way suitable for GenY, GenZ or whatever Gen is running around out there at the time.
Not only that: as technology and identity representation on the web have evolved, online services have pushed us to become more “public”, to share more information with “Everyone”, tweet our thoughts to a massive crowd of followers and broadcast our preferences, beliefs and orientation whenever we see fit. This drive has an obvious financial value: the more information we share, the easier it is to segment who we are, what we want, and how we behave in purchasing and other activities, most if not all of them conducted online. Knowing someone’s identity well enough to put them in a specific category has become highly valuable not only in marketing but also in risk management and other areas that use business intelligence to make informed decisions. The more relevant data you have, the better you can predict the behaviors you want to encourage or discourage, depending on your line of business and preferences.
The practice of identity management is a vital part of risk management: not only does it provide the essential trust in payment systems, when the users (both buyer and seller) are properly “known”, vetted and engaged, but it is also key to expanding your business without losing your assets to fraud or default. Identity management, as Allison will probably discuss in her post, is what happens in risk management beyond and after access control. It is not enough to determine infosec best practices for password strength; we need to be able to deal with bad, invented, stolen or just compromised identities and accounts. Developing the ability to manage identities requires working through three big challenges: authentication (proving that a user is really who he claims to be), fuzzy identities (the challenge of probabilistic consolidation of identity pieces) and the initial encounter (when you don’t know anything about the user). The last two deserve a post of their own; today I’ll focus on the first.
And so, people provide information out of good will, even connecting the dots of their online identities for you using aggregation services or unified credentials like OpenID. You get all the information you need, and even get educated users who really want to invite their friends to your service and provide you with more information about who they and their friends are. Perfect! Or is it? Reading Dr. Rohit Khosla’s great article about the “privacy theatre” makes it pretty obvious that it is not. The “privacy theatre”, the way I read it, describes the fact that social networks and services provide privacy controls for users, creating the false notion that user information is protected per their own privacy definitions, where in reality not only does the network itself seek to reopen its APIs and expose more and more user information, but it is also careless in simply not protecting that information from hacking. The bottom line is mass compromise of credentials: RockYou is just one example; the Heartland breach indictment earlier this year is another; through 2009 the ITRC reported 492 cases of data breaches, from hospitals to government offices. But it is more than just breaches that erode the value of credentials. User information theft and compromise is now a lot easier: the abundance of information broadcast out there, the use of email addresses as usernames (PayPal is also a part of this misdoing) and users’ tendency to share credentials with pretty much anyone (resulting from our education effort) serve the ill-willed as much as they served the needs of legitimate social networks. It’s clear that credentials cannot serve as your “identity” anymore, to the extent that they may be useless in most cases.
Hence, the question of authentication doesn’t revolve around the amount of data you have; rather, it revolves around whether you can use that data to verify that the user currently accessing your site is who he claims to be. If credentials are broadly compromised, and the personal information usually used for KBA (knowledge-based authentication) can be screen-scraped from your profile page, authentication becomes a much harder task. In some real-world brick-and-mortar cases, this question is pretty straightforward: it’s considered very unlikely for a fraudster to have the same face as their fraud victim (given that the data source itself, i.e. the ID, is not forged). On the web we are dealing with electronic entities claiming ownership of actual financial instruments, which is a whole new ball game.
When there are almost no secrets left
How do you deal with such a grim scenario? How do you differentiate between compromised and non-compromised credentials when the user provides them to you? You just don’t. You treat all credentials as compromised and carry on. Risk management is different in essence from other types of business practices because you are dealing with a team that is set on undermining your every move (assuming that there is an “international organization of fraudsters” is always good practice). We know that because every time the industry demands a new “secret” from legitimate users, fraudsters start phishing for it, be it CVV2, mother’s maiden name, SSN or others. Why would ANY old or new secret we give the user be any different? Credentials and most KBA cannot reliably serve as “something that the user knows” for N-factor authentication (also see Bruce Schneier’s 2005 article). So what DO we do? We do four things:
a. Riskiness estimations based on ownership factors (something the user has): developing the ability to evaluate the riskiness of logins and sessions based on our knowledge of the user’s machine or behavior, rather than issuing a token (security key or protected device) or cross referencing with cross merchant bad lists. Identity should not be what is asserted, but what is detected. This will also allow a different, data-driven definition of fraudsters – one that can actually be tracked through your system.
b. Consistency estimations based on inherence factors (something the user is): developing the ability to evaluate the riskiness of actions and behaviors based on our knowledge of the user’s previous actions and preferences. This is where behavioral analytics shifts gear: when properly applied, it allows modeling of user behaviors and comparison between users, which is then used to detect deviations.
c. Specially designed authentication challenges. Wait, didn’t I just say that there are no more secrets? Well, some secrets are better kept than others: specifically, those that fraudsters have a really hard time getting, or that they cannot anticipate the system will ask for. Selecting the right bit of information (or the right secret) to properly divide the population between fraudsters and legitimate users (one that only legitimate users can provide) is a huge part of the analytic work. Choosing one that works in context and won’t hurt completion rates is even more challenging.
d. Active user engagement: engaging users in maintaining and monitoring their own identities in your system raises awareness, and also crowdsources some of your risk management exactly when it is needed. It might even counteract some of the bad education users are getting online.
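To make (a) and (b) a bit more concrete, here is a minimal Python sketch of how the two estimations might feed a decision to allow, challenge or deny. Everything in it is an assumption made for illustration: the names (UserHistory, ownership_risk, inherence_risk, decide), the use of a device identifier and a payment amount as the only signals, the equal weights and the thresholds. A real system would use far richer signals and a properly trained model; this only shows the shape of the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class UserHistory:
    """Hypothetical per-user profile built from past, trusted sessions."""
    known_devices: set[str] = field(default_factory=set)
    past_amounts: list[float] = field(default_factory=list)


def ownership_risk(history: UserHistory, device_id: str) -> float:
    """(a) Something the user has: 0.0 for a device seen before, else 1.0."""
    return 0.0 if device_id in history.known_devices else 1.0


def inherence_risk(history: UserHistory, amount: float) -> float:
    """(b) Something the user is: how far this amount deviates from habit.

    Uses a z-score capped at 3 standard deviations and scaled to [0, 1].
    With fewer than two past amounts there is no baseline, so return 1.0.
    """
    if len(history.past_amounts) < 2:
        return 1.0
    mu = mean(history.past_amounts)
    sigma = pstdev(history.past_amounts)
    if sigma == 0:
        return 0.0 if amount == mu else 1.0
    z = abs(amount - mu) / sigma
    return min(z, 3.0) / 3.0


def decide(history: UserHistory, device_id: str, amount: float,
           challenge_at: float = 0.5, deny_at: float = 0.9) -> str:
    """Combine both estimations; weights and thresholds are placeholders."""
    score = (0.5 * ownership_risk(history, device_id)
             + 0.5 * inherence_risk(history, amount))
    if score >= deny_at:
        return "deny"
    if score >= challenge_at:
        return "challenge"  # (c) this is where a well-chosen secret comes in
    return "allow"


if __name__ == "__main__":
    alice = UserHistory(known_devices={"laptop-1"},
                        past_amounts=[20.0, 25.0, 30.0])
    print(decide(alice, "laptop-1", 25.0))   # familiar device, usual amount
    print(decide(alice, "phone-9", 25.0))    # new device, usual amount
    print(decide(alice, "phone-9", 500.0))   # new device, unusual amount
```

Note that the password never appears in the sketch: the credentials are assumed to be compromised, and the decision rests on what is detected about the session rather than on what the user asserts.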
Being able to properly use ownership and inherence while challenging for the right secrets is the best practice when all credentials are compromised, NOT issuing more and more secrets. Choosing the right secrets and evaluating risk properly are the core competencies of domain experts in our field, and cannot be replaced by discussing the importance of CVV2 and complying with PCI (though, as long as regulation demands them, those are important too). In a world without secrets (well, almost without secrets), we cannot allow ourselves to stay behind the curve by sticking to old-school authentication practices.