False positive reduction in credit card fraud detection

Model extracts granular behavioral patterns from transaction data to more accurately flag suspicious activity.


Consumers’ credit cards are declined surprisingly often in legitimate transactions. One cause is that the fraud-detection technology used by a consumer’s bank has incorrectly flagged the sale as suspicious. Now MIT researchers have used a new machine-learning technique to drastically reduce these false positives, saving banks money and easing customer frustration.

Using machine learning to detect financial fraud dates back to the mid-1990s and has advanced over the years. Researchers train models to extract behavioral patterns from past transactions, called “features,” that signal fraud. When you swipe your card, the card pings the model and, if the features match fraud behavior, the sale gets blocked.

Now, MIT researchers have developed an “automated feature engineering” approach that extracts more than 200 detailed features for each individual transaction — say, if a user was present during purchases, and the average amount spent on certain days at certain vendors. By doing so, it can better pinpoint when a specific card holder’s spending habits deviate from the norm.
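To make the idea of a per-card behavioral feature concrete, here is a minimal Python sketch of one such feature: the average amount a card spends at a given vendor on a given day of the week, and how far a new purchase deviates from it. The record layout, the sample data, and the function names are assumptions made for illustration. They are not taken from the researchers' system, which generates its features automatically.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical records: (card_id, vendor, timestamp, amount, card_present)
history = [
    ("card_1", "grocer", datetime(2018, 9, 3, 18, 0), 40.0, True),    # Monday
    ("card_1", "grocer", datetime(2018, 9, 10, 18, 30), 60.0, True),  # Monday
    ("card_1", "grocer", datetime(2018, 9, 8, 11, 0), 120.0, True),   # Saturday
]


def average_spend_by_day(records):
    """Average amount for each (card, vendor, weekday) group."""
    groups = defaultdict(list)
    for card, vendor, ts, amount, _present in records:
        groups[(card, vendor, ts.weekday())].append(amount)
    return {key: mean(values) for key, values in groups.items()}


def deviation_from_habit(averages, card, vendor, ts, amount):
    """Ratio of a new amount to the card's usual spend, or None without history."""
    usual = averages.get((card, vendor, ts.weekday()))
    return None if usual is None else amount / usual


averages = average_spend_by_day(history)
# A 500.00 purchase on a Monday, against a Monday average of 50.00 at this vendor.
ratio = deviation_from_habit(
    averages, "card_1", "grocer", datetime(2018, 9, 17, 19, 0), 500.0
)
print(ratio)
```

A ratio well above 1 is the kind of signal a fraud model could weigh alongside many other features. On its own it does not decide anything.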

Tested on a dataset of 1.8 million transactions from a large bank, the model reduced false positive predictions by 54 percent over traditional models, which the researchers estimate could have saved the bank 190,000 euros (around $220,000) in lost revenue.

Kalyan Veeramachaneni, a principal research scientist at MIT’s Laboratory for Information and Decision Systems (LIDS), said, “The big challenge in this industry is false positives. We can say there’s a direct connection between feature engineering and [reducing] false positives. … That’s the most impactful thing to improve accuracy of these machine-learning models.”

The backbone of the model consists of creatively stacked “primitives,” simple functions that take two inputs and give an output. For example, calculating an average of two numbers is one primitive. That can be combined with a primitive that looks at the time stamp of two transactions to get an average time between transactions.

Stacking another primitive that calculates the distance between two addresses from those transactions gives an average time between two purchases at two specific locations. Another primitive could determine if the purchase was made on a weekday or weekend, and so on.
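The stacking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`average`, `seconds_between`, `is_weekend`, `average_gap`) and the sample timestamps are hypothetical, and the researchers' tool composes primitives automatically rather than by hand.

```python
from datetime import datetime


def average(a, b):
    """Primitive: average of two numbers."""
    return (a + b) / 2


def seconds_between(t1, t2):
    """Primitive: elapsed time between two transaction timestamps."""
    return abs((t2 - t1).total_seconds())


def is_weekend(ts):
    """Primitive: True if the timestamp falls on a Saturday or Sunday."""
    return ts.weekday() >= 5


def average_gap(t1, t2, t3):
    """Stacked feature: average time between three consecutive transactions."""
    return average(seconds_between(t1, t2), seconds_between(t2, t3))


# Hypothetical timestamps for three purchases on one card (a Saturday).
t1 = datetime(2018, 9, 8, 12, 0)
t2 = datetime(2018, 9, 8, 12, 30)
t3 = datetime(2018, 9, 8, 14, 0)

gap = average_gap(t1, t2, t3)  # gaps of 1,800 s and 5,400 s
print(gap, is_weekend(t1))
```

Each primitive stays small and generic. The interesting features come from the combinations, which is why the number of candidate features grows quickly as more primitives are added.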

Veeramachaneni said, “Once we have those primitives, there is no stopping us for stacking them … and you start to see these interesting variables you didn’t think of before. If you dig deep into the algorithm, primitives are the secret sauce.”

“One important feature that the model generates is calculating the distance between those two locations and whether they happened in person or remotely. If someone buys something at, say, the Stata Center in person and, a half hour later, buys something in person 200 miles away, then it’s a high probability of fraud. But if one purchase occurred through a mobile phone, the fraud probability drops.”
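A rough Python sketch of that distance-and-presence feature follows. It is a hand-written rule for illustration only, not the researchers' model, which learns from many features rather than applying a fixed cutoff. The coordinates are approximate, the second location is a hypothetical point roughly 190 miles from the Stata Center, and the 150 km/h threshold is an arbitrary assumption.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def distance_km(p1, p2):
    """Great-circle (haversine) distance between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*p1, *p2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def implausible_travel(p1, p2, hours_apart, both_in_person, max_kmh=150.0):
    """True when two in-person purchases imply travel faster than max_kmh.

    max_kmh is an assumed cutoff chosen for this example.
    """
    if hours_apart <= 0:
        raise ValueError("hours_apart must be positive")
    if not both_in_person:
        # A remote purchase (for example, by phone) does not require travel.
        return False
    return distance_km(p1, p2) / hours_apart > max_kmh


stata_center = (42.3616, -71.0906)  # approximate coordinates
far_away = (40.7128, -74.0060)      # hypothetical second location

d = distance_km(stata_center, far_away)
in_person_flag = implausible_travel(stata_center, far_away, 0.5, both_in_person=True)
remote_flag = implausible_travel(stata_center, far_away, 0.5, both_in_person=False)
print(round(d), in_person_flag, remote_flag)
```

With both purchases made in person half an hour apart, the implied speed is far beyond the cutoff and the flag is raised. When one purchase is remote, the same distance carries no such signal.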

“There are so many features you can extract that characterize behaviors you see in past data that relate to fraud or nonfraud use cases.”

Paper co-authors are: lead author Roy Wedge, a former researcher in the Data to AI Lab at LIDS; James Max Kanter ’15, SM ’15; and Santiago Moral Rubio and Sergio Iglesias Perez of Banco Bilbao Vizcaya Argentaria.
