Credit Card Fraud Detection With Machine Learning

The myriad plastic cards in use worldwide are a gold mine for criminals. By 2027, financial service providers are expected to take a $40 billion hit globally in credit card losses, a significant increase compared to $27.85 bn in 2018.

This growth in losses is partially caused by the rise of electronic transactions. Today the average American has more than three credit cards, which amounts to 1.5 billion cards in the US alone, while the number of plastic cards globally stands at an impressive 22.11 billion.

Another reason is that fraudulent methods are getting more sophisticated and thus harder to spot by traditional fraud detection software.

To address such a massive challenge effectively, companies involved in the card payment process need advanced approaches. No wonder that high hopes are placed on machine learning. ML-fueled technologies have made a name for themselves with their ability to handle huge amounts of data and discover anomalies that humans may overlook.

In this article, we’ll explore who suffers from payment card fraud, how this type of crime occurs, and what machine learning can do to prevent it. We’ll also look at effective use cases of machine learning in fraud detection and explain how to implement ML in the fraud detection workflow.

You can also get a brief rundown on fraud detection in our video:

Fraud Detection: Fighting Financial Crime with Machine Learning

Fraud detection explained in 12 minutes

What is credit card fraud and who becomes a target of scams

According to the FBI, credit card fraud is “the unauthorized use of a credit or debit card, or similar payment tool to fraudulently obtain money or property.” All players involved in the card-based payment process can potentially fall victim to scammers, including:

cardholders,
online merchants,
payment gateway providers,
payment processing companies,
credit card payment systems,
card issuers (issuing banks), and
acquirers (acquiring banks).

Card-based payment process and its key players.

Except for cardholders, whose anti-fraud measures come down to vigilance and timely reporting of lost or stolen cards, all other players rely on various digital tools designed to combat scams. The importance of these tools is hard to overstate. Say, if an online business shows a fraud rate greater than one percent, card networks like Mastercard or AmEx may revoke its permission to accept and process credit card payments.

Despite their variety, fraudulent schemes involving credit cards can be roughly divided into two large groups: identity theft and transaction laundering.

Identity theft

Credit card fraud is the most common form of identity theft, affecting more than 10.7 million people annually. It occurs when someone steals a card or snatches personal information to perform so-called card-not-present (CNP) transactions.

Most commonly, ID thieves use a victim’s identity and payment credentials to

make purchases a cardholder doesn’t authorize,
withdraw money from a victim’s existing account (account takeover),
apply for a new credit card (fraudulent application), or
open a new account.

Criminals may obtain sensitive information via phishing emails, skimming devices embedded into card readers, and cyber-attacks on banks or retailers with a low fraud control level. But often scammers use far simpler methods — such as rummaging through papers carelessly dumped in the trash or just looking over a person’s shoulder when a potential victim enters a PIN.

Transaction laundering

This relatively new and advanced method of money laundering is also known as undisclosed aggregation, factoring, or credit card laundering. The fraud involves a legitimate merchant whose credentials are used to process payments for illicit or illegal products and services through a payment card network.

Criminals may exploit huge online marketplaces to launder dirty money via fake transactions. Another scenario is to create an innocent-looking shell website (say, a toy or clothing store) to actually sell illegal substances.

Transaction laundering involves selling prohibited goods via legitimate-looking (or actually legitimate) websites. Source: Sharetribe

Three key factors facilitate transaction laundering and make it hard to detect:

the ease of creating a professional-looking website,
the large number of go-betweens involved in online payment processing, and
a rise in the use of corporate credit cards, enabling large payments to be made in a single transaction.

All of the above creates a situation in which legitimate payment service providers can unwittingly become part of criminal schemes.

Of course, no technology can stop criminals from sending out malicious emails or launching fake websites. But technology can keep scammers from achieving their overall goal: generating illicit gains. And here machine learning comes to the foreground.

How machine learning helps with fraud detection

The key objective of any credit card fraud detection system is to identify suspicious events and report them to an analyst while letting normal transactions be automatically processed.

For years, financial institutions have been entrusting this task to rule-based systems that employ rule sets written by experts. But now they increasingly turn to a machine learning approach, as it can bring significant improvements to the process.
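To make the contrast concrete, here is a minimal sketch of what a rule-based check might look like. The rules, thresholds, and field names (amount, country, home_country, hour) are hypothetical and chosen only for illustration; real rule sets are written by fraud experts and are far larger.

```python
from typing import Callable, Dict, List, Tuple

Transaction = Dict[str, object]
Rule = Tuple[str, Callable[[Transaction], bool]]

# Hypothetical expert-written rules: each pairs a name with a predicate.
RULES: List[Rule] = [
    ("large_amount", lambda t: t["amount"] > 5000),
    ("foreign_country", lambda t: t["country"] != t["home_country"]),
    ("night_time", lambda t: t["hour"] < 5),
]


def triggered_rules(txn: Transaction) -> List[str]:
    """Return the names of all rules that fire for this transaction."""
    return [name for name, predicate in RULES if predicate(txn)]


def needs_review(txn: Transaction, min_hits: int = 2) -> bool:
    """Send a transaction to an analyst when enough rules fire."""
    return len(triggered_rules(txn)) >= min_hits


# Invented example transactions.
normal = {"amount": 42.0, "country": "US", "home_country": "US", "hour": 14}
suspicious = {"amount": 7200.0, "country": "BR", "home_country": "US", "hour": 3}
```

Every new fraud scheme requires someone to write and tune another rule by hand, which is the maintenance burden that ML-based systems aim to reduce.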

1. Higher accuracy of fraud detection. Compared to rule-based solutions, machine learning tools have higher precision and return more relevant results because they can consider many more data points, including the tiniest details of behavior patterns associated with a particular account.

2. Less manual work needed for additional verification. Enhanced accuracy reduces the burden on analysts. “People are unable to check all transactions manually, even if we are talking about a small bank,” Alexander Konduforov, data science competence leader at AltexSoft, explains. “ML-driven systems filter out, roughly speaking, 99.9 percent of normal patterns leaving only 0.1 percent of events to be verified by experts.”

3. Fewer false declines. False declines, or false positives, happen when a system identifies a legitimate transaction as suspicious and wrongly cancels it. Because ML models assess each transaction more precisely, fewer legitimate payments get blocked.

4. Ability to identify new patterns and adapt to changes. Unlike rule-based systems, ML algorithms adapt to a constantly changing environment and financial conditions. They enable analysts to identify new suspicious patterns and create new rules to prevent new types of scams.

Key differences between rule-based and ML-based approaches to fraud detection.
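The benefits listed above can be expressed as standard classification metrics. The sketch below computes precision, recall, the false decline rate, and the share of transactions sent to manual review. The counts for the two systems are invented purely to illustrate the calculation and are not measurements from any real deployment.

```python
from dataclasses import dataclass


@dataclass
class Outcomes:
    true_positives: int   # fraud correctly flagged
    false_positives: int  # legitimate transactions wrongly flagged (false declines)
    false_negatives: int  # fraud the system missed
    true_negatives: int   # legitimate transactions correctly approved


def precision(o: Outcomes) -> float:
    """Share of flagged transactions that really were fraud."""
    return o.true_positives / (o.true_positives + o.false_positives)


def recall(o: Outcomes) -> float:
    """Share of all fraud that the system caught."""
    return o.true_positives / (o.true_positives + o.false_negatives)


def false_decline_rate(o: Outcomes) -> float:
    """Share of legitimate transactions that were wrongly flagged."""
    return o.false_positives / (o.false_positives + o.true_negatives)


def review_share(o: Outcomes) -> float:
    """Share of all transactions that analysts must check manually."""
    total = (o.true_positives + o.false_positives
             + o.false_negatives + o.true_negatives)
    return (o.true_positives + o.false_positives) / total


# Invented counts for 100,000 transactions each, for illustration only.
rule_based = Outcomes(true_positives=80, false_positives=920,
                      false_negatives=20, true_negatives=98980)
ml_based = Outcomes(true_positives=90, false_positives=60,
                    false_negatives=10, true_negatives=99840)
```

Tracking these numbers over time also shows when an existing system starts to miss more fraud or block more legitimate payments.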

To obtain the above-mentioned advantages, fraud detection solutions use two ML techniques: supervised and unsupervised learning.

Supervised learning means that a model learns from previous examples and is trained on labeled data. In other words, the dataset has tags that tell the model which patterns are related to fraud and which represent normal behavior.
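As a rough illustration of learning from labeled examples, the sketch below trains a tiny logistic regression classifier with plain gradient descent. The two features (amount in thousands of dollars and a foreign-transaction flag), the eight labeled samples, and the hyperparameters are all assumptions made up for this example; production models use far more data and features, and usually a dedicated ML library.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fraud_probability(x: Sequence[float], weights: Sequence[float],
                      bias: float) -> float:
    """Model output: estimated probability that a transaction is fraud."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(samples: List[Tuple[float, ...]], labels: List[int],
                   lr: float = 0.5, epochs: int = 2000
                   ) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = fraud_probability(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Invented labeled history: (amount in $ thousands, is_foreign), 1 = fraud.
SAMPLES = [(0.05, 0.0), (0.12, 0.0), (0.30, 1.0), (0.08, 0.0),
           (2.50, 1.0), (3.10, 1.0), (1.90, 1.0), (0.20, 0.0)]
LABELS = [0, 0, 0, 0, 1, 1, 1, 0]

weights, bias = train_logistic(SAMPLES, LABELS)
```

After training, fraud_probability((2.8, 1.0), weights, bias) comes out above 0.5 and fraud_probability((0.1, 0.0), weights, bias) below it, because the model has learned from the tags which patterns were marked as fraud.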

“Banks and payment systems typically accumulate tons of data on different fraudulent schemes that can be used to train a model,” Alexander Konduforov says. “Such models are constantly updated and improved to produce accurate results. But unfortunately, they fail to spot new fraud schemes if faced with them.” That’s when unsupervised learning comes into the picture.

Unsupervised learning is also called anomaly detection as it automatically captures unusual patterns. In this case, training datasets come without any labels or instructions. This approach lags behind supervised learning in terms of accuracy. But it is unrivaled when a business needs to find hidden fraud patterns and useful insights.
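A very simple form of anomaly detection needs no labels at all: it only asks how far a value lies from what is typical. The sketch below scores transaction amounts with a robust z-score based on the median and the median absolute deviation (MAD). The sample history and the 3.5 cutoff are assumptions for illustration; real systems look at many attributes at once and use more capable algorithms.

```python
import statistics
from typing import List


def anomaly_scores(amounts: List[float]) -> List[float]:
    """Robust z-scores: distance from the median in MAD-based units."""
    median = statistics.median(amounts)
    mad = statistics.median([abs(a - median) for a in amounts])
    if mad == 0:
        # All values are (nearly) identical, so nothing stands out.
        return [0.0 for _ in amounts]
    # 1.4826 rescales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    return [abs(a - median) / (1.4826 * mad) for a in amounts]


def find_anomalies(amounts: List[float], cutoff: float = 3.5) -> List[float]:
    """Return the amounts whose robust z-score exceeds the cutoff."""
    return [a for a, s in zip(amounts, anomaly_scores(amounts)) if s > cutoff]


# Invented spending history for one account, in dollars.
history = [23.5, 41.0, 18.2, 36.9, 29.4, 33.1, 1250.0, 27.8]
flagged = find_anomalies(history)
```

Here only the 1250.0 payment is flagged. No one told the code what fraud looks like; it simply noticed that this amount is unlike the rest, which is why such methods can surface patterns nobody has labeled yet.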

As a rule, fraud detection systems combine both approaches, which complement each other. In the next section we’ll see how the combination of different ML techniques works in practice.

Big players: what they do to protect online payments

Large online merchants and payment service providers are no strangers to credit card fraud and its consequences. They have been building their risk management strategies for years, being among early adopters of machine learning. Some of these pioneers share experience with the general public, even giving open access to their antifraud solutions.

How financial giants combat fraud with ML.

PayPal: taking a deep learning approach

Payment processor and payment gateway provider with over 200 million active accounts worldwide, PayPal invests $300 million annually in anti-fraud technologies. Ten years ago, their system relied on logistic regressions — one of the most common supervised learning algorithms used for classification. Then they added more advanced supervised learning techniques, namely Gradient Boosted Trees (GBTs) and neural networks. As a result, the system’s accuracy improved by 50 percent.

Currently, PayPal is taking a new, deep learning approach, benefitting from huge amounts of fraud data accumulated over the years. Deep learning models have already proved to be 10 to 20 percent more accurate than traditional machine learning algorithms in real-time fraud detection. PayPal’s latest reported fraud loss rate is 0.28 percent (or 28 cents for every $100 processed).
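The conversion between a percentage loss rate and cents per $100 is simple arithmetic, shown here as a small sketch. The function names are our own, and the only figure taken from the text is the 0.28 percent rate.

```python
def fraud_loss_rate_percent(fraud_losses: float, processed_volume: float) -> float:
    """Fraud losses as a percentage of the total processed volume."""
    if processed_volume <= 0:
        raise ValueError("processed_volume must be positive")
    return fraud_losses / processed_volume * 100.0


def cents_lost_per_100_dollars(rate_percent: float) -> float:
    """A rate of r percent means r dollars, or r * 100 cents, per $100."""
    return rate_percent * 100.0


# 28 cents lost on $100 processed corresponds to a 0.28 percent loss rate.
paypal_rate = fraud_loss_rate_percent(0.28, 100.0)
paypal_cents = cents_lost_per_100_dollars(0.28)
```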

Amazon: training models on AWS customers’ datasets

This year, the world’s largest online retailer made a fully managed service, Amazon Fraud Detector, publicly available. The technology is powered by machine learning and the company’s 20 years of experience in combating online scams.

To take advantage of Fraud Detector, AWS customers have to integrate with the service via API and feed the system historical data, along with markers indicating fraudulent and legitimate transactions. Amazon combines this information with its own data to create models that identify signs of identity theft or transaction laundering.

How Amazon Fraud Detector works. Source: AWS

Among early adopters of Fraud Detector services are Truevo, which offers payment gateway and card acquiring services, and ActiveCampaign, which provides sales and marketing automation software.

Visa: saving $25 billion a year with neural networks

The world’s second-largest card network (after China UnionPay, which accounts for 45 percent of all payment cards worldwide) reports a fraud rate of less than 6 cents per $100 transacted (0.06 percent). To attain this result, Visa built a multi-layered security infrastructure with an AI-fueled fraud detection system at its core.

Known as Visa Advanced Authorization, the system prevents up to $25 bn a year from being stolen by fraudsters. It uses neural networks to assess over 500 attributes including

the type of transaction (online, contactless, magnetic stripe, or in-app),
time of the day,
amount of money,
location, and
the account’s spending patterns.

It takes about a millisecond to estimate the probability of fraud. The results are sent to a cardholder’s bank where the decision on whether to approve or decline a transaction is made.
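The division of labor described above, where the network estimates a fraud probability and the issuing bank makes the final call, can be sketched as follows. This is not Visa's model: a hand-weighted logistic score stands in for the neural network, and the five attributes, weights, and 0.5 threshold are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Authorization:
    channel: str          # "online", "contactless", "magnetic_stripe", or "in_app"
    hour: int             # local hour of the day, 0-23
    amount: float         # transaction amount in dollars
    country: str          # where the transaction takes place
    usual_country: str    # where the account normally spends
    typical_amount: float # typical transaction size for this account


# Hypothetical per-channel risk weights, not taken from any real system.
CHANNEL_RISK = {"online": 1.0, "magnetic_stripe": 0.8,
                "in_app": 0.5, "contactless": 0.2}


def fraud_probability(auth: Authorization) -> float:
    """Network side: turn transaction attributes into a probability."""
    z = -4.0  # baseline: most transactions are legitimate
    z += CHANNEL_RISK.get(auth.channel, 1.0)
    z += 1.0 if auth.hour < 5 else 0.0
    z += 1.5 if auth.country != auth.usual_country else 0.0
    # Compare the amount with the account's usual spending, capped at 10x.
    ratio = auth.amount / max(auth.typical_amount, 1.0)
    z += 0.3 * min(ratio, 10.0)
    return 1.0 / (1.0 + math.exp(-z))


def issuer_decision(probability: float, decline_above: float = 0.5) -> str:
    """Issuer side: the cardholder's bank approves or declines."""
    return "decline" if probability > decline_above else "approve"


routine = Authorization("contactless", 13, 35.0, "US", "US", 40.0)
risky = Authorization("online", 3, 900.0, "RO", "US", 40.0)
```

With these made-up weights, the routine purchase is approved and the risky one is declined. Keeping scoring and decision separate lets each bank apply its own risk tolerance to the same probability.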

American Express: achieving the lowest fraud rate

Operating as a credit card issuer, network, and merchant acquirer, AmEx handles 25 percent of the credit card activity in the US. This 170-year-old company deployed its first machine learning models in 2014, and now also uses deep learning models to capitalize on the huge datasets available. AmEx tools monitor $1.2 trillion worth of transactions a year in real time, demonstrating the lowest fraud rates in the credit card industry.

For small businesses, the financial giant offers a free ML-fueled solution called Enhanced Authorization. Merchants using the technology report a 60 percent reduction in fraudulent transactions.

Enhanced Authorization by AmEx helps merchants to effectively detect fraud.

Chase Merchant Services: proactively preventing threats

Chase Merchant Services, previously called Chase Paymentech, is the online branch of the largest US bank, JPMorgan Chase. The leading payment processor uses machine learning to constantly monitor online transactions, hitting $1.1 trn in annual payment volume. Thanks to these advanced technologies, the bank reduced credit card fraud losses by 50 percent over the last five years.

The Chase platform embraces supervised ML to pinpoint already known fraudulent patterns and unsupervised ML to analyze data for atypical behavior. Getting a clearer understanding of new fraud threats, the bank takes a proactive approach to prevent them.
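One possible way to combine the two kinds of signals is a simple routing function. This is an illustrative sketch, not a description of the Chase platform: the function name, the three outcomes, and every threshold are assumptions.

```python
def route_transaction(known_pattern_score: float, anomaly_score: float,
                      block_at: float = 0.9, review_at: float = 0.5,
                      anomaly_at: float = 3.5) -> str:
    """Combine a supervised score with an unsupervised anomaly score.

    known_pattern_score: probability from a model trained on labeled fraud.
    anomaly_score: how unusual the transaction is (for example, a z-score).
    """
    if known_pattern_score >= block_at:
        # Strong match with an already known fraud pattern.
        return "block"
    if known_pattern_score >= review_at or anomaly_score >= anomaly_at:
        # Either a moderate match or atypical behavior that the
        # supervised model has not seen before: ask an analyst.
        return "review"
    return "approve"
```

Transactions that look unusual but match no known pattern go to analysts, and their verdicts can later serve as new labels for retraining the supervised model.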

When and how to implement machine learning in fraud detection processes

Machine learning is a natural choice for large service providers processing millions of transactions daily. Moreover, such behemoths can hardly run their business successfully without ML or deep learning solutions. If they did, they'd become easy prey for all sorts of scammers.

That said, you don’t need to be as big as Amazon or PayPal to adopt advanced technologies. When exactly does it pay to implement machine learning?

The first sign is a labor shortage. As your business grows, the number of transactions increases proportionally. Of course, you may hire another ten experts to look through cases identified by the existing system as suspicious. Yet, implementation of machine learning can be more effective as it dramatically reduces the number of false positives that require manual verification.

The second “red flag” is an increased percentage of fraud your system overlooks or of legitimate transactions it blocks. This indicates that the rule-based software is no longer up to the task.

Often, the two signs go hand in hand, calling for technical improvements. Once the need for machine learning is recognized, the following steps should be taken.

1. A data science team audits the data accumulated by the financial institution to assess how well it fits the given tasks. The team also looks into current anti-fraud processes and evaluates the feasibility of an ML solution at the given stage.

2. If data is insufficient (for example, banks collect only basic information on transactions), data scientists specify what additional metrics must be gathered.

3. First, unsupervised learning techniques may be applied to analyze available data and find patterns of fraudulent behavior.

4. When the atypical patterns are identified, supervised machine learning comes in to build a model and fine-tune it on larger datasets.

5. Ready-to-use algorithms can be wrapped in a web service or a program module. As such, they are delivered to software developers to be integrated into the working e-commerce platform, payment processing software, or other system used by a merchant or service provider.
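The last step, wrapping a model in a web service, could look roughly like the sketch below, built only on Python's standard library. The score_transaction function is a placeholder for a trained model, and the JSON field names and the 0.5 cutoff are assumptions; a production service would add authentication, logging, and input validation.

```python
import json
from http.server import BaseHTTPRequestHandler


def score_transaction(txn: dict) -> float:
    """Placeholder for a trained model: returns a fraud probability."""
    return 0.9 if txn["amount"] > 5000 else 0.1


def handle_payload(body: str) -> dict:
    """Parse a JSON request body and return the scoring result."""
    try:
        txn = json.loads(body)
    except json.JSONDecodeError:
        return {"error": "invalid JSON"}
    if not isinstance(txn, dict) or not isinstance(txn.get("amount"), (int, float)):
        return {"error": "a numeric 'amount' field is required"}
    probability = score_transaction(txn)
    return {"fraud_probability": probability, "flagged": probability >= 0.5}


class ScoringHandler(BaseHTTPRequestHandler):
    """HTTP wrapper so other systems can request a score via POST."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        result = handle_payload(self.rfile.read(length).decode("utf-8"))
        payload = json.dumps(result).encode("utf-8")
        self.send_response(400 if "error" in result else 200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To try it locally (blocks until interrupted):
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), ScoringHandler).serve_forever()
```

Keeping the scoring logic in handle_payload, separate from the HTTP handler, means the same function can also be shipped as a plain program module and tested without starting a server.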
