Credit card fraud is an ever-evolving arms race. As security measures improve, fraudsters develop increasingly sophisticated techniques to bypass them, often operating in organized, complex networks. Recently, I embarked on a project to tackle this challenge using the IEEE-CIS Fraud Detection dataset—and the journey took me from standard machine learning algorithms all the way to advanced Graph Neural Networks (GNNs).

Here is a look at how I built the pipeline and why modeling transactions as a graph made such a difference to recall.

The Challenge & The Data

The core of the problem lies in predicting a binary target: isFraud. The dataset provides a rich playground, split across two main tables:

  • transaction: Contains the core details of the payment (amounts, timestamps as timedeltas, card information).
  • identity: Contains network and device information associated with the transaction.
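The two tables share a TransactionID key, so they can be combined with a left join that keeps every transaction even if it has no identity record. Below is a minimal sketch with pandas on tiny in-memory frames. The values are invented for illustration, and with the real data the frames would be loaded from the competition files instead.

```python
import pandas as pd

# Toy stand-ins for the two tables; all values are invented for illustration.
transaction = pd.DataFrame({
    "TransactionID": [1, 2, 3],
    "TransactionAmt": [49.0, 120.5, 15.0],
    "isFraud": [0, 1, 0],
})
identity = pd.DataFrame({
    "TransactionID": [2, 3],
    "DeviceType": ["mobile", "desktop"],
})

# Left join: keep every transaction; identity columns become NaN when absent.
merged = transaction.merge(identity, on="TransactionID", how="left")
print(merged)
```

A left join (rather than an inner join) avoids silently dropping transactions that lack identity data, at the cost of introducing missing values that the preprocessing step then has to handle.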

Joined by TransactionID, this data provides a comprehensive, albeit extremely imbalanced, view of online transactions. The biggest hurdle? Fraudulent transactions make up a tiny fraction of the dataset, meaning a model that simply guesses “not fraud” every time will still achieve a seemingly high accuracy.
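To make the accuracy trap concrete, here is a small sketch in plain Python. The 3.5% fraud rate is an illustrative assumption, not a figure measured in this project.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of actual fraud cases that were flagged."""
    positives = sum(y_true)
    caught = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return caught / positives if positives else 0.0

# Illustrative assumption: 35 fraud cases in 1,000 transactions (3.5%).
y_true = [1] * 35 + [0] * 965
y_pred = [0] * len(y_true)  # always guess "not fraud"

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy={acc:.3f}  recall={rec:.3f}")
```

The do-nothing model scores 96.5% accuracy while catching no fraud at all, which is why recall and F1 are the more informative metrics here.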

Building the Standard Pipeline

Before jumping into complex architectures, it’s crucial to establish a strong baseline. My initial approach involved a complete standard machine learning pipeline:

  1. Extensive Preprocessing: Cleaning data, handling missing values, feature engineering, and dimensionality reduction.
  2. Handling Imbalance: I used SMOTE (Synthetic Minority Over-sampling Technique) to generate synthetic examples of fraud, so the models had enough minority-class signal to learn the patterns of fraudulent behavior rather than just ignoring them.
  3. Training & Evaluation: I trained a suite of models including SVMs, K-Nearest Neighbors, AdaBoost, Random Forests, XGBoost, and LightGBM.
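The core idea behind SMOTE in step 2 can be sketched in a few lines: pick a minority sample, pick one of its nearest minority neighbors, and create a new point somewhere on the line between them. The function below is a simplified illustration in plain Python, not the library implementation used in the project; the name smote_sketch and the toy points are hypothetical.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic

# Toy 2-D fraud samples (invented values).
fraud = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(fraud, n_new=6)
print(new_points)
```

In practice a maintained implementation such as the SMOTE class in imbalanced-learn is the safer choice, and resampling is normally applied to the training split only so that validation metrics reflect the real class balance.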

The Baseline Results

Evaluating these models on an unsampled 50k validation set (left at its original class balance, with no SMOTE applied) gave the ranking I expected. Gradient boosting frameworks led the pack among tabular models:

  • Random Forest achieved a respectable 43.85% F1-Score with $111k in projected savings.
  • LightGBM pushed the F1-Score to 49.55%.
  • XGBoost topped the standard models with an F1-Score of 50.52% and a projected savings of $174,060.
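A dollar figure like projected savings has to come from some rule that converts predictions into money. The sketch below shows one plausible rule, stated purely as an assumption: every caught fraud saves its transaction amount, and every false alarm costs a fixed review fee. The function name, the fee, and the records are all hypothetical, and this is not necessarily the formula behind the figures above.

```python
def projected_savings(records, review_cost=5.0):
    """records: iterable of (amount, is_fraud, flagged) tuples.

    Assumed rule (illustrative only): each caught fraud saves its amount,
    and each false alarm costs a fixed manual-review fee.
    """
    saved = sum(amt for amt, fraud, flagged in records if fraud and flagged)
    false_alarms = sum(
        1 for _, fraud, flagged in records if flagged and not fraud
    )
    return saved - review_cost * false_alarms

records = [
    (250.0, 1, 1),  # fraud, flagged: saves 250
    (900.0, 1, 0),  # fraud, missed: saves nothing
    (40.0, 0, 1),   # legitimate, flagged: costs one review
    (15.0, 0, 0),   # legitimate, not flagged
]
savings = projected_savings(records)
print(savings)
```

Under a rule like this, missing a single large fraudulent transaction can outweigh many small catches, which is why recall on high-dollar fraud matters so much for the final number.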

While an F1-score of about 50% might sound low in other domains, it is a solid baseline in the highly imbalanced world of fraud detection. But XGBoost’s recall was only about 36.6%, which meant nearly two-thirds of the fraud was slipping through the cracks.
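F1 is the harmonic mean of precision and recall, so the two reported XGBoost figures (F1 of 50.52%, recall of 36.63%) are enough to back out the implied precision. A short worked calculation, with helper names chosen here for illustration:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def precision_from(f1, recall):
    """Rearranged from F1 = 2PR / (P + R):  P = F1 * R / (2R - F1)."""
    return f1 * recall / (2 * recall - f1)

xgb_recall, xgb_f1 = 0.3663, 0.5052  # figures reported for XGBoost
xgb_precision = precision_from(xgb_f1, xgb_recall)
print(f"implied precision: {xgb_precision:.3f}")
```

The implied precision is roughly 81%, so the baseline was rarely wrong when it did raise a flag. Its weakness was how seldom it raised one.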

The Secret Weapon: Graph Neural Networks (GNNs)

Standard tabular models treat every transaction as an isolated event. But fraud rarely happens in a vacuum. Fraudsters use shared devices, similar IP addresses, and connected email networks. To capture these relationships, I turned to Graph Neural Networks.

By modeling the data as a graph—where nodes represent entities (like a specific credit card or an IP address) and edges represent the transactions between them—the model can learn to identify complex fraud rings.
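Here is a minimal sketch of that graph framing in plain Python. Each transaction becomes an edge between a card node and a device node, connected components reveal which entities are linked, and one round of neighbor averaging shows the basic operation that many GNN layers build on. The node IDs, the seed feature, and the helper names are hypothetical, and this is not the architecture used in the project.

```python
from collections import defaultdict

# Each transaction links a card node to a device node (hypothetical IDs).
transactions = [
    ("card:A", "device:X"),
    ("card:B", "device:X"),  # shares a device with card A
    ("card:B", "device:Y"),
    ("card:C", "device:Z"),  # unrelated to the others
]

adjacency = defaultdict(set)
for card, device in transactions:
    adjacency[card].add(device)
    adjacency[device].add(card)

def connected_components(adj):
    """Group nodes that are reachable from one another."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adj[node] - component)
        seen |= component
        components.append(component)
    return components

def mean_aggregate(adj, features):
    """One round of neighbour averaging (self excluded)."""
    return {
        node: sum(features[n] for n in neighbours) / len(neighbours)
        for node, neighbours in adj.items()
    }

components = connected_components(adjacency)

# Seed feature: 1.0 for a card with known fraud, 0.0 everywhere else.
features = {node: 0.0 for node in adjacency}
features["card:A"] = 1.0
smoothed = mean_aggregate(adjacency, features)

print(components)
print(smoothed["device:X"])
```

After one round, the shared device picks up part of card A's fraud signal, and a second round would pass it on to card B. A tabular model that sees each row in isolation has no equivalent mechanism.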

I evaluated the GNN on a SMOTE-resampled graph encompassing over 1.1 million relationships.

The GNN Impact

The results were a large step up from the tabular baselines.

Accuracy dropped slightly to 85.00%, because a model that flags transactions more aggressively also raises more false positives, but recall rose to 84.00%, up from 36.63% for XGBoost.
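The general trade-off between recall and false positives can be seen by moving a decision threshold over model scores. The sketch below uses invented labels and scores, and it illustrates the trade-off in general rather than how the GNN's operating point was actually chosen.

```python
def recall_and_fpr(y_true, scores, threshold):
    """Return (recall, false-positive rate) when flagging scores >= threshold."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for t, s in pairs if t == 1 and s >= threshold)
    fn = sum(1 for t, s in pairs if t == 1 and s < threshold)
    fp = sum(1 for t, s in pairs if t == 0 and s >= threshold)
    tn = sum(1 for t, s in pairs if t == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)

# Invented labels and model scores for illustration.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.6, 0.3, 0.3, 0.1, 0.1, 0.05]

strict = recall_and_fpr(y_true, scores, threshold=0.65)
lenient = recall_and_fpr(y_true, scores, threshold=0.25)
print("strict :", strict)
print("lenient:", lenient)
```

Lowering the threshold catches more of the fraud but also flags more legitimate transactions, which pulls accuracy down on a dataset where most transactions are legitimate.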

Because high-dollar fraud rings were no longer slipping through unnoticed, projected savings jumped from $174,060.43 for XGBoost to $68,169,432.00.

Model                  Recall   F1-Score   ROC-AUC   Projected Savings
XGBoost                36.63%   50.52%     89.00%    $174,060.43
Graph Neural Network   84.00%   85.00%     92.76%    $68,169,432.00

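Note: The GNN’s ability to map relationships between transaction identities led to a substantial improvement in recall, which makes it a strong fit for high-dollar fraud detection compared to tabular models. The two rows are not strictly like-for-like, though: XGBoost was scored on the unsampled validation set, while the GNN was scored on a SMOTE-resampled graph, so part of the gap may reflect the different class balance.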

Conclusion

This project underscored a vital lesson in data science: context matters. Tabular models are fantastic, fast, and reliable. However, when the problem fundamentally revolves around relationships and networks, as organized credit card fraud does, reframing it as a graph can unlock performance that isolated data points simply cannot reach.

If you are interested in exploring the codebase, including the data preprocessing scripts, the PyTorch Multi-Layer Perceptrons, and the GNN implementation, you can check out the source code in my repository: Vijay-K-2003/CSE_575_Fraud_Detection.


To run the code yourself, simply install the dependencies via pip install -r requirements.txt and run python train_models.py for the tabular models, or explore the gnn/ directory for the graph-based approach.