
Most fraud detection systems work like a bouncer at an exclusive club. No explanation, no appeal, just a vague “you’re not on the list” and a door in your face. Your legitimate customer just wanted to buy a laptop. Now they’re using your competitor’s checkout. Congratulations, you just lost 2.5 lakhs ($800) and a lifetime customer because your ML model had a bad feeling.
I wanted to build something better. So I built FraudGuard AI, a B2B SaaS platform that scores transactions in under 85ms and tells you exactly why it made the decision it did.
Radical concept, I know.
The Problem Nobody Talks About Enough

Everyone in FinTech talks about catching fraud. Fair enough: chargebacks are brutal. You lose the product, the money, and Visa charges you a $25 fee on top of it just to really drive the point home.
But the other side of this coin is somehow less discussed: false positives.
Your fraud model blocks a real customer. That customer gets an embarrassing decline mid-checkout. They close the tab, open a competitor’s site, and boom, they never come back. You didn’t just lose a sale; you lost every sale that person would have made for the next five years.
Legacy “rule-based” systems made this worse. Engineers would hardcode rules like:

```
IF amount > $1000 AND country = "Russia" THEN block
```
Straightforward. Auditable. And completely useless once fraudsters figured it out, which took them approximately four days.
The answer is machine learning. Specifically, XGBoost. And the answer to “why XGBoost?” is boring and correct, which is the best kind of answer.
Why XGBoost and Not Something Fancier

Let me save you the Medium post rabbit hole: XGBoost is the industry standard for tabular data fraud detection because it works, it’s fast, and it doesn’t need a GPU cluster to run inference.
Could I have used a neural network? Sure. Would it have been overkill for a structured CSV of transaction features? Absolutely. Would it have made for a better LinkedIn post? Debatable.
XGBoost is an ensemble of decision trees built sequentially, each tree learning from the mistakes of the previous one. This “boosting” approach means:
- Tree 1 makes predictions. Gets some wrong.
- Tree 2 focuses specifically on what Tree 1 got wrong.
- Tree 3 focuses on Tree 2’s mistakes.
- Repeat 500 times.
- Final answer is a weighted vote across all 500 trees.
The whole thing runs in milliseconds on CPU. For a checkout flow where every millisecond matters, this is not a small thing.
Features the model actually looks at:
- Transaction amount and merchant category
- Hour of day, day of week (3AM transactions are suspicious. Sorry, nocturnal animals.)
- Velocity - how many transactions from this card in the last hour
- Geographic distance between billing address and IP location
- Device fingerprint signals
- IP reputation score
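Two of those features, velocity and billing-to-IP distance, need nothing beyond the standard library. A sketch (function names and signatures are mine, not the project’s):

```python
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between billing address and IP geolocation, in km
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def velocity_1hr(timestamps, now):
    # Number of transactions on this card in the trailing hour
    return sum(1 for t in timestamps if now - t <= timedelta(hours=1))
```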
Raw data comes in, gets preprocessed through a persisted pipeline, drops through 500 trees, and comes out the other side as a risk score between 0 and 1. The whole preprocessing pipeline is saved alongside the model artifact so there’s no drift between training and inference, a mistake that has burned people before.
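The no-drift trick is simply to fit and persist the preprocessing and the model as one artifact. A minimal sketch with scikit-learn and joblib, using `LogisticRegression` as a stand-in for the booster and a filename I made up:

```python
import joblib
from sklearn.linear_model import LogisticRegression  # stand-in for the XGBoost booster
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Fit preprocessing and model together so train-time and serve-time transforms can't diverge
pipe = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])
X = [[100.0, 3], [5000.0, 23], [12.0, 14], [900.0, 2]]  # toy (amount, hour) rows
y = [0, 1, 0, 0]
pipe.fit(X, y)

joblib.dump(pipe, "fraud_pipeline.joblib")     # one artifact, one version
loaded = joblib.load("fraud_pipeline.joblib")  # inference loads the exact same transforms
score = float(loaded.predict_proba([[4800.0, 3]])[0, 1])
```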
The Part I’m Most Proud Of: SHAP

Here’s where most ML fraud systems stop: they give you a score. 0.87. Transaction blocked. Good luck explaining that to your risk analyst, your compliance team, or the angry customer on the phone.
FraudGuard AI uses SHAP (SHapley Additive exPlanations) to generate per-transaction feature attributions. Every single scored transaction returns a risk_factors object that looks something like this:
```json
{
  "risk_score": 0.91,
  "status": "BLOCK_TRANSACTION",
  "risk_factors": [
    { "feature": "hour_of_day", "contribution": 0.34, "direction": "increases_risk" },
    { "feature": "ip_distance_km", "contribution": 0.28, "direction": "increases_risk" },
    { "feature": "velocity_1hr", "contribution": 0.19, "direction": "increases_risk" },
    { "feature": "amount", "contribution": -0.06, "direction": "decreases_risk" }
  ]
}
```

In plain English: “We blocked this transaction because it happened at 3AM, the IP is 4,000 miles from the billing address, and this card has made 12 transactions in the last hour. The amount was actually fine.”
Your analyst can act on that. Your compliance team can audit that. Your customer service rep can explain that. A raw score of 0.91 helps nobody.
SHAP runs via shap.TreeExplainer on the trained booster. For large batch jobs, we compute SHAP only for records above the risk threshold to keep costs sane; you don’t need a full explanation for every $12 coffee purchase.
The Architecture (Or: How Not to Overengineer a Side Project)

A few decisions worth explaining:
Model loads once at startup. FastAPI’s async lifespan handler runs load_models() before any requests arrive. The XGBoost artifact and preprocessing pipeline sit in memory. Per-request model loading would be a crime against humanity and I won't stand for it.
Business rules live next to the model, not inside it. The _status_from_rules function in main.py handles threshold logic and hard blocks (like block_on_location_mismatch) separately from the model score. This means you can tighten a rule without retraining. Compliance teams can audit rules without touching ML code. Everyone's happy.
Single env-driven API base URL. NEXT_PUBLIC_API_URL drives everything on the frontend. No hardcoded localhost URLs surviving into production. No 2AM debugging sessions because staging was pointing at your laptop. One variable, predictable everywhere.
The cron job. Yes, that’s cron-job.org pinging /api/v1/ping every 15 minutes to keep the Render free tier instance warm. Is this the most elegant infrastructure decision of my career? No. Does it work? Completely. Ship first, optimize later.
The 3AM Russian VPN Test

I needed to validate the full adversarial path end to end. So I simulated an attacker routing transactions through an offshore VPN: mismatched geo metadata, proxy flags in the location payload, high velocity, odd hours.
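The simulated request looked something like this. The field names are my guesses at the payload schema, not the real API contract:

```python
# Illustrative adversarial payload: offshore IP, proxy flag, 3AM, high velocity
payload = {
    "amount": 1450.00,
    "currency": "USD",
    "hour_of_day": 3,
    "velocity_1hr": 12,
    "location": {
        "billing_country": "US",
        "ip_country": "RU",
        "ip_distance_km": 6400,
        "is_proxy": True,
    },
}
```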
Here’s what happened, in order:
- Transaction payload hits FastAPI
- Preprocessor extracts location features, flags is_location_unusual: true
- XGBoost risk score: 0.8
- SHAP: location features dominate the positive contributions
- Rules engine: block_on_location_mismatch triggers → BLOCK_TRANSACTION
- Response hits the frontend with full SHAP breakdown
- Supabase logs the event for telemetry
- Zero silent failures
That last point. Zero silent failures. This is the thing that separates a toy project from something you’d actually trust with real money. A system that fails silently is more dangerous than a system that fails loudly. If something goes wrong, I want it in the logs, on the dashboard, and visible to the analyst, not quietly wrong for six hours while chargebacks pile up.
Monetization (Because SaaS Needs a Business Model)
FraudGuard AI has a real Stripe sandbox integration.
Free tier: 5 transactions per day. Enough to evaluate the product, not enough to run a business on it.
Pro tier: Unlocks the Developer API Hub (self serve API key generation), bulk CSV auditing, and unlimited transactions. Stripe webhooks update subscription status in Supabase in real time.
Want to try it without a real credit card? Use Stripe’s test card: 4242 4242 4242 4242. Expiry: any future date. CVC: anything. You get full Pro access instantly in sandbox mode.
This is genuinely the part I’d encourage more portfolio projects to implement. Stripe sandbox is free, and the integration teaches you webhooks, idempotency, and subscription lifecycle management, skills that translate directly to real SaaS engineering. And it makes your portfolio piece look like an actual product instead of a cheap student project.
Final Thoughts
I’ll be honest: FraudGuard AI is a proof-of-concept MVP. It’s not processing millions of transactions a day, and the backend lives on Render’s free tier and sleeps if you don’t poke it.
But the engineering decisions are real. The SHAP explainability is real. The Stripe sandbox integration is real. The adversarial testing was real. And the problem it’s solving, balancing fraud prevention against customer experience with full transparency into every decision, is one of the most genuinely interesting problems in applied ML right now.
If you’ve read this far and you want to see it live, the link is below. Try the 3AM VPN transaction yourself. Watch SHAP tell you exactly why it got blocked.
And if you’re a CTO or a tech lead reading this wondering whether the person who built it actually understands it: I hope this answered that!
Live demo: https://fraud-guard-ai-five.vercel.app/ (Backend on Render free tier; give it 30–60 seconds to wake up if it’s cold. The cron job tries its best.)
GitHub: https://github.com/3hal0n/fraud-guard-ai
Thanks for reading. If this was useful, or at least mildly entertaining, a clap or two wouldn’t hurt hehe.
I Built a Fraud Detection System That Explains Itself. Here’s What I Learned. was originally published in Level Up Coding on Medium, where people are continuing the conversation by highlighting and responding to this story.