Start now →

How to Built an Institutional Options Trades Detector

By Stas Khirman · Published April 10, 2026 · 7 min read · Source: DataDrivenInvestor
EthereumTrading
How to Built an Institutional Options Trades Detector

Most retail options traders know the frustration: you’re long calls, the chart is behaving, and then a large move blows through your position. Only afterward do you realize the options tape was screaming institutional activity for hours before it happened.

I spent the last few months building a system to detect this activity in real-time. This post explains how it works, the statistical methods behind it, and how to run it yourself.

The code is part of trading-skills, an open source Claude extension for AI-powered options analysis capabilities.

The problem with conventional “unusual activity” tools

Every retail options platform offers some form of unusual activity alerts. Most of them are wrong in the same ways.

The standard approach flags contracts where daily volume exceeds some multiple of open interest. That catches some real signals, but has three structural problems:

What actually signals institutional conviction is dollars invested.

The key data source: per-second options bars

The core ingredient is 1-second aggregation bars for options contracts. Massive.com (formerly Polygon.io, which many algo traders will recognize) provides exactly this through their aggregated bars API.

Their Starter plan covers options data at around $29/month. Given the decisions you’re making on it, that’s one of the cheaper edges available to retail traders. The Python client is clean:

from massive import RESTClient

client = RESTClient(api_key="your_key_here")

for bar in client.list_aggs(
"O:NVDA260320C00180000", # option ticker format
1, # multiplier
"second", # timespan
"2026-03-17", # from
"2026-03-17", # to
adjusted="true",
sort="asc",
limit=50000,
):
print(bar.timestamp, bar.vwap, bar.volume, bar.transactions)

Each bar includes VWAP, contracts traded, and — critically — transactions: the number of individual trades that composed the bar. If transactions == 1 and volume == 2000, that's a single block trade. There is no ambiguity.

Step 1: Pre-filter with Yahoo Finance

Before hitting the Massive API for every contract on a liquid underlying (which can mean thousands of strikes × expiries), I do a cheap screen using Yahoo Finance option chains via yfinance. This is free, no API key required.

Goal: find contracts with anomalously large dollar invested on the most recent trading day.

from trading_skills.options import get_expiries, get_option_chain
from trading_skills.utils import latest_trading_date
from dateutil.relativedelta import relativedelta
from datetime import date
import pandas as pd

UNDERLYING = "NVDA"
MAX_MONTHS = 2 # only near-term expiries
SIGMA = 3.0 # outlier threshold

ref_date = latest_trading_date()
expiry_max = ref_date + relativedelta(months=MAX_MONTHS)

rows = []
for expiry in get_expiries(UNDERLYING):
if date.fromisoformat(expiry) > expiry_max:
continue
chain = get_option_chain(UNDERLYING, expiry)
if "error" in chain:
continue
for opt_type, contracts in [("call", chain["calls"]), ("put", chain["puts"])]:
for c in contracts:
rows.append({**c, "expiry": expiry, "type": opt_type})

df = pd.DataFrame(rows)

# Filter to contracts that actually traded today (or previous session)
df["tradeDate"] = (
pd.to_datetime(df["lastTradeDate"], utc=True)
.dt.tz_convert("America/New_York")
.dt.date
)
prev_date = ref_date - pd.offsets.BDay(1)
df = df[df["tradeDate"].isin([ref_date, prev_date.date()])].copy()

# Dollar invested = lastPrice × volume × 100
df["invested"] = (df["lastPrice"].fillna(0) * df["volume"].fillna(0) * 100).round(2)
active = df[df["invested"] > 0].copy()

# Flag outliers: invested > median + sigma * std
median_inv = active["invested"].median()
threshold = median_inv + SIGMA * active["invested"].std()
candidates = active[active["invested"] > threshold].sort_values("invested", ascending=False)

print(f"Found {len(candidates)} candidates from {len(active)} active contracts")

This narrows 400–800 contracts to 5–15 candidates. Those get the per-second treatment.

Step 2: Per-second whale detection

For each candidate, pull all 1-second bars from Massive and find the seconds with anomalous dollar investment:

from trading_skills.massive.whales import option_whales

CONTRACT = "O:NVDA260410C00180000"
DATE = "2026-04-02"

whales, all_bars = option_whales(CONTRACT, trading_date=DATE, return_all=True)

print(f"Total seconds with data: {len(all_bars)}")
print(f"Whale seconds detected: {len(whales)}")
print(whales[["timestamp", "close", "volume", "transactions", "invested", "break_even"]])

break_even is the underlying price at expiry where the trade is flat: for calls, strike + premium; for puts, strike - premium. It's a quick proxy for how far the stock needs to move for the institution to profit.

The detection algorithm

Naive thresholding breaks down on thinly-traded contracts because investment distributions are wildly non-Gaussian. I use a two-method adaptive approach based on sample size.

Small samples (n < 30 active bars): Modified Z-Score

When few bars have positive investment, the standard deviation is unstable. A single large value inflates σ — which raises the threshold and obscures the very event you’re trying to find. The Modified Z-Score (Iglewicz & Hoaglin, 1993) uses MAD (Median Absolute Deviation) instead:

MAD = median(|xi − median(x)|) 
Mzi = 0.6745 × (xi − median(x)) / MAD

Flag as whale if Mzi > sigma_z (default 3.5). The 0.6745 factor makes MAD equivalent to σ on normally-distributed data, so the threshold is on the same scale as a standard-deviation cutoff - but far more resistant to contamination from the outlier you're trying to detect.

Large samples (n ≥ 30): median + sigma × std

With enough data, σ stabilizes. Using median as center (not mean) keeps it robust against the skewed distribution:

mask = df["invested"] > df["invested"].median() + sigma * df["invested"].std()

Hard rules that override the statistics

Two rules apply regardless of sigma:

Always whale: Any bar where invested / transactions >= $1,000,000 - one block trade of $1M or more. No statistical test needed. This is definitionally institutional.

Never whale: Any bar where invested / transactions <= $100,000 is dropped even if it passes the statistical test. Many small trades summing to a large total aren't whale activity; this removes that noise.

The full pipeline: whales_hunter()

Both steps are wrapped in one function:

from trading_skills.massive.whales import whales_hunter

result = whales_hunter(
"NVDA",
max_months=2, # expiry window
precise=True, # use Massive per-second data
sigma=3.0,
sigma_z=3.5,
trading_date=None, # defaults to latest trading day
)

print(f"Source: {result['source']}") # "massive" or "yahoo only"
print(f"Trading date: {result['trading_date']}")
print(f"Whale events: {result['total_whales']}")

import pandas as pd
whales_df = pd.DataFrame(result["whales"])
print(whales_df[["timestamp", "ticker", "type", "strike",
"close", "volume", "transactions", "invested", "break_even"]])

Without a Massive API key, the function falls back to Yahoo-only results — daily resolution, no per-second breakdown. Still useful for screening, less useful for timing.

Real example: live output from today (2026–04–02)

Here’s actual output from a morning run on NVDA and NBIS. Spot prices at time of analysis: NVDA $176.51, NBIS $107.05.

NVDA — Aggressive bullish call stacking

19 whale events. $26.3M calls vs. $3.2M puts. C/P ratio: 8.21.

The dominant theme: institutions accumulating Apr 10 calls at the 172.5 and 177.5 strikes. The 177.5 strike alone saw $9.9M across 4 events — just above current spot, a high-conviction near-term directional bet.

Separately, $4.8M flowed into today’s 172.5 calls. At spot $176.51, those are deep ITM (delta ~0.90+). That’s not leverage hunting — that’s someone paying for near-stock exposure through options, either to avoid moving the stock price or because of position limits on shares. Classic high-conviction ITM play.

The only put position: $3.2M in Apr 10 170p — below spot, almost certainly portfolio hedging.

NBIS — Pure bullish call buying, zero puts

7 whale events. $2.4M calls. No put activity whatsoever.

Three tranches of Apr 10 105c placed between 10:08–10:22am — likely the same entity accumulating in pieces. The Apr 10 110c ($661K, break-even $114.90) requires a ~7% move by next Friday. The May 15 110c ($234K, break-even $121.69) extends the thesis six weeks out.

The Apr 2 103c bought this morning (break-even $105.50) are now deep ITM at spot $107.05. Whoever put on $578K there had conviction before the open.

What’s next

Full code at github.com/staskh/trading_skills. Happy to answer questions about the implementation, the Modified Z-Score approach, or the broader methodology.


How to Built an Institutional Options Trades Detector was originally published in DataDrivenInvestor on Medium, where people are continuing the conversation by highlighting and responding to this story.

This article was originally published on DataDrivenInvestor and is republished here under RSS syndication for informational purposes. All rights and intellectual property remain with the original author. If you are the author and wish to have this article removed, please contact us at [email protected].

NexaPay — Accept Card Payments, Receive Crypto

No KYC · Instant Settlement · Visa, Mastercard, Apple Pay, Google Pay

Get Started →