Nicholai 3621d93643 feat(backtest): optimize exit strategy and position sizing

6 iterations of backtest refinements with key discoveries:
- stop losses don't work for prediction markets (prices gap)
- 50% take profit, no stop loss yields +9.37% vs +4.04% baseline
- diversification beats concentration: 100 positions → +18.98%
- added kalman filter, VPIN, regime detection scorers (research)

exit config: take_profit 50%, stop_loss disabled, 48h max hold
position sizing: kelly 0.40, max 30% per position, 100 max positions

2026-01-22 11:16:23 -07:00

data

feat: initial commit with code quality refactoring

2026-01-21 09:32:12 -07:00

scripts

feat(backtest): optimize exit strategy and position sizing

2026-01-22 11:16:23 -07:00

src

feat(backtest): optimize exit strategy and position sizing

2026-01-22 11:16:23 -07:00

.gitignore

feat(backtest): optimize exit strategy and position sizing

2026-01-22 11:16:23 -07:00

Cargo.toml

feat: initial commit with code quality refactoring

2026-01-21 09:32:12 -07:00

PROGRESS.md

feat(backtest): optimize exit strategy and position sizing

2026-01-22 11:16:23 -07:00

README.md

feat: initial commit with code quality refactoring

2026-01-21 09:32:12 -07:00

README.md

kalshi-backtest

quant-level backtesting framework for kalshi prediction markets, using a candidate pipeline architecture.

features

multi-timeframe momentum - detects divergence between short- and long-term trends
bollinger bands mean reversion - signals when price touches statistical extremes
order flow analysis - tracks buying vs selling pressure via taker_side
kelly criterion position sizing - dynamic sizing based on edge and win probability
exit signals - take profit, stop loss, time stops, and score reversal triggers
category-aware weighting - different strategies for politics, weather, sports, etc.
ensemble scoring - combine multiple models with dynamic weighting
cross-market correlations - lead-lag relationships between related markets
ML ensemble (optional) - LSTM + MLP models via ONNX runtime

architecture

Historical Data (CSV)
         |
         v
+------------------+
|  Backtest Loop   |  <- simulates time progression
+------------------+
         |
         v
+------------------+
| Candidate Pipeline |
+------------------+
    |         |
    v         v
 Sources   Filters -> Scorers -> Selector
    |
    v
+------------------+
|  Trade Executor  |  <- kelly sizing, exit signals
+------------------+
         |
         v
+------------------+
|   P&L Tracker    |  <- tracks positions, returns
+------------------+
         |
         v
    Performance Metrics
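
the filters -> scorers -> selector stage of the diagram can be sketched roughly as below. this is a simplified, synchronous stand-in and not the crate's actual types: `Candidate`, `FilterFn`, `ScoreFn`, `Pipeline`, and the "rank by summed score" selector are all hypothetical names and choices made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified candidate (the real MarketCandidate has more fields).
#[derive(Debug, Clone)]
pub struct Candidate {
    pub ticker: String,
    pub volume: u64,
    pub scores: HashMap<String, f64>,
}

/// A filter keeps or drops a candidate; a scorer returns a named score.
pub type FilterFn = Box<dyn Fn(&Candidate) -> bool>;
pub type ScoreFn = Box<dyn Fn(&Candidate) -> (String, f64)>;

pub struct Pipeline {
    pub filters: Vec<FilterFn>,
    pub scorers: Vec<ScoreFn>,
    pub top_n: usize,
}

/// Sum of all scores attached to a candidate (assumed selection key).
fn total_score(c: &Candidate) -> f64 {
    c.scores.values().sum()
}

impl Pipeline {
    /// Takes candidates produced by the sources and returns the selected ones.
    pub fn run(&self, sourced: Vec<Candidate>) -> Vec<Candidate> {
        // filters: a candidate survives only if every filter accepts it
        let mut kept: Vec<Candidate> = sourced
            .into_iter()
            .filter(|c| self.filters.iter().all(|f| f(c)))
            .collect();

        // scorers: each scorer attaches one named score
        for c in kept.iter_mut() {
            for scorer in &self.scorers {
                let (name, value) = scorer(&*c);
                c.scores.insert(name, value);
            }
        }

        // selector: highest total score first, keep the top n
        kept.sort_by(|a, b| total_score(b).total_cmp(&total_score(a)));
        kept.truncate(self.top_n);
        kept
    }
}
```

the selected candidates would then be handed to the trade executor for sizing and exit handling.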

data format

fetch data from the kalshi API using the included script:

python scripts/fetch_kalshi_data.py

or download from https://www.deltabase.tech/

markets.csv:

ticker,title,category,open_time,close_time,result,status,yes_bid,yes_ask,volume,open_interest
PRES-2024-DEM,Will Democrats win?,politics,2024-01-01 00:00:00,2024-11-06 00:00:00,no,finalized,45,47,10000,5000

trades.csv:

timestamp,ticker,price,volume,taker_side
2024-01-05 12:00:00,PRES-2024-DEM,45,100,yes
2024-01-05 13:00:00,PRES-2024-DEM,46,50,no
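
a minimal sketch of parsing one trades.csv row without external crates. the `Trade` and `TakerSide` types and the `parse_trade` helper are hypothetical, and it assumes prices are integer cents and that fields contain no quoted commas (true for the trades.csv sample above; markets.csv titles may need a real CSV parser).

```rust
#[derive(Debug, PartialEq)]
pub enum TakerSide {
    Yes,
    No,
}

/// One row of trades.csv (hypothetical struct mirroring the header columns).
#[derive(Debug, PartialEq)]
pub struct Trade {
    pub timestamp: String, // kept as text; date parsing is out of scope here
    pub ticker: String,
    pub price: u32,        // assumed to be cents
    pub volume: u64,
    pub taker_side: TakerSide,
}

/// Parses a single data row: timestamp,ticker,price,volume,taker_side
pub fn parse_trade(line: &str) -> Result<Trade, String> {
    let fields: Vec<&str> = line.trim().split(',').collect();
    if fields.len() != 5 {
        return Err(format!("expected 5 fields, got {}", fields.len()));
    }
    let price = fields[2]
        .parse::<u32>()
        .map_err(|e| format!("bad price: {e}"))?;
    let volume = fields[3]
        .parse::<u64>()
        .map_err(|e| format!("bad volume: {e}"))?;
    let taker_side = match fields[4] {
        "yes" => TakerSide::Yes,
        "no" => TakerSide::No,
        other => return Err(format!("bad taker_side: {other}")),
    };
    Ok(Trade {
        timestamp: fields[0].to_string(),
        ticker: fields[1].to_string(),
        price,
        volume,
        taker_side,
    })
}
```

the header row is not a valid trade under this sketch, so a caller would skip the first line before parsing.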

usage

# build
cargo build --release

# run backtest with quant features
cargo run --release -- run \
    --data-dir data \
    --start 2024-01-01 \
    --end 2024-06-01 \
    --capital 10000 \
    --max-position 500 \
    --max-positions 10 \
    --kelly-fraction 0.25 \
    --max-position-pct 0.25 \
    --take-profit 0.20 \
    --stop-loss 0.15 \
    --max-hold-hours 72 \
    --compare-random

# view results
cargo run --release -- summary --results-file results/backtest_result.json

cli options

option	default	description
--data-dir	data	directory with markets.csv and trades.csv
--start	required	backtest start date
--end	required	backtest end date
--capital	10000	initial capital
--max-position	100	max shares per position
--max-positions	5	max concurrent positions
--kelly-fraction	0.25	fraction of kelly criterion (0.1=conservative, 1.0=full)
--max-position-pct	0.25	max fraction of capital per position (0.25 = 25%)
--take-profit	0.20	take profit threshold (20% gain)
--stop-loss	0.15	stop loss threshold (15% loss)
--max-hold-hours	72	time stop in hours
--compare-random	false	compare vs random baseline

scorers

basic scorers:

MomentumScorer - price change over lookback period
MeanReversionScorer - deviation from historical mean
VolumeScorer - unusual volume detection
TimeDecayScorer - prefer markets with more time to close
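
the first two basic scorers reduce to simple arithmetic on a price series. the sketch below is illustrative only: the function names are hypothetical, prices are assumed to be ordered oldest to newest, and the real scorers may normalize or window the data differently.

```rust
/// Price change over the last `lookback` observations.
/// Returns None when there is not enough history.
pub fn momentum(prices: &[f64], lookback: usize) -> Option<f64> {
    if lookback == 0 || prices.len() <= lookback {
        return None;
    }
    let last = prices[prices.len() - 1];
    let past = prices[prices.len() - 1 - lookback];
    Some(last - past)
}

/// Deviation of the latest price from the mean of the whole series.
/// Positive means the price sits above its historical mean.
pub fn mean_deviation(prices: &[f64]) -> Option<f64> {
    let last = *prices.last()?;
    let mean = prices.iter().sum::<f64>() / prices.len() as f64;
    Some(last - mean)
}
```

a mean reversion strategy would typically treat a large positive deviation as a reason to expect a fall, so the sign would be flipped before it is used as a score.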

quant scorers:

MultiTimeframeMomentumScorer - analyzes 1h, 4h, 12h, 24h windows, detects divergence
BollingerMeanReversionScorer - triggers when price touches the upper or lower band (2 standard deviations from the mean)
OrderFlowScorer - buy/sell imbalance from taker_side
CategoryWeightedScorer - different weights per category
EnsembleScorer - combines models with dynamic weights
CorrelationScorer - cross-market lead-lag signals
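
two of the quant signals above can be sketched in a few lines. this is a hedged illustration, not the crate's implementation: the function names, the use of a population standard deviation, the -1/0/+1 signal encoding, and the `(volume, taker_is_yes)` tuple layout are all assumptions.

```rust
/// Z-score of the latest price against the window's mean and
/// (population) standard deviation. None if the window is flat or too short.
pub fn bollinger_z(prices: &[f64]) -> Option<f64> {
    if prices.len() < 2 {
        return None;
    }
    let n = prices.len() as f64;
    let mean = prices.iter().sum::<f64>() / n;
    let var = prices.iter().map(|p| (p - mean).powi(2)).sum::<f64>() / n;
    let std = var.sqrt();
    if std == 0.0 {
        return None;
    }
    Some((prices[prices.len() - 1] - mean) / std)
}

/// Mean reversion signal at a band touch: -1 at or above the upper band
/// (expect a fall), +1 at or below the lower band (expect a rise), else 0.
pub fn band_touch(z: f64, num_std: f64) -> i8 {
    if z >= num_std {
        -1
    } else if z <= -num_std {
        1
    } else {
        0
    }
}

/// Order flow imbalance in [-1, 1] from (volume, taker_is_yes) pairs:
/// +1 means all taker volume bought yes, -1 means all bought no.
pub fn order_flow_imbalance(trades: &[(u64, bool)]) -> f64 {
    let (yes, no) = trades.iter().fold((0u64, 0u64), |(y, n), &(v, is_yes)| {
        if is_yes { (y + v, n) } else { (y, n + v) }
    });
    let total = yes + no;
    if total == 0 {
        0.0
    } else {
        (yes as f64 - no as f64) / total as f64
    }
}
```

for the two sample rows in trades.csv (100 yes, 50 no), the imbalance would be (100 - 50) / 150, or about 0.33.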

ml scorers (requires ml feature):

MLEnsembleScorer - LSTM + MLP via ONNX

position sizing

uses kelly criterion with safety multiplier:

kelly = (odds * win_prob - (1 - win_prob)) / odds
safe_kelly = kelly * kelly_fraction
position = min(bankroll * safe_kelly, max_position_pct * bankroll)
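
the formula above can be written out as a small function. this is a sketch under stated assumptions: `net_odds` and `position_size` are hypothetical names, `odds` is taken to mean net odds (profit per unit risked), contracts are assumed to settle at 100 cents, and a negative kelly value is clamped to zero so that a trade without edge gets no capital.

```rust
/// Net odds for buying a binary contract at `price_cents` (between 1 and 99):
/// risk `price_cents` to win `100 - price_cents`.
/// Assumption: contracts settle at 100 cents.
pub fn net_odds(price_cents: f64) -> f64 {
    (100.0 - price_cents) / price_cents
}

/// Dollar size of a position under fractional kelly with a hard cap.
pub fn position_size(
    bankroll: f64,
    win_prob: f64,
    odds: f64,
    kelly_fraction: f64,
    max_position_pct: f64,
) -> f64 {
    if odds <= 0.0 || bankroll <= 0.0 {
        return 0.0;
    }
    let kelly = (odds * win_prob - (1.0 - win_prob)) / odds;
    // a negative kelly means no edge: size the trade at zero
    let safe_kelly = (kelly * kelly_fraction).max(0.0);
    (bankroll * safe_kelly).min(max_position_pct * bankroll)
}
```

as a worked case: buying at 40 cents gives odds of 1.5; with a 50% win probability, kelly is (1.5 * 0.5 - 0.5) / 1.5, about 0.167. at kelly_fraction 0.25 and a 10000 bankroll that is roughly 417 dollars, well under the 25% cap of 2500.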

exit signals

positions can exit via:

resolution - market resolves yes/no
take profit - unrealized pnl reaches the take-profit threshold
stop loss - unrealized loss reaches the stop-loss threshold
time stop - held too long (capital rotation)
score reversal - strategy flips bearish
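
the five exit paths can be checked in sequence, as in the sketch below. everything here is hypothetical: the type and field names, the check order (resolution first, score reversal last), and the reading of "flips bearish" as a negative score. the stop loss is modeled as an `Option` so it can be switched off entirely.

```rust
#[derive(Debug, PartialEq)]
pub enum ExitReason {
    Resolution,
    TakeProfit,
    StopLoss,
    TimeStop,
    ScoreReversal,
}

pub struct ExitConfig {
    pub take_profit: f64,       // 0.20 exits at a +20% gain
    pub stop_loss: Option<f64>, // Some(0.15) exits at a -15% loss; None disables
    pub max_hold_hours: f64,
}

pub struct PositionState {
    pub pnl_pct: f64,    // unrealized pnl as a fraction of entry cost
    pub hours_held: f64,
    pub score: f64,      // latest strategy score; negative is assumed bearish
    pub resolved: bool,
}

/// Returns the first exit condition that fires, or None to keep holding.
pub fn check_exit(cfg: &ExitConfig, pos: &PositionState) -> Option<ExitReason> {
    if pos.resolved {
        return Some(ExitReason::Resolution);
    }
    if pos.pnl_pct >= cfg.take_profit {
        return Some(ExitReason::TakeProfit);
    }
    if let Some(stop) = cfg.stop_loss {
        if pos.pnl_pct <= -stop {
            return Some(ExitReason::StopLoss);
        }
    }
    if pos.hours_held >= cfg.max_hold_hours {
        return Some(ExitReason::TimeStop);
    }
    if pos.score < 0.0 {
        return Some(ExitReason::ScoreReversal);
    }
    None
}
```

with the cli defaults (take profit 0.20, stop loss 0.15, 72 hours), a position at -16% after 10 hours would exit on the stop loss; with `stop_loss: None` the same position would stay open until another condition fires.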

ml training (optional)

train ML models using pytorch, then export to ONNX:

# install dependencies
pip install torch pandas numpy

# train models
python scripts/train_ml_models.py \
    --data data/trades.csv \
    --markets data/markets.csv \
    --output models/ \
    --epochs 50

# enable ml feature
cargo build --release --features ml

metrics

total return ($ and %)
sharpe ratio (annualized)
max drawdown
win rate
average trade P&L
average hold time
trades per day
return by category
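
three of these metrics are sketched below using their common textbook definitions. the function names are hypothetical, the sharpe ratio assumes a zero risk-free rate and a sample standard deviation, and the annualization factor (`periods_per_year`) is left to the caller because the README does not state which return period is used.

```rust
/// Largest peak-to-trough decline of an equity curve, as a fraction of the peak.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &e in equity {
        if e > peak {
            peak = e;
        }
        if peak > 0.0 {
            worst = worst.max((peak - e) / peak);
        }
    }
    worst
}

/// Annualized sharpe ratio of per-period returns (risk-free rate assumed 0).
/// None when there are fewer than two returns or no variance.
pub fn sharpe_ratio(returns: &[f64], periods_per_year: f64) -> Option<f64> {
    if returns.len() < 2 {
        return None;
    }
    let n = returns.len() as f64;
    let mean = returns.iter().sum::<f64>() / n;
    let var = returns.iter().map(|r| (r - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let std = var.sqrt();
    if std == 0.0 {
        return None;
    }
    Some(mean / std * periods_per_year.sqrt())
}

/// Fraction of closed trades with strictly positive pnl.
pub fn win_rate(pnls: &[f64]) -> Option<f64> {
    if pnls.is_empty() {
        return None;
    }
    let wins = pnls.iter().filter(|&&p| p > 0.0).count();
    Some(wins as f64 / pnls.len() as f64)
}
```

for an equity curve of 100, 120, 90, 130, 117 the worst decline is from 120 to 90, a drawdown of 25%.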

extending

add custom scorers by implementing the Scorer trait:

use async_trait::async_trait;

pub struct MyScorer;

#[async_trait]
impl Scorer for MyScorer {
    fn name(&self) -> &'static str {
        "MyScorer"
    }

    async fn score(
        &self,
        context: &TradingContext,
        candidates: &[MarketCandidate],
    ) -> Result<Vec<MarketCandidate>, String> {
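        // compute scores... the placeholder below assumes MarketCandidate
        // implements Clone and that `scores` maps String to f64
        let scored = candidates
            .iter()
            .map(|c| {
                let mut c = c.clone();
                c.scores.insert("my_score".to_string(), 0.0);
                c
            })
            .collect();
        Ok(scored)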
    }

    fn update(&self, candidate: &mut MarketCandidate, scored: MarketCandidate) {
        if let Some(score) = scored.scores.get("my_score") {
            candidate.scores.insert("my_score".to_string(), *score);
        }
    }
}

then add to the pipeline in backtest.rs.